# datascience
  • z

    zaleslaw

    02/19/2025, 2:51 PM
    🌍 Kandy 0.8: Unlocking Geospatial Visualization! 🗺️ The latest Kandy 0.8 update brings powerful geospatial plotting capabilities! Effortlessly work with spatial data using GeoDataFrame and seamlessly visualize it with Kandy’s geo extensions. Explore the geo plotting guide and dive into the gallery of geo charts to see what’s possible!
    🌎 4
    ❤️ 4
    🎉 6
    📈 4
    K 2
  • j

    Jens

    03/03/2025, 5:51 PM
    Hey everybody, I'm trying to read an Excel file with kotlinx.dataframe but get the exception
    Copy code
    [2025-03-03T17:49:09.967Z] Caused by: java.io.IOException: Zip bomb detected! The file would exceed the max. ratio of compressed file size to the size of the expanded data.
    [2025-03-03T17:49:09.967Z] This may indicate that the file is used to inflate memory usage and thus could pose a security risk.
    [2025-03-03T17:49:09.967Z] You can adjust this limit via ZipSecureFile.setMinInflateRatio() if you need to work with files which exceed this limit.
    [2025-03-03T17:49:09.967Z] Uncompressed size: 847128, Raw/compressed size: 8468, ratio: 0.009996
    I have the following deps in my build.gradle.kts, but the class ZipSecureFile can't be found. Is there another way, I can set the min inflate ratio?
    Copy code
    implementation("org.jetbrains.kotlinx:dataframe:0.15.0")
        implementation("org.jetbrains.kotlinx:dataframe-excel:0.15.0")
  • j

    Jens

    03/03/2025, 5:54 PM
    Copy code
    %use dataframe
    Seems to add this class to the classpath and it can be imported. What's the corresponding dependency, when working with dataframe in headless production code?
  • j

    Jens

    03/03/2025, 6:00 PM
    Ah, ok. Importing poi fixed it
    Copy code
    implementation ("org.apache.poi:poi-ooxml:5.2.3")
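    For reference, a minimal sketch of the call the exception message points to, assuming poi-ooxml is on the classpath as in the line above. ZipSecureFile.setMinInflateRatio is a static, JVM-wide setting in Apache POI. The helper name relaxZipBombLimit and the value 0.005 are illustrative assumptions, picked only because the log reports a ratio of 0.009996, just under POI's default minimum of 0.01. Lowering the limit weakens the zip-bomb protection, so it is probably best reserved for trusted files.

```kotlin
import org.apache.poi.openxml4j.util.ZipSecureFile

// Hypothetical helper (not part of POI or DataFrame): lowers POI's minimum
// inflate ratio for the whole JVM. Call it once, before the first Excel file is opened.
fun relaxZipBombLimit(minRatio: Double = 0.005) {
    require(minRatio in 0.0..1.0) { "ratio must be between 0 and 1" }
    ZipSecureFile.setMinInflateRatio(minRatio)
}
```

    Calling relaxZipBombLimit() before the Excel file is read should be enough for a file like the one in the log.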
  • p

    Paulina Sobieszuk

    03/06/2025, 12:34 PM
    Hey Kotlin DataFrame users! The Kotlin team wants to learn more about what you use Dataframe for. Please vote by reacting to this post:
    What do you use DataFrame for?
    1️⃣ Generating reports & dashboards
    2️⃣ Preparing data for ML/AI models
    3️⃣ Data cleaning & enrichment
    4️⃣ Working with REST APIs, files, and SQL databases
    5️⃣ Processing data in business logic applications
    6️⃣ Something else (tell us in the comments)
    Thanks a lot for your help!
    3️⃣ 10
    4️⃣ 5
    5️⃣ 6
    6️⃣ 1
    2️⃣ 7
    1️⃣ 9
  • z

    zaleslaw

    03/07/2025, 2:44 PM
    Hi, dear community, please participate in the poll above ☝️, it's important for us K
    👍 1
  • e

    esionecneics

    03/12/2025, 1:46 PM
    I really wish there was a machine learning library for Kotlin. Like Sklearn for Python. At the moment I always fall back on the Java library Smile, but its API is unfortunately so impractical and inconsistent and it's so hard to prepare the data so that it can be fed into an SVM or Gradient Boosting Classifier. An ML library that harmonizes with Dataframe and Kandy would be a logical next step. It's not about competing with Python and its data science ecosystem. It's about Kotlin developers not having to switch back and forth between Kotlin and Python all the time. I know the Jetbrains team can do it 💛
    👌 1
  • p

    Paulina Sobieszuk

    03/13/2025, 2:34 PM
    Hello, Kotlin DataFrame users! K If you currently use Kotlin DataFrame or have used it in the past, we’d love to hear from you! We’re conducting 60-minute interviews about Kotlin DataFrame use cases to learn what’s working well and identify areas for improvement. The sessions will take place via Google Meet. As a thank-you for your time, you can choose from one of the following rewards:
    • A USD 100 Amazon Gift Card, or
    • A one-year subscription to JetBrains All Products Pack.
    To participate, please complete a short questionnaire. If your profile matches our study criteria, you'll be redirected to our Calendly page to schedule your session.
    Kotlin Product Research Team 🙌
  • m

    Marian Schubert

    03/27/2025, 1:51 PM
    I'm getting the following error in a Kotlin notebook after we updated the project to Kotlin 2.1.20 (from 2.0.21):
    Copy code
    Class '...' was compiled with an incompatible version of Kotlin. The actual metadata version is 2.1.0, but the compiler version 1.9.0 can read versions up to 2.0.0.
    It seems that the Kotlin Notebook plugin is using the Kotlin 1.9 compiler. Is there any way to fix this?
  • e

    eenriquelopez

    04/08/2025, 2:13 PM
    I came across a paper discussing an experiment and tried to reproduce it. Here’s a brief summary:
    • Portfolio A: In a bull market, grows by 20%; in a bear market, drops by 20%.
    • Portfolio B: In a bull market, grows by 25%; in a bear market, drops by 35%.
    • Bull market probability: 75%.
    According to the paper, both portfolios should have a one-year expected return of 10%. However, the paper claims that Portfolio A wins over Portfolio B around 90% of the time. After running a Monte Carlo simulation (code attached), I found that Portfolio A outperforms Portfolio B around 66% of the time.
    Question: Am I doing something wrong in my simulation, or is the assumption in the original paper incorrect?
    Copy code
    // Simulation parameters
    val years = 30
    val simulations = 10000
    val initialInvestment = 1.0
    
    // Market probabilities (bull 75%, so bear is the remaining 25%)
    val bullProb = 0.75 // 75% for Bull markets
    
    // Portfolio returns
    val portfolioA = mapOf("bull" to 1.20, "bear" to 0.80)
    val portfolioB = mapOf("bull" to 1.25, "bear" to 0.65)
    
    // Function to simulate one portfolio run and return the accumulated return for each year
    fun simulatePortfolioAccumulatedReturns(returns: Map<String, Double>, rng: Random): List<Double> {
        var value = initialInvestment
        val accumulatedReturns = mutableListOf<Double>()
        
        repeat(years) {
            val isBull = rng.nextDouble() < bullProb
            val market = if (isBull) "bull" else "bear"
            value *= returns[market]!!
    
            // Calculate accumulated return for the current year
            val accumulatedReturn = (value - initialInvestment) / initialInvestment * 100
            accumulatedReturns.add(accumulatedReturn)
        }
        return accumulatedReturns
    }
    
    // Running simulations and storing accumulated returns for each year (for each portfolio)
    val rng = Random(System.currentTimeMillis())
    
    val accumulatedResults = (1..simulations).map {
        val accumulatedReturnsA = simulatePortfolioAccumulatedReturns(portfolioA, rng)
        val accumulatedReturnsB = simulatePortfolioAccumulatedReturns(portfolioB, rng)
        
        mapOf("Simulation" to it, "PortfolioA" to accumulatedReturnsA, "PortfolioB" to accumulatedReturnsB)
    }
    
    // Count the number of simulations where Portfolio A outperforms Portfolio B and vice versa
    var portfolioAOutperformsB = 0
    var portfolioBOutperformsA = 0
    accumulatedResults.forEach { result ->
        val accumulatedA = result["PortfolioA"] as List<Double>
        val accumulatedB = result["PortfolioB"] as List<Double>
    
        if (accumulatedA.last() > accumulatedB.last()) {
            portfolioAOutperformsB++
        } else {
            portfolioBOutperformsA++
        }
    }
    
    // Print the results
    println("Number of simulations where Portfolio A outperforms Portfolio B: $portfolioAOutperformsB")
    println("Number of simulations where Portfolio B outperforms Portfolio A: $portfolioBOutperformsA")
    println("Portfolio A outperformed Portfolio B in ${portfolioAOutperformsB.toDouble() / simulations * 100}% of simulations.")
    println("Portfolio B outperformed Portfolio A in ${portfolioBOutperformsA.toDouble() / simulations * 100}% of simulations.")
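    One possible explanation, offered as a sketch rather than a verdict on the paper: the simulation above draws a separate market sequence for each portfolio, while the 90% figure may assume that both portfolios live through the same years. Under that shared-market assumption no simulation is needed. After k bull years out of n, A ends ahead of B exactly when 1.20^k * 0.80^(n-k) > 1.25^k * 0.65^(n-k), and k follows a binomial distribution. The function name probabilityAWins is made up for this example.

```kotlin
import kotlin.math.ln
import kotlin.math.pow

// Exact P(A ends ahead of B), ASSUMING both portfolios see the same market sequence.
// k = number of bull years, distributed Binomial(years, bullProb).
fun probabilityAWins(years: Int = 30, bullProb: Double = 0.75): Double {
    require(bullProb > 0.0 && bullProb < 1.0) { "bullProb must be strictly between 0 and 1" }
    // Log of the per-year growth ratio A/B in each kind of market.
    val bullEdge = ln(1.20 / 1.25) // negative: B gains more in a bull year
    val bearEdge = ln(0.80 / 0.65) // positive: A loses less in a bear year
    var total = 0.0
    var pmf = (1 - bullProb).pow(years) // P(k = 0)
    for (k in 0..years) {
        // A is ahead when the summed log-ratios over all years are positive.
        if (k * bullEdge + (years - k) * bearEdge > 0) total += pmf
        // Binomial recurrence: P(k + 1) = P(k) * (years - k) / (k + 1) * p / (1 - p)
        pmf *= (years - k).toDouble() / (k + 1) * bullProb / (1 - bullProb)
    }
    return total
}
```

    With the defaults (30 years, 75% bull probability), A is ahead whenever there are at most 25 bull years, and probabilityAWins() comes out at roughly 0.90. Giving both portfolios the same bull/bear draw each year in the simulation would be the matching change there.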
    👍 2
  • p

    Paulo Cereda

    04/11/2025, 5:27 PM
    Hi friends! DataFrame question. 🙂 I have a .csv file in which one of the columns holds a comma-separated string. I would like to split it and have the row replicated for each element. I have working code, but it's far from optimal. Code in thread. 🧵
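    Independent of the thread, the reshaping itself can be sketched in plain Kotlin: split the cell and emit one copy of the row per element. The Entry class, the explodeTags name, and the column name tags are invented for illustration. Kotlin DataFrame documents a split { ... }.by(",").intoRows() chain for the same purpose, which is likely the more idiomatic route and worth checking against the docs for the version in use.

```kotlin
// Hypothetical row type: `tags` holds the comma-separated string.
data class Entry(val id: Int, val tags: String)

// Returns one row per comma-separated element, copying the other fields.
// Note: a row whose `tags` cell is blank produces no output rows.
fun explodeTags(rows: List<Entry>): List<Entry> =
    rows.flatMap { row ->
        row.tags.split(",")
            .map { it.trim() }
            .filter { it.isNotEmpty() }
            .map { tag -> row.copy(tags = tag) }
    }
```

    For example, Entry(1, "a, b") becomes Entry(1, "a") and Entry(1, "b").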
  • j

    Jason Zhao

    04/14/2025, 6:57 AM
    Hi, I have a question about the Kotlin notebook. Is it not possible to use kotlinx.serialization in the Kotlin notebook? I have a notebook that needs to load my application transiently to perform a task. The application loading fails with the following error when it gets to the config deserialization part. I initially thought it was a version conflict between dependencies, but I excluded the dependencies in question from loading Kotlin Serialization and it is still occurring. Note that when I run the same application logic in a regular Kotlin main function, I don't get this error.
    java.lang.AbstractMethodError: Receiver class ((my class name))$$serializer does not define or inherit an implementation of the resolved method 'abstract kotlinx.serialization.KSerializer[] typeParametersSerializers()' of interface kotlinx.serialization.internal.GeneratedSerializer.
    Note: I am in K2 mode so the issue might be related to that. (On a side note variables are still not detected across cells in the K2 mode.)
  • u

    Ume Channel

    04/29/2025, 11:13 PM
    Good day! How can I stop large numbers from being shown in E notation? For example, 1,927,384,739 becomes 1.927384739E9, and I don't want it expressed that way.
    pjTDYzyiTmEte21D_oIlIY6HL8t_5FH7f-Global Library Data.xlsx
  • u

    Ume Channel

    04/30/2025, 12:55 AM
    How do I keep the decimals in numbers when importing an xlsx file?
  • u

    Ume Channel

    04/30/2025, 10:35 PM
    Temporary workaround for the E-notation issue: map each row to a data class, convert the value to a String, parse it to a Double or null (null becomes 0.0), convert that to BigDecimal, then rebuild the table. (I'm still studying this process.)
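    The conversion chain described above can probably be shortened. A sketch, assuming the cell arrives as a Double or null: going through BigDecimal and its toPlainString() prints the digits without an exponent. The helper name toPlainText is hypothetical, and null is mapped to 0.0 as in the message.

```kotlin
import java.math.BigDecimal

// Hypothetical helper: formats a nullable Double without E notation.
// A null cell is treated as 0.0, mirroring the workaround described above.
fun toPlainText(value: Double?): String =
    BigDecimal.valueOf(value ?: 0.0).stripTrailingZeros().toPlainString()
```

    With this, toPlainText(1.927384739E9) yields "1927384739", and a value such as 12.5 keeps its decimal part.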
  • u

    Ume Channel

    05/01/2025, 6:21 PM
    Hello, my file has no column names in the first row (such as country, region, expenditures, total libraries); the first row goes straight to values like Albania, Europe, 132312, 12312. When I query the first row, this happens (see image). Does there have to be a row of column names (titles) before the values?
  • k

    Karthick

    05/08/2025, 8:58 AM
    Does anybody know whether we can import variables or functions from one notebook into another notebook?
  • p

    Paulo Cereda

    05/30/2025, 3:49 PM
    ⚠️ Hic sunt leones Occasionally I have to bring data to DataFrame from an unsupported database, so I wrote the following code as "an exercise to the reader":
    Copy code
    fun ResultSet.asSequence(): Sequence<ResultSet> = sequence {
        while (next()) {
            yield(this@asSequence)
        }
    }
    
    operator fun ResultSet.get(index: Int): Any? = this.getObject(index)
    
    fun ResultSet.toDataFrame(): DataFrame<*> =
        mutableMapOf<String, MutableList<Any?>>()
            .let { map ->
                val names = List(metaData.columnCount) {
                    metaData.getColumnName(it + 1)
                }
                this
                    .asSequence()
                    .forEach { row ->
                            names.forEachIndexed { index, name ->
                                // JDBC columns are 1-based; create the column list on first use.
                                map.getOrPut(name) { mutableListOf() }.add(row[index + 1])
                        }
                    }
                map
            }.toDataFrame()
    Hope someone can make use of this (even as a counterexample). 😁 Cheerio!
    🦁 2
  • e

    eenriquelopez

    06/18/2025, 9:52 AM
    I have two separate cells in a Notebook: one that does some processing and returns a dataframe that contains some coordinates with some info, and another one that takes the GeoPolygon for a region. This is what I can draw: As you may have imagined, I want to combine them both. However, if I read the Geo Plotting documentation correctly, the only way to combine them is to transform the first dataframe into a GeoJSON object (probably several GeoPoints) and then overlay it on the map. Is there any other way to do this without having to convert the objects? I was hoping that the geomap() function would somehow offer a lambda that could be used to return points.
  • e

    eenriquelopez

    06/18/2025, 9:53 AM
    (code here, if you are interested)
  • e

    eenriquelopez

    06/18/2025, 10:53 AM
    Also, partially related: is there any library or framework like Quarto in R that would allow you to publish a report?
  • u

    Ume Channel

    07/15/2025, 10:32 AM
    Can a data analysis library, specifically Kandy, generate chart images without using a Kotlin notebook? I plan to store Kotlin code as text and then run it in a Kotlin environment.
  • d

    Dumitru Preguza

    09/17/2025, 8:54 AM
    Is there a way to do HDBSCAN clustering using kotlin tools ? https://scikit-learn.org/stable/modules/clustering.html#clustering
  • m

    Marian Schubert

    09/17/2025, 3:47 PM
    Copy code
    val df = dataFrameOf("id", "props")(
        1, mapOf("a" to 10, "b" to 20),
        2, mapOf("a" to 30, "b" to 40),
    )
    Is it possible to convert this to a dataframe with columns id, a, b?
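    One way to think about it, sketched with the standard library only: turn the rows into column-wise lists, with one column per map key. The function name flattenProps and the row representation (a Pair of id and props) are assumptions made for this example. The resulting map of column name to values has the same shape that the toDataFrame() call in the ResultSet snippet earlier in the channel consumes; DataFrame may well offer a more direct operation for this.

```kotlin
// Builds column-wise lists: "id", plus one column per key found in any props map.
// Rows that lack a key get null in that column; a props key named "id" would
// overwrite the id column, so this sketch assumes no such key exists.
fun flattenProps(rows: List<Pair<Int, Map<String, Any?>>>): Map<String, List<Any?>> {
    val keys = rows.flatMap { it.second.keys }.distinct()
    val columns = linkedMapOf<String, List<Any?>>("id" to rows.map { it.first })
    for (key in keys) {
        columns[key] = rows.map { it.second[key] }
    }
    return columns
}
```

    For the two rows in the question this gives the columns id, a and b, in that order.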
    👀 1
  • j

    Jolan Rensen [JB]

    09/18/2025, 12:34 PM
    📢 Kotlin DataFrame 1.0.0-Beta3 is out, bringing us one step closer to 1.0! This update brings Parquet and DuckDB support, better compile-time schema tracking via the Compiler Plugin, and a big refresh of docs and examples, including examples of using DataFrame together with Android, Apache Spark, Hibernate, Exposed, and more. Here are the highlights:
    ✅ NEW: Read data directly from Parquet files
    ✅ NEW: Read data from DuckDB databases
    ✅ Docs: Major updates across the board, covering setup guides, plugin usage, data sources, FAQs, and more
    ✅ Examples of usage: Android, Apache Spark + Parquet, Hibernate, Exposed & more
    ✅ Improvements to format to HTML
    ✅ Compiler plugin improvements for better schema tracking at compile time
    Release notes: https://github.com/Kotlin/dataframe/releases/tag/v1.0.0-Beta3
    📚 Learn more: https://kotlin.github.io/dataframe/
    🌟 Examples: https://github.com/Kotlin/dataframe/tree/master/examples
    🙌 7
    🔥 4
    🚀 4
  • r

    rnett

    09/25/2025, 8:43 PM
    Hey folks, I'm using Kandy for some plots, and the histogram and statBin functions mentioned in the docs do not seem to exist in the 0.8.0 version. I'm using org.jetbrains.kotlinx:kandy-lets-plot:0.8.0, what am I missing? I'm not using DataFrame, just trying to do something like this
  • z

    zaleslaw

    10/13/2025, 3:15 PM
    Hi everyone! A couple of weeks ago we released DataFrame Beta-3, and we’re now looking for people who have already tried it, including those who used it together with the Compiler Plugin for automatic schema generation in Gradle projects (works with Kotlin 2.2.0+). Please press an emoji below to let us know:
    • ✅ Tried Beta-3 with the Compiler Plugin
    • K Tried Beta-3 without the Compiler Plugin
    • 👀 Haven’t tried it yet, but know it’s out
    • ❓ Didn’t know there’s a new Beta with the Compiler Plugin
    K 2
    👀 2
    ✅ 1
    ❓ 9
  • c

    corneil

    10/15/2025, 8:37 AM
    I'm using GeoDataFrame to plot a map of North America; the GeoJSON does have state and province names. I would like to display the names of the states on the map. How do I do that with geoMap or geoPolygon? I have tried this both inside and outside of geoMap:
    Copy code
    geoMap {
        text {
            label(stateProvinceName)
            alpha = 1.0
            x(centerX)
            y(centerY)
            font.color = Color.BLACK
            font.size = 12.0
            font.family = FontFamily.SANS
        }
    }
    The map areas are different colours, but black should be very visible. I also tried a range of sizes from 0.1 to 16.0. stateProvinceName is a column accessor, and the relevant column does show up in the saved HTML or JSON, but no label is rendered.
  • a

    andyg

    10/24/2025, 3:35 AM
    After a little time away I started a new Kotlin DataFrame (Beta3) project inside the IDEA built-in Kotlin Notebook - any memory management tips? My data set isn't too large (16 million rows, 8 columns, just short strings, LocalDates, Ints, Longs) but even after doubling the default memory size to 6.5GB I consistently got heap and out of memory errors. Tried loading from database and from Parquet file (file = 180mb). I was only able to get the original dataframe created after doubling RAM but doing a simple groupBy/aggregation threw memory errors again (even though this operation would reduce the row count by 1/7). I'm on a Mac with 36gb RAM and can see "java" process taking up high 6 or low 7 gb RAM when the data is first read, before the aggregation step. Loading the same Parquet file in Python is low 2gb. I would like to try using the Kotlin Jupyter kernel in VSCode without the overhead of IDEA or Kotlin Notebook but that will have to wait a few days as I'm running on an older version of the Jupyter kernel. I have used Kotlin DataFrame very successfully for some projects with smaller datasets so obviously would like to continue doing so. Thanks!!
  • c

    corneil

    10/25/2025, 10:21 AM
    Does anyone have an idea why this happens:
    Copy code
    java.lang.NullPointerException: null
    	at org.jetbrains.letsPlot.core.plot.base.aes.AesScaling.strokeWidth(AesScaling.kt:18)
    	at org.jetbrains.letsPlot.core.plot.base.geom.util.LinesHelper$decorate$1.invoke(LinesHelper.kt:228)
    	at org.jetbrains.letsPlot.core.plot.base.geom.util.LinesHelper$decorate$1.invoke(LinesHelper.kt:228)
    	at org.jetbrains.letsPlot.core.plot.base.geom.util.LinesHelper.decorate(LinesHelper.kt:241)
    	at org.jetbrains.letsPlot.core.plot.base.geom.util.LinesHelper.decorate$default(LinesHelper.kt:224)
    	at org.jetbrains.letsPlot.core.plot.base.geom.util.LinesHelper.createPolygon(LinesHelper.kt:112)
    	at org.jetbrains.letsPlot.core.plot.base.geom.PolygonGeom.buildIntern(PolygonGeom.kt:33)
    	at org.jetbrains.letsPlot.core.plot.base.geom.GeomBase.build(GeomBase.kt:33)