# datascience
  • l

    Lubomir Pisk

    10/29/2024, 8:09 PM
    Hi everybody, I would like to use DataFrame 0.14.1 with Kotlin 2.0.0 and the following configuration. The Gradle tasks build, assemble, and generateDataFrames don't generate extension properties. Could you help me find where the problem is?
    plugins {
        alias(libs.plugins.kotlin.multiplatform)
        alias(libs.plugins.kotlin.serialization)
        alias(libs.plugins.kotlinx.dataframe)
    }
    
    kotlin {
        jvm {
            compilerOptions {
                jvmTarget.set(JvmTarget.JVM_17)
            }
        }
    
        sourceSets {
            jvmMain {
                dependencies {
                    implementation(project(":core:config"))
                    implementation(project(":core:datetime"))
                    implementation(project(":core:di"))
                    implementation(project(":core:format"))
                    implementation(project(":core:graph"))
    
                    implementation(libs.koin.core)
                    implementation(libs.kotlinx.coroutines)
                    implementation(libs.kotlinx.dataframe)
                    implementation(libs.kotlinx.datetime)
                }
            }
        }
    } 
    
    # libs.versions.toml
    [plugins]
    kotlinx-dataframe = { id = "org.jetbrains.kotlinx.dataframe", version.ref = "dataframe" }
    [libraries]
    kotlinx-dataframe = { group = "org.jetbrains.kotlinx", name = "dataframe", version.ref = "dataframe" }
  • h

    holgerbrandl

    11/08/2024, 5:08 AM
    My team and I noticed that dataframe-core 0.14.1 now depends on org.slf4j:slf4j-simple:2.0.16 as a runtime dependency. We'd consider this a bug, because as a library dataframe-core should not be opinionated about the logging provider being used. Using slf4j as a logging facade is good practice (via io.github.oshai:kotlin-logging:7.0.0 and org.slf4j:slf4j-api:2.0.16). Adding slf4j-simple at the library level makes it harder to consume dataframe in projects that prefer, for example, logback as the logging provider. It seems to be a regression, because older versions of dataframe-core did not do so. Could the dependency be removed (or reduced back to slf4j-api)?
    👍 2
    sad panda 1
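Until the dependency is changed upstream, a consuming project can usually drop the transitive provider itself. The fragment below is a minimal sketch for a build.gradle.kts, assuming the dependency arrives through dataframe-core as described in the message above; the logback line is only an illustration of a project-chosen provider and its version is deliberately left open.

```kotlin
// build.gradle.kts (fragment)
dependencies {
    implementation("org.jetbrains.kotlinx:dataframe-core:0.14.1") {
        // Drop the transitive logging provider so the application can choose its own.
        exclude(group = "org.slf4j", module = "slf4j-simple")
    }
    // Example of a provider picked by the consuming project (version left to the project):
    // runtimeOnly("ch.qos.logback:logback-classic:<version>")
}
```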
  • s

    Sinan Gunes

    11/19/2024, 2:25 PM
    Hello everyone. I have a question about the dataFrame project. I would like to use custom enums as the type of a property read from CSV. Is there a way to do it? Thanks 👍
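For the enum question above, here is a minimal sketch of the parsing step that uses only the Kotlin standard library. The enum `Status` and the helper `parseStatus` are hypothetical names, not part of DataFrame; the assumption is that the CSV column is first read as String and each value is then mapped through a function like this one.

```kotlin
// Hypothetical enum standing in for a custom type stored as text in a CSV column.
enum class Status { ACTIVE, INACTIVE, UNKNOWN }

// Maps a raw CSV cell to an enum constant, ignoring case and surrounding whitespace.
// Unrecognized or missing values fall back to UNKNOWN instead of throwing.
fun parseStatus(raw: String?): Status {
    val cleaned = raw?.trim().orEmpty()
    return Status.values().firstOrNull { it.name.equals(cleaned, ignoreCase = true) }
        ?: Status.UNKNOWN
}
```

Falling back to a dedicated constant rather than throwing keeps a single malformed cell from aborting the whole import; whether that is desirable depends on the data.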
  • h

    holgerbrandl

    11/27/2024, 3:01 PM
    Hi there, what is the scheduled release date of dataframe v0.15? I'm eager to enjoy in particular https://github.com/Kotlin/dataframe/pull/931 :-)
    😁 2
  • z

    zaleslaw

    12/11/2024, 10:08 AM
    🚀 Kotlin DataFrame v0.15.0 Release Announcement 🚀
    We’re thrilled to announce the release of Kotlin DataFrame v0.15.0, packed with powerful new features, performance improvements, and exciting experimental integrations!
    Key Highlights
    1. Experimental CSV Parser
       ◦ Introducing a new CSV parser based on Deephaven-CSV for faster and more reliable data parsing.
    2. GeoDataFrame Class
       ◦ Work with geographical data in GeoJson or Shapefile formats and visualize it using Kandy.
    3. Full BigInteger Support
       ◦ Enhanced support for BigInteger, enabling parsing, conversions, statistics, and column arithmetic.
    4. Custom SQL Database Registration
       ◦ Register custom SQL databases effortlessly; check the user guide for details.
    5. Improved Parsing
       ◦ Faster and more flexible parsing of String columns.
       ◦ New ParserOptions.useFastDoubleParser setting for improved Double parsing performance.
    6. Compiler Plugin Improvements (check the actual demo here)
    Explore the Features
    Check out the resources below to dive into the new functionality:
    • New Features Example Notebook
    • How to Extend DataFrame Library for Custom SQL Database Support: Example with HSQLDB
    We can't wait to see what you'll build with Kotlin DataFrame v0.15.0!
    🔥 9
    ❤️ 8
    K 10
  • z

    zaleslaw

    12/17/2024, 11:48 AM
    Hello everyone! We’re excited to share the pre-release of Kandy version 0.8.0, which introduces geo-plotting capabilities! This is the very first, experimental version of geo-plotting, and while we are still polishing the final release, we invite you to explore the new features using 0.8.0-RC1. The detailed documentation and user guide are on their way, but for now, you can refer to the attached notebook, which includes examples, use cases, and demonstrations of working with geospatial data. We’d love to hear your feedback and impressions as you try out this new functionality! https://gist.github.com/AndreiKingsley/5aa25acbfa52aadbb6a3e3c641c54d57
    🗺️ 8
    🔥 9
    📈 6
    kotlin notebook 9
    😃 4
    🎉 3
  • s

    shaktiman_droid

    12/20/2024, 9:56 PM
    Are there any data science, machine learning, or ONNX-related libraries that are Kotlin Multiplatform and offer Kotlin/Native support for iOS?
  • a

    altavir

    12/25/2024, 6:03 AM
    @Adrian Trapletti We've started a commercial project that uses Clarabel4J as one of its solvers, so thank you very much both for it and for your great tutorials. The project itself is closed source, but I think I will be able to share some of the matrix preparation code in the KMath examples in the future. Today I ran some JMH benchmarks comparing the Clarabel solver and the ojAlgo solver on the same problem. I remember you said that Clarabel should be faster, but I get 1349 ops/second with ojAlgo versus 400 ops/second with Clarabel4J. It is possible that I messed up the matrix preparations (I did not optimize them), but they are all linear, so I do not think it should matter.
  • j

    Jens

    01/09/2025, 9:50 AM
    I have a list of lists of strings, basically just an Excel sheet, of roughly 60-90 MB in RAM. This data needs to be exported as an Excel file, and I am thinking about tinkering a little more with Kotlin DataFrame to do it. What would you say: use DataFrame or just plain old Apache POI?
  • e

    esionecneics

    01/09/2025, 11:50 PM
    Can I also use Kandy fully and flawlessly in a normal .kt file? I have this function, which works fine in a Kotlin notebook, but NOT in a .kt file:
    import org.jetbrains.kotlinx.dataframe.DataFrame
    import org.jetbrains.kotlinx.kandy.dsl.plot
    import org.jetbrains.kotlinx.kandy.letsplot.layers.line
    import org.jetbrains.kotlinx.kandy.util.color.Color
    
    
    fun visualizeCustomerNumber(df: DataFrame<*>) {
        df.plot {
            line {
                x("createdAt") {
                    axis {
                        name = "Date"
                    }
                }
                y("customerNumber") {
                    axis {
                        name = "Customer Number"
                    }
                }
                color = Color.RED
            }
            layout {
                title = "Tomorrow Customer Number"
            }
        }
    }
    The keywords |axis, name, layout, title| are recognized by IntelliJ. The import in the notebook, on the other hand, works perfectly with: %use kandy
    This is the version I use in my build.gradle.kts:
    dependencies { implementation("org.jetbrains.kotlinx:kandy-lets-plot:0.8.0-RC1") }
  • d

    Dumitru Preguza

    01/22/2025, 4:14 PM
    Hi everyone, I'm using the experimental Notebook Spring Boot starter. So far it's fine, but there are some issues to resolve. The main one is transforming Hibernate entities to a DataFrame, which fails because of lazy initialization. I end up with this exception:
    listOf(jobRepository.getJobById(12345)).toDataFrame()
    The problem is found in one of the loaded libraries: check library renderers
    org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: data.entity.JobDefinition.jobRuns: could not initialize proxy - no Session
    org.jetbrains.kotlinx.jupyter.exceptions.ReplLibraryException: The problem is found in one of the loaded libraries: check library renderers
    	at ...
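The stack trace above points at a lazily loaded collection being touched after the session is gone. One common way around this is to copy the needed fields into a flat, detached row type while the session is still open and convert that instead of the entity. The sketch below shows the idea with stand-in classes: `JobDefinition` here is a plain class that only imitates the entity from the message, and `JobRow`, `runCount`, and `toRow` are hypothetical names.

```kotlin
// Stand-in for the Hibernate entity; in a real project `jobRuns` would be a lazy collection.
class JobDefinition(val id: Long, val name: String, val jobRuns: List<String>)

// Flat, detached row type: only plain values, no proxies or lazy collections.
data class JobRow(val id: Long, val name: String, val runCount: Int)

// Copies the needed fields out of the entity. With Hibernate, this would have to run
// while the session/transaction is still open, because it touches `jobRuns`.
fun JobDefinition.toRow(): JobRow = JobRow(id = id, name = name, runCount = jobRuns.size)
```

A list of such rows contains nothing that Hibernate still has to initialize, so a later `toDataFrame()` call or notebook renderer only sees plain values.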
  • d

    Dumitru Preguza

    01/22/2025, 4:39 PM
    How do I enable the "multi dollar interpolation" feature? Sometimes I work with MongoDB queries, and I need the dollar sign $ inside a string without interpolation:
    $$"test $test"
    $$"""test $test"""
    Exception/hint appears: The feature "multi dollar interpolation" is experimental and should be enabled explicitly
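In Kotlin 2.1 the feature is a preview that is switched on with the compiler argument `-Xmulti-dollar-interpolation`. Where that flag cannot be set, for example in an environment whose compiler options are not under your control, a literal dollar sign can still be written without any experimental feature. The sketch below shows the two standard forms; the variable names are illustrative only.

```kotlin
// A backslash escape keeps the dollar sign literal in a regular string.
val escaped = "test \$test"

// Raw (triple-quoted) strings do not process backslash escapes, so the literal
// dollar sign is produced by interpolating the character '$'.
val raw = """test ${'$'}test"""
```

Both values contain the same text, a literal dollar sign followed by `test`, with nothing interpolated.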
  • p

    patrickdelconte

    02/14/2025, 7:05 PM
    I am trying to create a plot that has a line on top of a bar chart, like in the image, with separate y axes for the bar and line charts. I tried a few different combinations of scale and axis inside y() without any luck. It seems like only the last usage of scale makes it into the plot.
    import kotlin.math.roundToInt
    
    val wounds = listOf(4520, 3242, 3128, 3156, 4115, 5082, 5918, 3811, 5013, 6426, 5952, 5761, 5685, 5316, 4726, 4127, 4121, 3837, 3232, 2684, 2151, 1904, 1528, 1182, 971, 679, 564, 367, 276, 194, 111, 87, 59, 42, 19, 11, 1, 2)
    // For each index x: share (in percent) of all wounds that fall strictly after x
    val betterPercent = wounds.indices.map { x -> (wounds.drop(x + 1).sum() / wounds.sum().toDouble() * 100.0).roundToInt() }
    
    plot() {
        x(wounds.indices)
        bars() {
            y(wounds)
        }
        line {
            y(betterPercent){
                scale = continuous(0..100)
            }
        }
    }
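If both layers end up sharing a single y scale, as the observation about only the last scale taking effect suggests, one library-independent workaround is to rescale the percentage series into the value range of the bars before plotting. This is only a sketch of that data preparation step, not a Kandy feature; `rescaleToPrimary` and its parameters are hypothetical names.

```kotlin
// Rescales a secondary series (e.g. percentages in 0..100) into the value range of a
// primary series, so both can be drawn against one shared y axis.
fun rescaleToPrimary(secondary: List<Int>, secondaryMax: Double, primary: List<Int>): List<Double> {
    val primaryMax = primary.maxOrNull()?.toDouble() ?: return emptyList()
    return secondary.map { it / secondaryMax * primaryMax }
}
```

With the data above, `rescaleToPrimary(betterPercent, 100.0, wounds)` would place 100% at the height of the tallest bar; the trade-off is that the axis labels then show bar units, so the percentage reading has to be explained in the legend or title.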
  • z

    zaleslaw

    02/19/2025, 2:51 PM
    🌍 Kandy 0.8: Unlocking Geospatial Visualization! 🗺️ The latest Kandy 0.8 update brings powerful geospatial plotting capabilities! Effortlessly work with spatial data using GeoDataFrame and seamlessly visualize it with Kandy’s geo extensions. Explore the geo plotting guide and dive into the gallery of geo charts to see what’s possible!
    🌎 4
    ❤️ 4
    🎉 6
    📈 4
    K 2
  • j

    Jens

    03/03/2025, 5:51 PM
    Hey everybody, I'm trying to read an Excel file with kotlinx.dataframe but get this exception:
    [2025-03-03T17:49:09.967Z] Caused by: java.io.IOException: Zip bomb detected! The file would exceed the max. ratio of compressed file size to the size of the expanded data.
    [2025-03-03T17:49:09.967Z] This may indicate that the file is used to inflate memory usage and thus could pose a security risk.
    [2025-03-03T17:49:09.967Z] You can adjust this limit via ZipSecureFile.setMinInflateRatio() if you need to work with files which exceed this limit.
    [2025-03-03T17:49:09.967Z] Uncompressed size: 847128, Raw/compressed size: 8468, ratio: 0.009996
    I have the following deps in my build.gradle.kts, but the class ZipSecureFile can't be found. Is there another way I can set the min inflate ratio?
    implementation("org.jetbrains.kotlinx:dataframe:0.15.0")
    implementation("org.jetbrains.kotlinx:dataframe-excel:0.15.0")
  • j

    Jens

    03/03/2025, 5:54 PM
    %use dataframe
    This seems to add the class to the classpath, and it can be imported. What's the corresponding dependency when working with DataFrame in headless production code?
  • j

    Jens

    03/03/2025, 6:00 PM
    Ah, ok. Adding the POI dependency fixed it:
    implementation("org.apache.poi:poi-ooxml:5.2.3")
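With poi-ooxml on the classpath, the limit named in the exception can be relaxed before the file is read. The sketch below assumes the poi-ooxml dependency shown above; the wrapper function name and the value 0.005 are illustrative choices, picked only because the reported ratio (about 0.009996) sits just under POI's default of 0.01.

```kotlin
import org.apache.poi.openxml4j.util.ZipSecureFile

// Lowers POI's zip-bomb threshold. The default minimum inflate ratio is 0.01;
// the file in the log above has a ratio of about 0.009996, just below it.
// 0.005 is an arbitrary illustrative value; only relax the limit for trusted files.
fun allowHighlyCompressedWorkbooks(minRatio: Double = 0.005) {
    ZipSecureFile.setMinInflateRatio(minRatio)
}
```

The setting is global for the JVM, so it is best called once at startup, before any Excel file is opened.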
  • p

    Paulina Sobieszuk

    03/06/2025, 12:34 PM
    Hey Kotlin DataFrame users! The Kotlin team wants to learn more about what you use DataFrame for. Please vote by reacting to this post:
    What do you use DataFrame for?
    1️⃣ Generating reports & dashboards
    2️⃣ Preparing data for ML/AI models
    3️⃣ Data cleaning & enrichment
    4️⃣ Working with REST APIs, files, and SQL databases
    5️⃣ Processing data in business logic applications
    6️⃣ Something else (tell us in the comments)
    Thanks a lot for your help!
    3️⃣ 10
    4️⃣ 5
    5️⃣ 6
    6️⃣ 1
    2️⃣ 7
    1️⃣ 8
  • z

    zaleslaw

    03/07/2025, 2:44 PM
    ❔ Hi, dear community, please participate in the poll above ☝️, it's important for us
    👍 1
  • e

    esionecneics

    03/12/2025, 1:46 PM
    I really wish there was a machine learning library for Kotlin, like sklearn for Python. At the moment I always fall back on the Java library Smile, but its API is unfortunately impractical and inconsistent, and it's hard to prepare the data so that it can be fed into an SVM or Gradient Boosting Classifier. An ML library that harmonizes with DataFrame and Kandy would be a logical next step. It's not about competing with Python and its data science ecosystem. It's about Kotlin developers not having to switch back and forth between Kotlin and Python all the time. I know the JetBrains team can do it 💛
    👌 1
  • p

    Paulina Sobieszuk

    03/13/2025, 2:34 PM
    Hello, Kotlin DataFrame users! If you currently use Kotlin DataFrame or have used it in the past, we’d love to hear from you! We’re conducting 60-minute interviews about Kotlin DataFrame use cases to learn what’s working well and identify areas for improvement. The sessions will take place via Google Meet.
    As a thank-you for your time, you can choose from one of the following rewards:
    • A USD 100 Amazon Gift Card, or
    • A one-year subscription to JetBrains All Products Pack.
    To participate, please complete a short questionnaire. If your profile matches our study criteria, you'll be redirected to our Calendly page to schedule your session.
    Kotlin Product Research Team 🙌
  • m

    Marian Schubert

    03/27/2025, 1:51 PM
    I'm getting the following error in a Kotlin notebook after we updated the project to Kotlin 2.1.20 (from 2.0.21):
    Class '...' was compiled with an incompatible version of Kotlin. The actual metadata version is 2.1.0, but the compiler version 1.9.0 can read versions up to 2.0.0.
    It seems that the Kotlin Notebook plugin is using the Kotlin 1.9 compiler. Is there any way to fix this problem?
  • e

    eenriquelopez

    04/08/2025, 2:13 PM
    I came across a paper discussing an experiment and tried to reproduce it. Here’s a brief summary:
    • Portfolio A: In a bull market, grows by 20%; in a bear market, drops by 20%.
    • Portfolio B: In a bull market, grows by 25%; in a bear market, drops by 35%.
    • Bull market probability: 75%.
    According to the paper, both portfolios should have a one-year expected return of 10%. However, the paper claims that Portfolio A wins over Portfolio B around 90% of the time. After running a Monte Carlo simulation (code attached), I found that Portfolio A outperforms Portfolio B around 66% of the time.
    Question: Am I doing something wrong in my simulation, or is the assumption in the original paper incorrect?
    import kotlin.random.Random
    
    // Simulation parameters
    val years = 30
    val simulations = 10000
    val initialInvestment = 1.0
    
    // Market probabilities (bull 75%, so bear is the remaining 25%)
    val bullProb = 0.75 // 75% for Bull markets
    
    // Portfolio returns
    val portfolioA = mapOf("bull" to 1.20, "bear" to 0.80)
    val portfolioB = mapOf("bull" to 1.25, "bear" to 0.65)
    
    // Function to simulate one portfolio run and return the accumulated return for each year
    fun simulatePortfolioAccumulatedReturns(returns: Map<String, Double>, rng: Random): List<Double> {
        var value = initialInvestment
        val accumulatedReturns = mutableListOf<Double>()
        
        repeat(years) {
            val isBull = rng.nextDouble() < bullProb
            val market = if (isBull) "bull" else "bear"
            value *= returns[market]!!
    
            // Calculate accumulated return for the current year
            val accumulatedReturn = (value - initialInvestment) / initialInvestment * 100
            accumulatedReturns.add(accumulatedReturn)
        }
        return accumulatedReturns
    }
    
    // Running simulations and storing accumulated returns for each year (for each portfolio)
    val rng = Random(System.currentTimeMillis())
    
    val accumulatedResults = (1..simulations).map {
        val accumulatedReturnsA = simulatePortfolioAccumulatedReturns(portfolioA, rng)
        val accumulatedReturnsB = simulatePortfolioAccumulatedReturns(portfolioB, rng)
        
        mapOf("Simulation" to it, "PortfolioA" to accumulatedReturnsA, "PortfolioB" to accumulatedReturnsB)
    }
    
    // Count the number of simulations where Portfolio A outperforms Portfolio B and vice versa
    var portfolioAOutperformsB = 0
    var portfolioBOutperformsA = 0
    accumulatedResults.forEach { result ->
        val accumulatedA = result["PortfolioA"] as List<Double>
        val accumulatedB = result["PortfolioB"] as List<Double>
    
        if (accumulatedA.last() > accumulatedB.last()) {
            portfolioAOutperformsB++
        } else {
            portfolioBOutperformsA++
        }
    }
    
    // Print the results
    println("Number of simulations where Portfolio A outperforms Portfolio B: $portfolioAOutperformsB")
    println("Number of simulations where Portfolio B outperforms Portfolio A: $portfolioBOutperformsA")
    println("Portfolio A outperformed Portfolio B in ${portfolioAOutperformsB.toDouble() / simulations * 100}% of simulations.")
    println("Portfolio B outperformed Portfolio A in ${portfolioBOutperformsA.toDouble() / simulations * 100}% of simulations.")
    👍 2
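One thing worth checking in the code above: each portfolio is simulated with its own, independent sequence of market draws, because `simulatePortfolioAccumulatedReturns` is called once per portfolio and draws from the generator each time. If the paper means that both portfolios live through the same market years, which is an assumption about the paper and not something stated in the post, the comparison changes. With a shared sequence, A ends ahead of B exactly when at most 25 of the 30 years are bull years, and for a Binomial(30, 0.75) count that has a probability of roughly 90%. The sketch below simulates that shared-market variant; the function name and the fixed seed are illustrative.

```kotlin
import kotlin.random.Random

// Fraction of simulated paths on which Portfolio A ends above Portfolio B when BOTH
// portfolios experience the same bull/bear sequence (one market draw per year).
fun shareWhereAWins(
    simulations: Int = 10_000,
    years: Int = 30,
    bullProb: Double = 0.75,
    rng: Random = Random(42),
): Double {
    var aWins = 0
    repeat(simulations) {
        var a = 1.0
        var b = 1.0
        repeat(years) {
            val isBull = rng.nextDouble() < bullProb // single draw shared by A and B
            a *= if (isBull) 1.20 else 0.80
            b *= if (isBull) 1.25 else 0.65
        }
        if (a > b) aWins++
    }
    return aWins.toDouble() / simulations
}
```

Calling `shareWhereAWins()` should give a value close to 0.90, in line with the paper's claim, whereas independent draws per portfolio compare two unrelated market histories and give a noticeably lower figure.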
  • p

    Paulo Cereda

    04/11/2025, 5:27 PM
    Hi friends! DataFrame question. 🙂 I have a .csv file in which one of the columns has a comma-separated string. I would like to split it and have the line replicated for each element. I have working code, but it's far from optimal. Code in thread. 🧵
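The operation described above (one output row per element of the comma-separated cell) can be sketched with the standard library alone. `Entry`, `tags`, and `explodeTags` are hypothetical names; DataFrame's own split operation with `intoRows()` appears to be the built-in counterpart, but the sketch below does not depend on it.

```kotlin
// Hypothetical row type; `tags` holds the comma-separated cell.
data class Entry(val name: String, val tags: String)

// Emits one copy of the row per element of the comma-separated cell.
fun explodeTags(rows: List<Entry>): List<Entry> =
    rows.flatMap { row ->
        row.tags.split(",").map { tag -> row.copy(tags = tag.trim()) }
    }
```

For example, a row `Entry("a", "x, y")` becomes the two rows `Entry("a", "x")` and `Entry("a", "y")`, while rows with a single element pass through unchanged.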
  • j

    Jason Zhao

    04/14/2025, 6:57 AM
    Hi, I have a question about the Kotlin notebook. Is it not possible to use kotlinx.serialization in the Kotlin notebook? I have a notebook that needs to load my application transiently to perform a task. The application loading fails with the following error when it gets to the config deserialization part. I initially thought it was an issue with version conflict with dependencies, but I excluded the dependencies in question from loading Kotlin Serialization and it is still occurring. Note that when I run the same application logic in a regular Kotlin main function, I don't get this error.
    java.lang.AbstractMethodError: Receiver class ((my class name))$$serializer does not define or inherit an implementation of the resolved method 'abstract kotlinx.serialization.KSerializer[] typeParametersSerializers()' of interface kotlinx.serialization.internal.GeneratedSerializer.
    Note: I am in K2 mode so the issue might be related to that. (On a side note variables are still not detected across cells in the K2 mode.)
  • u

    Ume Channel

    04/29/2025, 11:13 PM
    Good day! How can I fix the case where a large number such as 1,927,384,739 becomes 1.927384739E9? I don't want it to be expressed in E notation.
    pjTDYzyiTmEte21D_oIlIY6HL8t_5FH7f-Global Library Data.xlsx
  • u

    Ume Channel

    04/30/2025, 12:55 AM
    How do I include decimals in numbers when importing an xlsx file?
  • u

    Ume Channel

    04/30/2025, 10:35 PM
    Temporary solution for the E notation problem: map each row to a data class, read the value as a String, convert it to Double or null (using 0.0 for null), then turn it into a BigDecimal and rebuild the table from the result (I am still studying this process).
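The conversion chain described above can be sketched with the standard library as follows. `toPlainNumber` is a hypothetical helper name; the point of the sketch is that `BigDecimal.toPlainString()` never uses an exponent, unlike the default `Double` formatting.

```kotlin
import java.math.BigDecimal

// Parses a raw cell to Double (0.0 when missing or not numeric) and wraps it in a
// BigDecimal, whose toPlainString() never uses E notation.
fun toPlainNumber(raw: String?): String =
    BigDecimal.valueOf(raw?.toDoubleOrNull() ?: 0.0).toPlainString()
```

For instance, `toPlainNumber("1.927384739E9")` yields `1927384739`, and fractional inputs such as `"12.5"` keep their decimals.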
  • u

    Ume Channel

    05/01/2025, 6:21 PM
    Hello, my file has no column names in the first row (such as country, region, expenditures, total libraries, etc.); the first row directly contains values (such as Albania, Europe, 132312, 12312, etc.). When I query the first row, this happens (see image). Does there have to be a row of column names (titles) before the values?
  • k

    Karthick

    05/08/2025, 8:58 AM
    Does anybody know whether we can import variables or functions from one notebook into another notebook?