# daft-dev
  • d

    Desmond Cheong

    05/22/2025, 11:07 PM
    @Kevin Wang https://github.com/Eventual-Inc/Daft/pull/4410 might help our poor github runners
  • c

    Cory Grinstead

    05/23/2025, 3:11 PM
    small PR to fix a regression in str.substr https://github.com/Eventual-Inc/Daft/pull/4415
  • r

    Robert Howell

    05/23/2025, 5:53 PM
    @Cory Grinstead little PR to add a prelude for the required ScalarUDF imports. https://github.com/Eventual-Inc/Daft/pull/4416
  • c

    Colin Ho

    05/27/2025, 5:17 PM
    Cutting release today, please list any blockers in the 🧵
  • s

    Srinivas Lade

    05/27/2025, 11:07 PM
    Very excited for this, can't wait to see the final move to logical plans so we can start refactoring the optimizer rules
  • r

    Robert Howell

    05/27/2025, 9:13 PM
    PR to enable scalar function lowering (e.g. overloads), which lets us do type-checking and then lowering in the planner rather than during evaluation, hence the 'dynamic' naming in the PR. https://github.com/Eventual-Inc/Daft/pull/4431
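To make the idea of "type-check, then lower in the planner" concrete, here is a minimal sketch in plain Python. It is not the code from the PR: the `OVERLOADS` registry, the `Call` class, and the `lower` function are hypothetical names, and types are represented as plain strings purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical registry: (function name, argument types) -> concrete implementation.
OVERLOADS: Dict[Tuple[str, Tuple[str, ...]], Callable] = {
    ("add", ("int", "int")): lambda a, b: a + b,          # numeric addition
    ("add", ("str", "str")): lambda a, b: f"{a}{b}",      # string concatenation
}


@dataclass(frozen=True)
class Call:
    """An unresolved function call as a planner might see it (illustrative only)."""
    name: str
    arg_types: Tuple[str, ...]


def lower(call: Call) -> Callable:
    """Type-check the call and pick its implementation at planning time.

    A mismatch fails here, before any data is evaluated.
    """
    try:
        return OVERLOADS[(call.name, call.arg_types)]
    except KeyError:
        raise TypeError(f"no overload of {call.name!r} for {call.arg_types}") from None


# Planning: resolve once. Evaluation: just call the resolved function.
add_ints = lower(Call("add", ("int", "int")))
result = add_ints(1, 2)
```

The point of the sketch is the ordering: an unsupported combination such as `("int", "str")` is rejected by `lower` during planning instead of surfacing as a runtime error per batch.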
  • r

    Robert Howell

    05/28/2025, 7:03 PM
    PR to create a Daft value from a JSON string!
    • reviewer: @Cory Grinstead
    • customer request: @NikkTheGreek via https://github.com/Eventual-Inc/Daft/issues/3994
    • PR: https://github.com/Eventual-Inc/Daft/pull/4438
    Example
    import daft
    from daft import DataType as dt  # the example refers to DataType as `dt`
    
    df = daft.from_pydict({
        "person": [
            '{"name": "Alice", "age": 30}',
            '{"name": "Bob", "age": 25}',
            '{"name": "Charlie", "age": 35}',
        ]
    })
    
    # STRUCT<name: STRING, age: BIGINT>
    person_type = dt.struct(
        {
            "name": dt.string(),
            "age": dt.int64(),
        }
    )
    
    df.select(df["person"].from_json(person_type)).show()
    
    ╭────────────────────────────────╮
    │ person                         │
    │ ---                            │
    │ Struct[name: Utf8, age: Int64] │
    ╞════════════════════════════════╡
    │ {name: Alice,                  │
    │ age: 30,                       │
    │ }                              │
    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
    │ {name: Bob,                    │
    │ age: 25,                       │
    │ }                              │
    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
    │ {name: Charlie,                │
    │ age: 35,                       │
    │ }                              │
    ╰────────────────────────────────╯
    
    (Showing first 3 of 3 rows)
    🙌 3
    ๐Ÿ™ 1
  • k

    Kevin Wang

    05/29/2025, 1:26 AM
    @Robert Howell PR to fully implement MemoryTable and MemoryCatalog: https://github.com/Eventual-Inc/Daft/pull/4445
    🙌 1
    🔥 3
  • r

    Robert Howell

    05/30/2025, 10:15 PM
    @Kevin Wang back at you! Here's a deprecated API cleanup, and I've added a CTE map to daft.sql which better fits the SQL model and allows us to deprecate then remove SQLCatalog. https://github.com/Eventual-Inc/Daft/pull/4460
  • g

    Garrett Weaver

    05/30/2025, 11:10 PM
    👋 I am querying Trino via Daft and running into some unexpected behavior. I have one query where I left join a table and one of the columns ends up having all null values. When written to Parquet, this column ends up with a type of null, but it should have type string. This is unexpected behavior, right?
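One plausible explanation for the symptom above, offered as an assumption and not as a confirmed diagnosis of Daft or Trino: if a column's type is inferred from its values rather than taken from the source schema, an all-null column gives the inference nothing to work with and falls back to a null type. The toy sketch below shows that effect in plain Python; `infer_column_type` and `resolve_type` are hypothetical helpers, not Daft APIs.

```python
from typing import Iterable, Optional


def infer_column_type(values: Iterable[object]) -> str:
    """Infer a column type from its non-null values (toy rules, not Daft's)."""
    seen = {type(v).__name__ for v in values if v is not None}
    if not seen:
        return "null"  # every value was None, so there is nothing to infer from
    if len(seen) == 1:
        return seen.pop()
    raise TypeError(f"mixed value types in column: {sorted(seen)}")


def resolve_type(values: Iterable[object], declared: Optional[str] = None) -> str:
    """Prefer a declared schema type; fall back to inference only without one."""
    return declared if declared is not None else infer_column_type(values)


all_null = [None, None, None]
inferred = infer_column_type(all_null)            # falls back to the null type
declared = resolve_type(all_null, declared="str")  # schema wins over inference
```

If that is what is happening here, carrying the declared type through the join, or casting the column to string before writing, would be the kind of fix to look for.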
  • r

    Robert Howell

    06/02/2025, 11:13 PM
    @Kevin Wang @Cory Grinstead PR for a flattened .jq method. https://github.com/Eventual-Inc/Daft/pull/4470
  • c

    Cory Grinstead

    06/03/2025, 6:37 PM
    small PR to get daft-dashboard working without errors again. https://github.com/Eventual-Inc/Daft/pull/4475 Note that the visualization is completely disabled for now until we can come up with a better visualization than the ugly and broken mermaid viz.
  • k

    Kevin Wang

    06/03/2025, 11:13 PM
    @jay PR ready for UC volumes support!! I'll get a demo notebook for you soon as well https://github.com/Eventual-Inc/Daft/pull/4476
    🙌 1
  • c

    Cory Grinstead

    06/04/2025, 4:02 PM
    Expr refactor is officially done after this PR is merged! https://github.com/Eventual-Inc/Daft/pull/4480
    🙌 1
  • g

    Giridhar Pathak

    06/04/2025, 7:03 PM
    list type column iteration/processing 🧵
  • g

    Giridhar Pathak

    06/04/2025, 7:12 PM
    isinstance() checks on data frame column values. 🧵
  • g

    Garrett Weaver

    06/05/2025, 5:17 PM
    probably missing this in the docs, how can I get the datatype of the elements in a list type? nvm, .dtype
  • c

    Cory Grinstead

    06/05/2025, 5:40 PM
    FYI, I have a few PRs I'd like to get reviewed. https://github.com/Eventual-Inc/Daft/pulls?q=is%3Aopen+is%3Apr+author%3Auniversalmind303+-is%3Adraft
  • c

    Cory Grinstead

    06/05/2025, 10:16 PM
    PR to do a little bit of performance cleanup on the expressions (ScalarFunction). https://github.com/Eventual-Inc/Daft/pull/4489 cc @Robert Howell I think this moves us a bit closer to your intended usage of the ScalarFunctionFactory, as the ScalarFunction is no longer directly storing the ScalarUDF, but instead resolving it during planning.
  • k

    Kevin Wang

    06/11/2025, 1:41 AM
    @Sammy Sidhu PR to update our AWS crates. Will work on removing our dependency/vendoring of openssl as a next step https://github.com/Eventual-Inc/Daft/pull/4508
  • c

    Cory Grinstead

    06/11/2025, 9:17 PM
    Could I get a quick review on this PR? It's to help debug an issue a daft spark user is facing https://github.com/Eventual-Inc/Daft/pull/4521
  • c

    Cory Grinstead

    06/11/2025, 11:49 PM
    could I also get a review on this one here https://github.com/Eventual-Inc/Daft/pull/4524 related to the same issue as 4521. cc @Kevin Wang
  • e

    Everett Kleven

    06/12/2025, 7:24 PM
    Quick heads up: Major Service outage on GCP -> No claude.
    🥲 2
  • g

    Giridhar Pathak

    06/17/2025, 12:40 AM
    any plans to add a Dataframe.write_json() function to complement the write_csv() and write_parquet() etc?
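As a possible stopgap for the request above, rows can be written as JSON Lines with the standard library alone. This is a sketch under assumptions: `write_json_lines` is a hypothetical helper, not a Daft API, and it assumes the rows are already available in Python as a list of dicts with JSON-serializable values.

```python
import json
from typing import Iterable, Mapping


def write_json_lines(rows: Iterable[Mapping[str, object]], path: str) -> int:
    """Write one JSON object per line and return the number of rows written."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            # ensure_ascii=False keeps non-ASCII text readable in the output file
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
            count += 1
    return count
```

This materializes rows in the Python process, so it only suits data that fits in memory; a built-in writer would presumably stream and partition the output the way the CSV and Parquet writers do.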
  • z

    Zhiping Wu

    06/20/2025, 3:22 AM
    Hi, may I check whether we have a GitHub workflow bot that accepts a rerun command to rerun failed CI? An example from Hudi is shown in the picture below. Otherwise, how can I rerun/re-trigger the failed CI without submitting a new change?
  • x

    Xianyang Liu

    06/23/2025, 8:12 AM
    Hi, has anybody met a problem such as the following when debugging with IntelliJ? Seems like the io module shadows the built-in io.
  • e

    Everett Kleven

    06/24/2025, 1:06 AM
    Working on an issue focused on arrow schema mismatches. Is there any known work related to this besides: https://github.com/Eventual-Inc/Daft/issues/3605 https://github.com/Eventual-Inc/Daft/issues/1958
  • r

    Robert Howell

    06/25/2025, 11:22 PM
    @Desmond Cheong @Srinivas Lade here's a PR that enables us to push down filters into LanceDB. https://github.com/Eventual-Inc/Daft/pull/4616
    🙌 2
  • m

    Matthew Powers

    06/26/2025, 12:21 AM
    Does Daft support geospatial types now? Any posts/info where I can learn more?
  • x

    Xin Xianyin

    06/30/2025, 12:37 PM
    Hello, I'm a new developer. It seems Daft uses both pyproject.toml and requirements.txt to manage dependencies. What's the relationship between the two? Why don't we use only pyproject.toml and uv to manage dependencies?