# dev
  • m

    Maytas Monsereenusorn

    06/15/2024, 1:01 AM
    Will having more GC metrics be useful for other people here? i.e.
    • Max size of old generation memory pool
    • Size of old generation memory pool after a full GC
    • gc promotion rate (Incremented for any positive increases in the size of the old generation memory pool before GC to after GC)
    • gc allocation rate (Incremented for the increase in the size of the young generation memory pool after one GC to before the next)
    • Pause time due to GC event
    • Time spent in concurrent phases of GC
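For readers wondering where such numbers could come from, here is a minimal sketch using only the standard `java.lang.management` beans. It is not Druid's metrics code: the class name, the metric keys, and the pool-name heuristic are all assumptions made for illustration. Note that `getCollectionUsage()` reports usage after the most recent collection of that pool, which is not necessarily a full GC, and that old-generation pool names differ between collectors.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

public class GcMetricsSketch {
    // Heuristic (assumption): matches names such as "G1 Old Gen", "PS Old Gen", "Tenured Gen".
    public static boolean isOldGenPool(String poolName) {
        return poolName.contains("Old Gen") || poolName.contains("Tenured");
    }

    // Takes a one-off snapshot; the metric keys are hypothetical, not Druid metric names.
    public static Map<String, Long> snapshot() {
        Map<String, Long> metrics = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP || !isOldGenPool(pool.getName())) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage != null) {
                metrics.put("oldGen/maxBytes", usage.getMax()); // -1 when the maximum is undefined
            }
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if unsupported for this pool
            if (afterGc != null) {
                metrics.put("oldGen/usedAfterGcBytes", afterGc.getUsed());
            }
        }
        long count = 0;
        long timeMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the value is undefined for a collector.
            count += Math.max(0, gc.getCollectionCount());
            timeMillis += Math.max(0, gc.getCollectionTime());
        }
        metrics.put("gc/count", count);
        metrics.put("gc/timeMillis", timeMillis);
        return metrics;
    }

    public static void main(String[] args) {
        snapshot().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Promotion and allocation rates, as described in the message, would need before/after values per GC event, which these cumulative beans do not expose directly; that part is left out of the sketch.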
  • k

    Kumar Basapuram

    07/02/2024, 10:00 AM
    Hello team, do we support compiling the latest Druid source code with JDK 11?
  • a

    Atul Mohan

    07/02/2024, 10:00 PM
    I think I see an issue while running segment metadata queries against realtime segments containing an HLLSketch column. The `errorMessage` field in the query result gives:
    cannot_merge_diff_types: [HLLSketch] and [HLLSketchBuild]
    The complex type for HLL is `HLLSketchBuild` in IncrementalIndexStorageAdapter, but for persisted segments the complex type is `HLLSketch`, and this causes a mismatch during the column analysis merge phase. Any ideas on how this can be fixed?
  • s

    Soman Ullah

    07/09/2024, 2:56 AM
    Is Druid via HTTP API more resilient than JDBC driver API? For higher qps, I often see the following failure from JDBC:
    2024-07-01T10:01:05,378 ERROR [qtp13497839-316] org.apache.druid.sql.avatica.DruidMeta - No such connection: eqq122-cvbbge-5564-lkjh8-4ffljfrsds
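For comparison, this is roughly what the HTTP route looks like from Java: each query is a single stateless POST to the `/druid/v2/sql` endpoint, so there is no server-side connection ID for the client to keep alive. The sketch below does not diagnose the error above. The class name, the `localhost:8888` address, and the 30 second timeout are assumptions, and the hand-rolled JSON escaping is only there to keep the example dependency-free.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class DruidSqlHttpSketch {
    // Minimal JSON string escaping so the example needs no JSON library.
    public static String jsonString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // Builds a POST request carrying {"query": "<sql>"} for the SQL endpoint.
    public static HttpRequest buildRequest(String baseUrl, String sql) {
        String body = "{\"query\":" + jsonString(sql) + "}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/druid/v2/sql"))
                .timeout(Duration.ofSeconds(30)) // assumed value
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumed address of a locally running Router; adjust for your cluster.
        HttpRequest request = buildRequest("http://localhost:8888", "SELECT 1");
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```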
  • k

    Krishna Thirumalasetty

    07/18/2024, 10:53 PM
    In the Druid Basic Cluster Tuning guide: https://druid.apache.org/docs/latest/operations/basic-cluster-tuning/#total-memory-usage under “Total Memory Usage” section, there is a statement:
    The Historical will use any available free system memory (i.e., memory not used by the Historical JVM and heap/direct memory buffers or other processes on the system) for memory-mapping of segments on disk.
    What does “Free System Memory” correlate to in the output of the `free -m` command? Is “Free System Memory” => “FREE” or “Cache/Buffer Memory”?
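As general Linux background rather than a Druid-specific answer: pages of memory-mapped files are held in the kernel page cache, which `free` reports under buff/cache, while the free column counts memory not used for anything. The sketch below parses `/proc/meminfo`-style text and computes the buff/cache figure the way procps does, to my understanding (`Buffers + Cached + SReclaimable`). The class and method names are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class MeminfoSketch {
    // Parses lines such as "MemFree:  1234 kB" into a field -> kB map.
    public static Map<String, Long> parseKb(String meminfo) {
        Map<String, Long> fields = new HashMap<>();
        for (String line : meminfo.split("\n")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2 && parts[0].endsWith(":")) {
                String key = parts[0].substring(0, parts[0].length() - 1);
                fields.put(key, Long.parseLong(parts[1]));
            }
        }
        return fields;
    }

    // The "buff/cache" column: page cache (including mmapped file pages) plus buffers and reclaimable slab.
    public static long buffCacheKb(Map<String, Long> fields) {
        return fields.getOrDefault("Buffers", 0L)
                + fields.getOrDefault("Cached", 0L)
                + fields.getOrDefault("SReclaimable", 0L);
    }

    public static void main(String[] args) throws Exception {
        // Linux only: /proc/meminfo does not exist on other platforms.
        Map<String, Long> fields = parseKb(Files.readString(Path.of("/proc/meminfo")));
        System.out.println("free kB:       " + fields.getOrDefault("MemFree", 0L));
        System.out.println("buff/cache kB: " + buffCacheKb(fields));
        System.out.println("available kB:  " + fields.getOrDefault("MemAvailable", 0L));
    }
}
```

Under this reading, segments that are already mapped show up in buff/cache, and the available column is the closer estimate of how much memory could still be used for mapping.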
  • s

    Soman Ullah

    07/23/2024, 9:30 PM
    Hello, I followed Step 2 from this Imply blog: https://imply.io/blog/upserts-and-data-deduplication-with-druid/ and found that the `latest` SQL command is slow (5-6 seconds). Any ideas on how to improve its performance?
  • a

    Abhishek Agarwal

    07/26/2024, 8:13 AM
    Cross-posting. Also, if you have an idea for a talk and need help in shaping the proposal, I will be happy to help.
  • j

    Jakob Riebe

    08/02/2024, 2:10 PM
    Hi Druid Team, Are there any plans to allow using the multiphase segment merging strategy (`IndexMergerV9.multiphaseMerge` - src) when publishing segments from stream ingestion (e.g. kafka)? This strategy can be configured in batch ingestion and compaction by setting `maxColumnsToMerge!=-1` but not for stream ingestion. I took a look at the relevant code section (StreamAppenderator.mergeAndPush) and it appears that this is already prepared:
    mergedFile = indexMerger.mergeQueryableIndex(
                indexes,
                schema.getGranularitySpec().isRollup(),
                schema.getAggregators(),
                schema.getDimensionsSpec(),
                mergedTarget,
                tuningConfig.getIndexSpec(),
                tuningConfig.getIndexSpecForIntermediatePersists(),
                new BaseProgressIndicator(),
                tuningConfig.getSegmentWriteOutMediumFactory(),
                tuningConfig.getMaxColumnsToMerge()  // <-- always -1 for stream ingestion tasks (default implementation in `AppenderatorConfig` is never overridden)
            );
    So basically this would "only" require making `maxColumnsToMerge` configurable in the respective `xxxTaskTuningConfig` for kafka/kinesis/rabbitmq/etc. and updating the UI (API/WebConsole). Are there any reasons against using multiphase merge in stream ingestion at all, or is this simply not (yet) implemented? Thanks in advance!
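As a rough illustration of what a column-budgeted, multi-phase merge plan looks like, here is a minimal Java sketch. It is not Druid's `IndexMergerV9.multiphaseMerge`: the class and method names are hypothetical, and indexes are represented only by their column counts. The rule that a phase always takes at least two segments follows my reading of the `maxColumnsToMerge` documentation for batch ingestion and is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiphaseMergeSketch {
    // Groups consecutive indexes (given as column counts) into merge groups for one phase.
    // A group is closed once adding the next index would exceed the column budget,
    // but only after it already holds at least two indexes.
    public static List<List<Integer>> planPhase(List<Integer> columnCounts, int maxColumnsToMerge) {
        List<List<Integer>> groups = new ArrayList<>();
        if (maxColumnsToMerge < 0) {
            // -1 means "no limit": everything is merged in a single pass.
            groups.add(new ArrayList<>(columnCounts));
            return groups;
        }
        List<Integer> current = new ArrayList<>();
        int totalColumns = 0;
        for (int columns : columnCounts) {
            if (current.size() >= 2 && totalColumns + columns > maxColumnsToMerge) {
                groups.add(current);
                current = new ArrayList<>();
                totalColumns = 0;
            }
            current.add(columns);
            totalColumns += columns;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Five persisted indexes with 10 columns each and a budget of 25 columns per merge.
        System.out.println(planPhase(List.of(10, 10, 10, 10, 10), 25));
    }
}
```

Each group would be merged into one intermediate index, and the planning step repeated on the intermediates until a single index remains. The point of the budget is to bound how many columns are open at once during a merge.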
  • g

    Gian Merlino

    08/07/2024, 8:51 PM
    Registration is open for Druid Summit 2024!! 🚀 It will be in-person this year, on October 22 in Redwood City, CA in the SF Bay Area. Registration is open here: https://druidsummit.org/. There is a ticket price, but I have some complimentary tickets to offer for folks here on Slack… DM me for a code 😄 I hope to see many of you there!
    🙌 1
  • h

    Hugh Evans

    08/14/2024, 3:47 PM
    Hi folks, Would anyone working on pydruid be up for me picking their brains about how we could potentially get some of the features we've developed for nice jupyter notebook integration contributed into pydruid? We've got our own version of the API internally within dev rel and it seems a shame to not contribute some of the stuff we've got to OSS - just wanted to check in as it looks like things might be quiet on the project at the moment
  • e

    Eyal Yurman

    08/15/2024, 4:22 PM
    The AWS SDK package we use (1.x) is deprecated, we need to migrate to the new package (2.x). More details here: https://github.com/apache/druid/issues/16903
  • e

    Eyal Yurman

    08/20/2024, 8:22 PM
    Before I go through each set of release notes... Do you know if version 0.12 segments in deep storage will be seamlessly read by version 30.0?
  • h

    Hardik Bajaj

    09/09/2024, 5:49 AM
    Hey Druid Team! I have some questions about `indexing-service`; the answers could help me fix this issue. I am not able to find how Druid ensures that replica tasks consuming from the same partitions do not get scheduled on the same workers. I don't find anything preventing it in TaskMaster, TaskRunner or WorkerSelectStrategy. I'm assuming that, as replica tasks are next to each other in TaskQueue, this would prevent them from being scheduled on the same workers (I might be wrong). I'm asking because the issue I pointed to gets triggered in the following situation: on a worker, Task -> A, Task Group -> G moves to PUBLISHING, and a new actively reading task -> B consuming from the same TaskGroup -> G gets scheduled on the same worker. Task A's StreamAppenderator threads `-appenderator-abandon` and `TASK]-publish` are then not able to terminate. Is there any affinity of these appenderator threads to the TaskGroup or partitions that prevents them from terminating? The probability of the issue occurring increases if replicas are increased and workers are decreased, and the active task fails when we reach this state. Any help on the open questions above would be appreciated. Thanks!
  • u

    Utkarsh Chaturvedi

    09/13/2024, 5:24 AM
    Hey Druids! Which profile should I build the project with so that it also packages the community extensions?
  • s

    Samarth Jain

    09/13/2024, 11:54 PM
    I was thinking of adding an emitter of sorts that would publish an array of datasource -> [columns used] for every Druid query executed. The idea ultimately is to figure out what all dimensions and metrics are not used so that we can tell users to remove them to ultimately reduce data size and improve query performance. I can see how this possibly could be extended to also include the query granularity, filter columns etc. What would be a good place to capture and publish this kind of information? We obviously want to do this after the query has been validated.
  • v

    Vaibhav Kumar

    09/16/2024, 6:32 PM
    Hello, I am new to the Druid community and looking for a good starter issue to contribute to. I could see a few on https://github.com/apache/druid/issues?q=is%3Aopen+is%3Aissue+label%3AStarter but I am not sure which would be the right one for me to pick up. Can someone help me with it?
  • c

    Courage Noko

    09/16/2024, 7:14 PM
    Hey Druid team! Imply and Spotify worked on a gRPC extension, we would like committers to review this PR. What is the general process for such requests?
  • h

    Hardik Bajaj

    09/24/2024, 6:27 PM
    Hey! Can someone please review this PR that is a potential fix for https://github.com/apache/druid/issues/16783 cc: @Amatya Avadhanula @kfaraz @Abhishek Agarwal
  • e

    Evan Rusackas

    09/27/2024, 7:43 PM
    Wondering if any Druid PMC members (or anyone particularly knowledgable regarding the history/use-cases/community/roadmaps around here) might be interested in joining a couple of us from the Apache Superset project on a "Designated Driver" podcast where we talk about databases and BI over a beer. DM me if interested!
  • k

    Karan Kumar

    10/02/2024, 2:40 AM
    @kfaraz Starting 🧵 to discuss https://github.com/apache/druid/pull/16889#discussion_r1783774321
  • s

    Shekhar Rajak

    10/14/2024, 6:15 PM
    Hi team, I was looking into https://github.com/druid-io/pydruid , do we have more documents for connecting to flink as source and have analytics using druid in python ?
  • a

    Ashwin Tumma

    10/17/2024, 5:02 AM
    Hi @Adarsh Sanjeev, Can you kindly help me review this PR https://github.com/apache/druid/pull/17362, for small addition to code coverage and fixing a code smell in Prometheus Emitter. Thanks!
    ✅ 1
  • s

    Shekhar Rajak

    10/19/2024, 12:34 AM
    Hi team, has anyone come across a similar error while building? https://github.com/apache/druid/issues/17375
  • s

    Shekhar Rajak

    10/22/2024, 4:37 AM
    Hi team, I am exploring the druid-iceberg extension using the Hive catalog: https://druid.apache.org/docs/latest/development/extensions-contrib/iceberg - can anyone point me to Hive catalog examples and to how we can query and load an Iceberg table by configuring the Hive (or REST) catalog? Thanks!
  • s

    Shekhar Rajak

    10/22/2024, 6:50 PM
    Hi team, Please help in testing the PR for glue catalog support: https://github.com/apache/druid/pull/17392
  • v

    victor regalado

    10/24/2024, 2:53 AM
    Hey i have a PR with a small change for pydruid client to enable support for MSQ engine on the db API. Can someone take a look ? Thank you
  • s

    Suraj Goel

    10/24/2024, 3:03 PM
    Hi Team. I have had a PR open for some days. Can someone please review it? TIA!
  • a

    Ashwin Tumma

    10/25/2024, 2:56 AM
    Hi, I have a small PR for documentation update; https://github.com/apache/druid/pull/17409 ; can someone help review it? Thanks!
  • k

    Kumar Basapuram

    11/05/2024, 4:54 AM
    Do we support `druid.emitter` for the `composing` type for `parametrized` with `dropwizard` together?
  • s

    Shekhar Rajak

    11/11/2024, 3:57 AM
    Hi team, do we have a direct way to connect to flink as Source ? I usually see examples flink->kafka->druid. Please share any reference/discussions on this.