# troubleshooting
  • s

    sandy k

    07/13/2025, 8:53 AM
    Need some suggestions: 1) We have a huge number of segment versions in the Postgres metadata DB: roughly 20 million unused segment rows, about 70 GB. 2) The Overlord is crashing due to heap issues, with 1000+ supervisors. 3) We also have to clean up deep storage of unused segments, around XXTB. The primary recommendation is to delete the metadata and implement kill tasks. Deleting from the Postgres DB manually would be a 45-60 minute activity, done in batches. Because the Overlord is crashing, one option is to stop the supervisors, delete all unused segment metadata from the DB, and do the S3 deletion separately later. After the metadata cleanup, the kill properties would be enabled. Which approach is advisable: 1) enable the kill properties and let the master nodes do the cleanup, even though they are unstable and crashing, or 2) do the manual DB deletion first, see how it performs, then add the kill properties and do the S3 deletion manually later? Please suggest.
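    For reference, automatic cleanup of unused segments is normally driven by Coordinator kill properties along the lines of the sketch below; the values shown are illustrative assumptions, not recommendations for this cluster:
    # Sketch: Coordinator auto-kill of unused segments (values are illustrative)
    druid.coordinator.kill.on=true
    druid.coordinator.kill.period=PT1H            # how often kill tasks are submitted
    druid.coordinator.kill.durationToRetain=P30D  # how long unused segments are retained before being killed
    druid.coordinator.kill.maxSegments=1000       # max segments per kill task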
  • a

    Aqsha Padyani

    07/15/2025, 7:41 AM
    Hi, everyone! Currently I have a supervisor set up to ingest user update events from Kafka to Druid with DAY segment granularity and these example dimensions:
    Copy code
    "dimensionsSpec": {
      "dimensions": [
        {"type": "string", "name": "user_id", "multiValueHandling": "SORTED_ARRAY", "createBitmapIndex": true},
        {"type": "string", "name": "phone_number", "multiValueHandling": "SORTED_ARRAY", "createBitmapIndex": true},
        {"type": "string", "name": "email_address", "multiValueHandling": "SORTED_ARRAY", "createBitmapIndex": true}
      ]
    }
    I'm trying to set up compaction on that datasource that compacts segments into MONTH granularity and only stores the latest entry for each customer in that month:
    Copy code
    "dimensionsSpec": {
      "dimensions": [
        {"type": "string", "name": "user_id", "multiValueHandling": "SORTED_ARRAY", "createBitmapIndex": true}
      ]
    },
    "metricsSpec": [
      {"type": "stringLast", "name": "phone_number", "fieldName": "phone_number", "timeColumn": "__time", "maxStringBytes": 1024},
      {"type": "stringLast", "name": "email_address", "fieldName": "email_address", "timeColumn": "__time", "maxStringBytes": 1024}
    ]
    I found out that the metricsSpec stores the aggregated data in a COMPLEX<serializablePairLongString> type, which is different to the new/un-compacted data:
    Copy code
    {
      "lhs": 1721882238000,
      "rhs": "+6281234567890"
    }
    Queries with aggregations like LATEST() still work fine, but retrieving the data with something like SELECT * produces an error:
    Copy code
    Cannot coerce field [phone_number] from type [java.util.LinkedHashMap] to type [VARCHAR]
    I imagine transformSpec.transforms can be used to transform those to strings, but AFAIK that config is not supported in compaction. Is there any better implementation for this "latest entry of each customer" that keeps the data type the same between newly-ingested and compacted data? Or is this "the best way" to implement it, and I should change the query from SELECT * to something else?
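    For illustration only, one way around the SELECT * coercion error is to finalize the stringLast columns explicitly with LATEST aggregations, so the COMPLEX<serializablePairLongString> values come back as strings; the datasource name below is a placeholder:
    -- Sketch: aggregate explicitly instead of selecting the raw complex columns
    SELECT
      user_id,
      LATEST(phone_number, 1024)  AS phone_number,
      LATEST(email_address, 1024) AS email_address
    FROM "user_updates"   -- placeholder datasource name
    GROUP BY user_id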
  • j

    JRob

    07/15/2025, 2:56 PM
    After upgrading Druid from 29.0.1 to 33.0.0 we are getting a lot of failed queries due to requireTimeCondition. Sample query:
    Copy code
    WITH sample_data AS (
      SELECT
        TIME_FLOOR("__time", 'PT5M') AS time_bucket,
        SUM("count") AS volume
      FROM "datasource"
      AND "__time" > CURRENT_TIMESTAMP - INTERVAL '1' day
      GROUP BY 1
    )
    
    SELECT
      time_bucket AS window_end_time,
      TIME_SHIFT(time_bucket, 'PT30M', -1) AS window_start_time,
      SUM(volume) OVER (
        ORDER BY time_bucket
        ROWS BETWEEN 5 PRECEDING AND CURRENT ROW
      ) AS rolling_volume
    FROM sample_data
    I would expect that requireTimeCondition should only apply to datasource queries and not all queries, yes? Is the solution to simply abandon requireTimeCondition? What other guards can I put in place for bad queries?
  • t

    Tanay Maheshwari

    07/16/2025, 6:18 AM
    Hi Team, I am new to Druid. I am running Druid on Kubernetes in GCP. My Historical pods run on NVMe-disk nodes. Recently when I tried to deploy Historicals, the new pods started restarting. I believe it's due to a health probe timeout, since the logs showed around 64K segments being loaded. In response, I deleted the node (could not find a PVC), after which a new pod came up. I did the same for multiple pods, which led to data skew across the Historical pods. I want your help to understand the following: 1. How could I have handled the situation better? 2. How can I prevent the Coordinator from sending more segments to a Historical pod when I see its disk is getting full? 3. Why does the Historical pod load all segments on startup? Shouldn't this be done after the health check has passed?
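    For context, a common pattern (sketched below under the assumption that the chart exposes probe settings and the Historical listens on port 8083) is to give the pod a generous startup probe on /status/health and gate readiness on the Historical's segment-loading endpoint, so Kubernetes does not restart the pod while it is still bootstrapping its segment cache:
    # Sketch only: probe values are illustrative, adjust to your segment count and disk speed
    startupProbe:
      httpGet:
        path: /status/health                    # process is up
        port: 8083
      periodSeconds: 30
      failureThreshold: 60                      # allow up to ~30 minutes for startup
    readinessProbe:
      httpGet:
        path: /druid/historical/v1/readiness    # returns 200 only once cached segments are loaded
        port: 8083
      periodSeconds: 30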
  • n

    Nir Bar On

    07/16/2025, 11:08 AM
    Hi, I built druid-25.0.0 from source and created an ARM Docker image. Looking inside the container's "extensions" directory, the "statsd-emitter" plugin is missing. 1 - Should the tar.gz produced by the build contain this plugin? 2 - If the plugin is not packaged by the Maven build, what do I need to do to have it inside the container? 3 - Can I somehow install it at Docker build time?
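    For reference, statsd-emitter is a contrib extension, so it is not bundled in the core distribution by default; one way to add it at image build time is Druid's pull-deps tool. A Dockerfile sketch, with the install path and version as assumptions:
    # Sketch: fetch the contrib extension into the image's extensions directory
    RUN cd /opt/druid && \
        java -cp "lib/*" org.apache.druid.cli.Main tools pull-deps \
          --no-default-hadoop \
          -c "org.apache.druid.extensions.contrib:statsd-emitter:25.0.0"
    # Remember to add "statsd-emitter" to druid.extensions.loadList as well.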
  • k

    Konstantinos Chaitas

    07/16/2025, 3:17 PM
    Hi everyone! I have a few near real-time ingestions from Kafka into Druid, each landing in separate datasources. I would like to combine all of this data, similar to how you would use a view in a traditional SQL system. Right now, I'm using the FROM TABLE(APPEND(...)) approach, but I would prefer to hide that complexity from end users. Also, some of the UI tools we are using request a single datasource as an input. Is there a way to create a view in Druid, or alternatively, to streamline the data from multiple datasources into a single, unified datasource? Thanks
  • v

    Victoria

    07/16/2025, 5:18 PM
    Hey everyone. I have some issues with data ingestion from another region. My cluster and deep storage are in eu-central-1. To make it work, I had to override aws.region=eu-central-1 via a JVM system property for all services. However, now I cannot seem to ingest data from us-east-1 buckets. It throws the error:
    Copy code
    Failed to sample data: java.io.IOException: com.amazonaws.services.s3.model.AmazonS3Exception: The bucket is in this region: us-east-1. Please use this region to retry the request (Service: Amazon S3; Status Code: 301; Error Code: PermanentRedirect;
    I tried to use the endpointConfig in the spec, but still without success. Has anyone run into the same issue? (We're using Druid 33.0.0.)
    Copy code
    "ioConfig": {
          "type": "index_parallel",
          "inputSource": {
            "type": "s3",
            "endpointConfig": {
              "url": "<http://s3.us-east-1.amazonaws.com|s3.us-east-1.amazonaws.com>",
              "signingRegion": "us-east-1"
            },
            "uris": [
              "<s3://x-us-east-1-dev-polaris/segment_events/designer/page/data/processing_date_day=2023-01-01/event_date_day=2022-12-31/00000-306-a018ab59-9017-4b34-8a8a-858de89ee6b7-0-00002.parquet>"
            ]
          }
  • t

    Tanay Maheshwari

    07/16/2025, 7:43 PM
    Hi team, I upgraded my Druid cluster from 27.0.0 to 32.0.0, after which lookups have started failing. This is the log I am getting in the Historical pods -
    Copy code
    2025-07-16T19:38:26,278 WARN [qtp1182725120-124] org.apache.druid.query.lookup.LookupUtils - Lookup [os_lookup] could not be serialized properly. Please check its configuration. Error: Cannot construct instance of `org.apache.druid.query.lookup.namespace.JdbcExtractionNamespace`, problem: java.lang.ClassNotFoundException: org.postgresql.Driver
    I am using the "postgresql-metadata-storage" and "mysql-metadata-storage" extensions. In the postgresql-metadata-storage extension folder I have the following jars: checker-qual-3.42.0.jar, postgresql-42.7.2.jar, postgresql-metadata-storage-32.0.0.jar. After checking online I also added mysql-connector-j-8.2.0.jar to the mysql-metadata-storage extension folder. I am still getting this error. Any help in debugging would be appreciated.
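    For context, JDBC lookups (JdbcExtractionNamespace) load the driver through the lookup extension's classloader, not through the metadata-storage extensions; a commonly suggested workaround (an assumption to verify against your deployment) is to make the PostgreSQL driver jar visible to that extension or to the common classpath:
    # Sketch: copy the JDBC driver next to the lookup extension (paths assumed), then restart
    cp /opt/druid/extensions/postgresql-metadata-storage/postgresql-42.7.2.jar \
       /opt/druid/extensions/druid-lookups-cached-global/
    # Alternatively, place the jar under lib/ so it is on the common classpath.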
  • n

    Nir Bar On

    07/17/2025, 11:23 AM
    Hey, getting this error when running druid-25 on ARM (Historical) when org/apache/druid/java/util/metrics/SysMonitor.java is enabled, as it depends on Sigar: Error in custom provider, java.lang.UnsatisfiedLinkError: 'void org.hyperic.sigar.SigarLog.setLevel(org.hyperic.sigar.Sigar, int)'. When dropping SysMonitor from the monitors list, there are no errors. Is there some alternative to SysMonitor that does not depend on Sigar?
  • t

    Tanay Maheshwari

    07/18/2025, 12:18 PM
    Hi Team, has there been a change in any version after 27.0.0 that fixes compaction task failures caused by a failure in a subtask (like partial_index_generate or partial_dimension_cardinality), where the peon fails to start because an intermediary file is not present?
  • j

    jakubmatyszewski

    07/21/2025, 7:50 AM
    Hey 👋 I've been trying to better understand Historical segment loading, since I have one cluster which does a cold boot once a week and I'd like to optimize loading segments from S3.
    Copy code
    druid.server.http.numThreads=43
    druid.segmentCache.numLoadingThreads=20
    druid.segmentCache.numBootstrapThreads=40
    I wonder whether setting these values so high makes any sense - I see that for numLoadingThreads the default is max(1, number of cores / 6), and in my case it is allowed to have 11 cores. Do you have any recommendations for a case like this?
  • e

    Eyal Yurman

    07/21/2025, 10:23 PM
    "In clusters with very high segment counts, it can make sense to separate the Coordinator and Overlord services to provide more resources for the Coordinator's segment balancing workload." (https://druid.apache.org/docs/latest/design/architecture/) What constitutes a very high segment count?
  • n

    Nir Bar On

    07/28/2025, 1:36 PM
    Hey, running Druid 27.0.0, executing a query over a 1-month interval with a single split. After all the configuration changes I made, the Broker crashes and restarts with an out-of-memory exception. The setup: host memory - 16g; JAVA_OPTS - "-Xms8g -Xmx8g -XX:MaxDirectMemorySize=4g"; druid_processing_directMemorySize - 2500000000 (2.5G)
    Copy code
    druid_query_groupBy_maxResults=500000
    druid_query_groupBy_maxIntermediateRows=1000000
    druid_query_groupBy_maxMergingDictionarySize=268435456
    What can be the cause of the Broker crashing? How can I troubleshoot this to figure out what I need to do to fix it? Could it be that the Broker is not using direct memory and instead keeps using heap memory? Broker status payload: "memory": { "maxMemory": 8589934592, "totalMemory": 8589934592, "freeMemory": 6974955008, "usedMemory": 1614979584, "directMemory": 4294967296 }
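    For reference, the Broker's direct memory requirement is tied to its processing buffers rather than to the heap; a rough sizing sketch with illustrative values (not this cluster's actual settings):
    # Sketch: direct memory must cover (numThreads + numMergeBuffers + 1) processing buffers
    druid.processing.buffer.sizeBytes=250000000   # ~250 MB per buffer (illustrative)
    druid.processing.numThreads=7
    druid.processing.numMergeBuffers=2
    # required -XX:MaxDirectMemorySize >= (7 + 2 + 1) * 250 MB = 2.5 GB
    # groupBy intermediate rows and result merging on the Broker still use heap memory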
  • t

    Tanay Maheshwari

    07/28/2025, 1:47 PM
    I am running a ZooKeeper-based Druid cluster. Today my Coordinator leader suddenly restarted (I could not find anything in the logs). Then queries started failing, which was fixed by restarting the Broker pods. Then ingestion and compaction tasks started failing with this error -
    Copy code
    2025-07-28T12:48:46,057 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Exception while running task[AbstractTask{id='index_parallel_supply_view_dohnekmm_2025-07-28T12:48:42.004Z', groupId='index_parallel_supply_view_dohnekmm_2025-07-28T12:48:42.004Z', taskResource=TaskResource{availabilityGroup='index_parallel_supply_view_dohnekmm_2025-07-28T12:48:42.004Z', requiredCapacity=1}, dataSource='supply_view', context={forceTimeChunkLock=true, useLineageBasedSegmentAllocation=true}}]
    java.lang.ClassCastException: class java.lang.Object cannot be cast to class org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseParallelIndexTaskRunner (java.lang.Object is in module java.base of loader 'bootstrap'; org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseParallelIndexTaskRunner is in unnamed module of loader 'app')
    	at org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask.doGetRowStatsAndUnparseableEvents(ParallelIndexSupervisorTask.java:1786) ~[druid-indexing-service-32.0.0.jar:32.0.0]
    	at org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask.getTaskCompletionUnparseableEvents(ParallelIndexSupervisorTask.java:1271) ~[druid-indexing-service-32.0.0.jar:32.0.0]
    	at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.buildIngestionStatsTaskReport(AbstractBatchIndexTask.java:985) ~[druid-indexing-service-32.0.0.jar:32.0.0]
    	at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.buildIngestionStatsAndContextReport(AbstractBatchIndexTask.java:950) ~[druid-indexing-service-32.0.0.jar:32.0.0]
    	at org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask.getTaskCompletionReports(ParallelIndexSupervisorTask.java:1254) ~[druid-indexing-service-32.0.0.jar:32.0.0]
    	at org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask.updateAndWriteCompletionReports(ParallelIndexSupervisorTask.java:1276) ~[druid-indexing-service-32.0.0.jar:32.0.0]
    	at org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask.runSinglePhaseParallel(ParallelIndexSupervisorTask.java:681) ~[druid-indexing-service-32.0.0.jar:32.0.0]
    	at org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask.runTask(ParallelIndexSupervisorTask.java:551) ~[druid-indexing-service-32.0.0.jar:32.0.0]
    	at org.apache.druid.indexing.common.task.AbstractTask.run(AbstractTask.java:179) ~[druid-indexing-service-32.0.0.jar:32.0.0]
    	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:478) [druid-indexing-service-32.0.0.jar:32.0.0]
    	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:450) [druid-indexing-service-32.0.0.jar:32.0.0]
    	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131) [guava-32.0.1-jre.jar:?]
    	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:75) [guava-32.0.1-jre.jar:?]
    	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82) [guava-32.0.1-jre.jar:?]
    	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
    	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
    	at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
    This was fixed by restarting the Overlord, but I am unable to explain this behaviour. Is anyone aware of this type of issue?
  • g

    Glenn Huang

    07/30/2025, 1:02 PM
    I have configured TaskSlotCountStatsMonitor on the Overlord node, but I'm not seeing any metrics related to task slots or worker availability (e.g., taskSlot/total/count, taskSlot/used/count, etc.). Any help is appreciated. Thanks in advance!
    Environment:
    • Platform: Azure AKS
    • Druid Version: 31.0.2
    Overlord startup log and configuration (sensitive info masked):
    overlord_log
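    For reference, the monitor only emits metrics if it is listed in druid.monitoring.monitors on the Overlord and an emitter is configured; a minimal sketch (the emitter choice is an assumption):
    # Sketch: Overlord runtime.properties
    druid.monitoring.monitors=["org.apache.druid.server.metrics.TaskSlotCountStatsMonitor"]
    druid.emitter=http                       # whichever emitter the cluster actually uses
    druid.monitoring.emissionPeriod=PT1M     # default emission period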
  • e

    Eyal Yurman

    08/04/2025, 7:18 PM
    Hello, what could be the reason that my Kafka supervisor creates really small segments? I checked my tuning spec and I think the values are fine, but when checking segment size by querying sys.segments, the segment size in rows and bytes is really low.
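    For context, with streaming ingestion the published segment size is bounded by how many rows each task accumulates per time chunk before handoff; a sketch of the tuning knobs usually involved, with illustrative values:
    "tuningConfig": {
      "type": "kafka",
      "maxRowsPerSegment": 5000000,
      "maxTotalRows": 20000000
    }
    Segment size also depends on taskDuration relative to segmentGranularity and on the number of Kafka partitions/tasks, since each task publishes its own segments per time chunk; small streaming segments are commonly consolidated afterwards via auto-compaction.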
  • t

    Tanay Maheshwari

    08/06/2025, 4:34 AM
    Class-not-found exception when using zstd compression in indexSpec in Druid 32.0.0. Full stack trace -
    Copy code
    ERROR [qtp1115073856-99] com.sun.jersey.spi.container.ContainerResponse - The exception contained within MappableContainerException could not be mapped to a respons
    java.lang.NoClassDefFoundError: Could not initialize class com.github.luben.zstd.Zstd                                                                                                       
            at org.apache.druid.segment.data.CompressionStrategy$ZstdDecompressor.decompress(CompressionStrategy.java:425) ~[druid-processing-32.0.0.jar:32.0.0]                                
            at org.apache.druid.segment.data.DecompressingByteBufferObjectStrategy.fromByteBuffer(DecompressingByteBufferObjectStrategy.java:74) ~[druid-processing-32.0.0.jar:32.0.0]  
    
    
    Caused by: java.lang.ExceptionInInitializerError: Exception java.lang.ExceptionInInitializerError: Cannot unpack libzstd-jni-1.5.2-3: No such file or directory [in thread "qtp1115073856-12
            at java.base/java.io.UnixFileSystem.createFileExclusively(Native Method) ~[?:?]                                                                                                     
            at java.base/java.io.File.createTempFile(File.java:2170) ~[?:?]                                                                                                                     
            at com.github.luben.zstd.util.Native.load(Native.java:99) ~[zstd-jni-1.5.2-3.jar:1.5.2-3]                                                                                           
            at com.github.luben.zstd.util.Native.load(Native.java:55) ~[zstd-jni-1.5.2-3.jar:1.5.2-3]                                                                                           
            at com.github.luben.zstd.Zstd.<clinit>(Zstd.java:13) ~[zstd-jni-1.5.2-3.jar:1.5.2-3]                                                                                                
            at org.apache.druid.segment.data.CompressionStrategy$ZstdDecompressor.decompress(CompressionStrategy.java:425) ~[druid-processing-32.0.0.jar:32.0.0]                                
            at org.apache.druid.segment.data.DecompressingByteBufferObjectStrategy.fromByteBuffer(DecompressingByteBufferObjectStrategy.java:74) ~[druid-processing-32.0.0.jar:32.0.0]          
            at org.apache.druid.segment.data.DecompressingByteBufferObjectStrategy.fromByteBuffer(DecompressingByteBufferObjectStrategy.java:30) ~[druid-processing-32.0.0.jar:32.0.0]          
            at org.apache.druid.segment.data.GenericIndexed$BufferIndexed.get(GenericIndexed.java:593) ~[druid-processing-32.0.0.jar:32.0.0]                                                    
            at org.apache.druid.segment.data.BlockLayoutColumnarLongsSupplier$1.loadBuffer(BlockLayoutColumnarLongsSupplier.java:97) ~[druid-processing-32.0.0.jar:32.0.0]                      
            at org.apache.druid.segment.data.BlockLayoutColumnarLongsSupplier$1.get(BlockLayoutColumnarLongsSupplier.java:84) ~[druid-processing-32.0.0.jar:32.0.0]                             
            at org.apache.druid.segment.column.LongsColumn.getLongSingleValueRow(LongsColumn.java:77) ~[druid-processing-32.0.0.jar:32.0.0]                                                     
            at org.apache.druid.segment.QueryableIndexTimeBoundaryInspector.populateMinMaxTime(QueryableIndexTimeBoundaryInspector.java:91) ~[druid-processing-32.0.0.jar:32.0.0]               
            at org.apache.druid.segment.QueryableIndexTimeBoundaryInspector.getMinTime(QueryableIndexTimeBoundaryInspector.java:62) ~[druid-processing-32.0.0.jar:32.0.0]                       
            at org.apache.druid.segment.TimeBoundaryInspector.getMinMaxInterval(TimeBoundaryInspector.java:53) ~[druid-processing-32.0.0.jar:32.0.0]                                            
            at org.apache.druid.server.coordination.ServerManager.buildAndDecorateQueryRunner(ServerManager.java:304) ~[druid-server-32.0.0.jar:32.0.0]                                         
            at org.apache.druid.server.coordination.ServerManager.buildQueryRunnerForSegment(ServerManager.java:257) ~[druid-server-32.0.0.jar:32.0.0]                                          
            at org.apache.druid.server.coordination.ServerManager.lambda$getQueryRunnerForSegments$2(ServerManager.java:208) ~[druid-server-32.0.0.jar:32.0.0]
  • h

    Harsha Vardhan

    08/06/2025, 3:50 PM
    Hi, I am trying to test out the fixed-buckets histogram. My ingestion spec looks something like below:
    Copy code
    {
      "type": "index_parallel",
      "spec": {
        "ioConfig": {
          "type": "index_parallel",
          "inputSource": {
            "type": "inline",
            "data": "time,session_id,session_duration,country,device_type,timestamp\n2025-08-01T00:00:00,session_0,37,FR,tablet,2025-08-01 00:00:00\n2025-08-01T00:01:00,session_1,240,DE,desktop,2025-08-01 00:01:00\n2025-08-01T00:02:00,session_2,105,BR,tablet,2025-08-01 00:02:00"
          },
          "inputFormat": {
            "type": "csv",
            "findColumnsFromHeader": true
          },
          "appendToExisting": false
        },
        "tuningConfig": {
          "type": "index_parallel",
          "partitionsSpec": {
            "type": "hashed"
          },
          "forceGuaranteedRollup": true,
          "totalNumMergeTasks": 1
        },
        "dataSchema": {
          "dataSource": "buceket_testing",
          "timestampSpec": {
            "column": "time",
            "format": "iso"
          },
          "dimensionsSpec": {
            "dimensions": [
              {
                "name": "device_type",
                "type": "string"
              }
            ]
          },
          "granularitySpec": {
            "queryGranularity": "hour",
            "rollup": true,
            "segmentGranularity": "hour"
          },
          "metricsSpec": [
            {
              "name": "count",
              "type": "count"
            },
            {
              "name": "sessions_bucket",
              "type": "fixedBucketsHistogram",
              "fieldName": "duration",
              "lowerLimit": 0,
              "upperLimit": 100,
              "numBuckets": 10,
              "outlierHandlingMode": "overflow"
            },
            {
              "name": "theta_session_id",
              "type": "thetaSketch",
              "fieldName": "session_id"
            }
          ],
          "transformSpec": {
            "transforms": [
              {
                "type": "expression",
                "name": "duration",
                "expression": "cast(\"session_duration\" ,'long')"
              }
            ]
          }
        }
      }
    }
    My use case is finding how many sessions fall into each bucket, e.g. 0-10: 5 sessions, 10-20: 1 session, etc. I am unable to query the datasource to achieve this. Can someone help?
  • p

    Przemek

    08/13/2025, 9:09 AM
    Hi, I'm trying to use MM-less Druid in K8s, but I have an issue with partial_index_generic_merge tasks - they are unable to load segments and I see this in the logs:
    Copy code
    2025-08-08T15:48:18,234 WARN [Segment-Bootstrap-0] org.apache.druid.segment.loading.StorageLocation - Segment[Golf_Gold_GolfCommentary_2024-05-29T00:00:00.000Z_2024-05-30T00:00:00.000Z_2024-05-30T23:16:44.708Z:92,692] too large for storage[/opt/druid/var/tmp/persistent/task/broadcast/segments:-1]. Check your druid.segmentCache.locations maxSize param
    which would mean that availableSizeBytes returns -1. I have druid.segmentCache.locations and druid.server.maxSize set:
    Copy code
    druid.segmentCache.locations: '[{"path":"/opt/druid/var/data/segments", "maxSize":1500000000000}]'
    druid.server.maxSize: "1500000000000"
    but the logs say the segment is too large for storage[/opt/druid/var/tmp/..., which is set in the Historical config as
    Copy code
    druid.processing.tmpDir: "/opt/druid/var/tmp"
    How are these configs correlated? I also have the same path used for the peons:
    Copy code
    druid.indexer.fork.property.druid.processing.tmpDir: "/opt/druid/var/tmp"
    druid.indexer.fork.property.druid.indexer.task.baseDir: "/opt/druid/var/tmp"
    Can anybody suggest what might be missing or misconfigured here?
  • a

    A.Iswariya

    08/18/2025, 6:53 AM
    👋 Hello, team!
  • a

    A.Iswariya

    08/18/2025, 12:28 PM
    https://druid.apache.org/docs/latest/development/extensions-contrib/druid-ranger-security/ Hi team, I want to integrate Apache Druid with Apache Ranger. I have done the setup by referring to the above documentation, but I couldn't find the ranger-servicedef-druid.json file. Kindly help me with this.
  • m

    Mateusz Kalinowski

    08/18/2025, 1:45 PM
    Hey Guys! Is it possible to set a password from an environment variable on Druid lookups? I was trying something like this
    Copy code
    {
      "type": "cachedNamespace",
      "extractionNamespace": {
        "type": "jdbc",
        "pollPeriod": "PT1H",
        "connectorConfig": {
          "connectURI": "jdbc:<mysql://database:3306/table>",
          "user": {
            "type": "environment",
            "variable": "MYSQL_USERNAME"
          },
          "password": {
            "type": "environment",
            "variable": "MYSQL_PASSWORD"
          }
        },
        "table": "Test",
        "keyColumn": "id",
        "valueColumn": "name"
      }
    }
    But this gives me:
    Copy code
    org.apache.druid.query.lookup.LookupUtils - Lookup [mk_test] could not be serialized properly. Please check its configuration. Error: Cannot deserialize value of type `java.lang.String` from Object value (token `JsonToken.START_OBJECT`)
    2025-08-18 14:52:55.818
     at [Source: (byte[])":)
    This could mean that the configuration is incorrect. If I set the values directly, the lookup works as expected. I'd be grateful for any advice on this.
  • u

    Utkarsh Chaturvedi

    08/19/2025, 10:17 AM
    Hi team. I wanted to understand how compactions and ingestions into compacted datasources are expected to work. 1. I have a datasource with data from 2024, with granularity set to DAY. I set up auto-compaction with granularity set to MONTH and the offset set to 10 days. Even so, on Aug 19 the segments are MONTH granularity up to July and then DAY granularity through Aug 19. Is this expected behaviour? Will these DAY-level segments stay until the end of August? 2. Now I run an ingestion for the period July 25 - Aug 5. First, this ingestion breaks because the OVERWRITE WHERE clause identified an interval which is not aligned with the PARTITIONED BY granularity. I figure this is because the date range is split between month-level and day-level segments, so I break the ingestion into two: before the month-level change and after it. The ingestion for July 25 - July 31 works, but only with DAY granularity, so I'm uncertain whether the earlier ingestion was really breaking because of the underlying segment granularity. 3. The July 25 - July 31 ingestion creates 7 day-level segments, but they are not getting compacted: compaction says 100% compacted except for the last 10 days and does not see these uncompacted segments. Shouldn't these segments be eligible for compaction? If anybody who understands compaction well can help with this, it would be appreciated.
  • j

    Jesse Tuglu

    08/21/2025, 6:54 PM
    Have any folks run into ZK connection reconnect/suspend issues using the latest Druid v34 with this change enabled?
    • Curator version = 5.8.0
    • ZK server version = 3.5.8
    • ZK client version = 3.8.4
    Wondering if this ZK client/server version mismatch could be the root cause of things.
  • m

    Milad

    08/21/2025, 8:36 PM
    Hello: I'm using the new SQL SET statement in version 34 to set the query context. It is working well, but I noticed that you cannot seem to SET the resultFormat. For example, when I try:
    Copy code
    set resultFormat = 'csv';
    it has no effect. Does anyone know if that was by design?
  • t

    Tanay Maheshwari

    08/23/2025, 7:23 AM
    I restarted my Broker and since then I am getting a NoClassDefFoundError for org/postgresql/ssl/LazyKeyManager. Is anyone aware of this? Any help would be appreciated -
    Copy code
    2025-08-23T06:46:02,369 ERROR [NamespaceExtractionCacheManager-0] org.apache.druid.server.lookup.namespace.cache.CacheScheduler - Failed to update namespace [JdbcExtractionNamespace{connec
    java.lang.NoClassDefFoundError: org/postgresql/ssl/LazyKeyManager                                                                                                                           
            at org.postgresql.ssl.LibPQFactory.initPk8(LibPQFactory.java:85) ~[postgresql-42.7.2.jar:42.7.2]                                                                                    
            at org.postgresql.ssl.LibPQFactory.<init>(LibPQFactory.java:123) ~[postgresql-42.7.2.jar:42.7.2]                                                                                    
            at org.postgresql.core.SocketFactoryFactory.getSslSocketFactory(SocketFactoryFactory.java:61) ~[postgresql-42.7.2.jar:42.7.2]                                                       
            at org.postgresql.ssl.MakeSSL.convert(MakeSSL.java:34) ~[postgresql-42.7.2.jar:42.7.2]                                                                                              
            at org.postgresql.core.v3.ConnectionFactoryImpl.enableSSL(ConnectionFactoryImpl.java:625) ~[postgresql-42.7.2.jar:42.7.2]                                                           
            at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:195) ~[postgresql-42.7.2.jar:42.7.2]                                                          
            at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:262) ~[postgresql-42.7.2.jar:42.7.2]                                                  
            at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:54) ~[postgresql-42.7.2.jar:42.7.2]                                                                  
            at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:273) ~[postgresql-42.7.2.jar:42.7.2]                                                                                   
            at org.postgresql.Driver.makeConnection(Driver.java:446) ~[postgresql-42.7.2.jar:42.7.2]                                                                                            
            at org.postgresql.Driver.connect(Driver.java:298) ~[postgresql-42.7.2.jar:42.7.2]                                                                                                   
            at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:681) ~[java.sql:?]                                                                                              
            at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:229) ~[java.sql:?]                                                                                              
            at org.skife.jdbi.v2.DBI$3.openConnection(DBI.java:140) ~[jdbi-2.63.1.jar:2.63.1]                                                                                                   
            at org.skife.jdbi.v2.DBI.open(DBI.java:212) ~[jdbi-2.63.1.jar:2.63.1]                                                                                                               
            at org.skife.jdbi.v2.DBI.withHandle(DBI.java:279) ~[jdbi-2.63.1.jar:2.63.1]                                                                                                         
            at org.apache.druid.server.lookup.namespace.JdbcCacheGenerator.lastUpdates(JdbcCacheGenerator.java:211) ~[?:?]                                                                      
            at org.apache.druid.server.lookup.namespace.JdbcCacheGenerator.generateCache(JdbcCacheGenerator.java:72) ~[?:?]                                                                     
            at org.apache.druid.server.lookup.namespace.JdbcCacheGenerator.generateCache(JdbcCacheGenerator.java:48) ~[?:?]                                                                     
            at org.apache.druid.server.lookup.namespace.cache.CacheScheduler$EntryImpl.tryUpdateCache(CacheScheduler.java:234) ~[?:?]                                                           
            at org.apache.druid.server.lookup.namespace.cache.CacheScheduler$EntryImpl.updateCache(CacheScheduler.java:206) ~[?:?]                                                              
            at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]                                                                                          
            at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]                                                                                                 
            at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]                                                   
            at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]                                                                                  
            at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]                                                                                  
            at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
  • t

    Thuận Trần Văn

    08/28/2025, 4:33 AM
    Hi. I am using Apache Druid version 0.20.0. inventoryId is of type long - how can I fix it? Thanks
    Screencast from 2025-08-28 11-22-41.webm
  • y

    Yotam Bagam

    09/02/2025, 1:14 PM
    Using Druid 30. We are facing issues when trying to use MiddleManager-based MSQ ingestion. We have large tasks which we split across workers. The issue is that when the controller task starts, it looks for X free workers to assign the work to, but if there are no free workers it dies on a timeout quickly (10 minutes). Is there any way to bypass the timeout? Any other way to manage it, especially when we have quite a few long tasks? P.S. we currently assign controller tasks to their own MiddleManager to avoid deadlocks where only controller tasks are running.
  • v

    Vinothkumar Venkatesan

    09/03/2025, 7:44 AM
    Hello Druid Community, I am currently working on securing Apache Druid and would like to integrate it with Apache Ranger for centralized authorization and policy management. I have reviewed the Druid security documentation and Ranger configuration, but since there isn’t an official Ranger plugin for Druid, I’m trying to understand: 1. What is the recommended approach to connect Druid with Ranger for fine-grained access control (row/column level, user-based policies, etc.)? 2. Are there any community-driven plugins, extensions, or best practices available for this integration? 3. If Ranger cannot be directly integrated, what are the alternative approaches the community suggests for centralized policy enforcement with Druid? Any guidance, references, or examples would be greatly appreciated. Thanks in advance for your support!
  • s

    Sanjay Dowerah

    09/03/2025, 10:54 AM
    Hello Druid Community, I am running Druid on an OpenShift cluster and using the Druid Delta Lake extension (https://github.com/apache/druid/tree/master/extensions-contrib/druid-deltalake-extensions) to connect and load Delta tables. However, I am running into the following issue:
    • Error while loading with the Delta connector: only 1024 records of each constituent Parquet file are loaded.
    ◦ Update: the default result row limit for a multi-stage query using the Delta Lake extension is 1024. Since the Delta Lake connection does not allow editing the ingestion spec, I'm looking for a way to override the default.
    Is this a known issue? Any help with a workaround would be highly appreciated.