Swagat
03/05/2025, 1:40 PM
Mohit Dhingra
03/06/2025, 8:44 AM
taskslot. I can see in the logs that the metrics monitors are loading, but the metrics are not appearing. Can someone suggest a solution?
kubectl logs druid-overlord-2 -c druid-overlord | grep org.apache.druid.server.metrics
{"instant":{"epochSecond":1739883011,"nanoOfSecond":51244171},"thread":"main","level":"DEBUG","loggerName":"org.apache.druid.guice.JsonConfigurator","message":"Loaded class[class org.apache.druid.server.metrics.MonitorsConfig] from props[druid.monitoring.] as [MonitorsConfig{monitors=[class org.apache.druid.java.util.metrics.JvmMonitor, class org.apache.druid.server.metrics.TaskCountStatsMonitor, class org.apache.druid.server.metrics.TaskSlotCountStatsMonitor]}]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","contextMap":{},"threadId":1,"threadPriority":5}
{"instant":{"epochSecond":1739883012,"nanoOfSecond":60148915},"thread":"main","level":"INFO","loggerName":"org.apache.druid.server.metrics.MetricsModule","message":"Loaded 5 monitors: org.apache.druid.java.util.metrics.JvmMonitor, org.apache.druid.server.metrics.TaskCountStatsMonitor, org.apache.druid.server.metrics.TaskSlotCountStatsMonitor, org.apache.druid.curator.DruidConnectionStateListener, org.apache.druid.server.initialization.jetty.JettyServerModule$JettyMonitor","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","contextMap":{},"threadId":1,"threadPriority":5}
Stefanos Pliakos
03/06/2025, 12:18 PM
Dinesh
03/13/2025, 5:52 AM
ymcao
03/17/2025, 8:22 AM
2025-03-17T05:50:23,080 INFO [LeaderSelector[/druid/coordinator/_COORDINATOR]] org.apache.druid.server.coordinator.DruidCoordinator - I am no longer the leader...
2025-03-17T05:50:24,370 INFO [LeaderSelector[/druid/coordinator/_COORDINATOR]] org.apache.druid.server.coordinator.DruidCoordinator - I am the leader of the coordinators, all must bow! Starting coordination in [PT30S].
3. There are no new segment assignments, but the following log is present with no error logs:
a. Polled and found 201 rule(s) for 193 datasource(s).
I have attempted the following approaches to recover from the issue:
1. Restarted the coordinator leader and follower, but this did not help.
2. Restarted the Zookeeper follower, but this did not help.
3. Restarted the Zookeeper leader, which resolved the issue.
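(For anyone comparing notes, a quick way to check the ZooKeeper ensemble state directly before/after such flapping — the host is a placeholder, and newer ZooKeeper versions may require these four-letter words to be allowed via 4lw.commands.whitelist:)
# is the server up, and which role does it currently hold?
echo ruok | nc <zk-host> 2181
echo stat | nc <zk-host> 2181 | grep Mode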
Has anyone had a similar experience? Thanks, and I look forward to your help 🙏
Zeyu Chen
03/18/2025, 1:54 AM
During DruidMeta.execute/fetch, we should expect to see 2 threads in the broker JVM:
• a JDBCQueryExecutor-connection-XXX thread running the query
• a qtp jetty thread waiting on a future from the JDBC thread
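(One way to confirm both threads exist for a given connection is a thread dump on the broker; a rough sketch, where the pgrep pattern is only an assumption about how the broker process was launched:)
# dump broker threads and keep the JDBC executor and jetty worker threads
jstack $(pgrep -f 'org.apache.druid.cli.Main server broker') | grep -E 'JDBCQueryExecutor|qtp'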
From time to time, I see a long-running (9+ minutes) JDBC thread without a corresponding qtp jetty thread. The JDBC thread would have a minimal waiting stack like the following:
"JDBCQueryExecutor-connection-2ee5051c-5420-426b-b9b1-2c9b4e548b83-statement-1" #34085356 daemon prio=5 os_prio=0 tid=0x00007f9bf40da800 nid=0x24cb52 waiting on condition [0x00007f994b9fe000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00007fafa0189278> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2044)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
What could be the cause of this? Is this normal?
PHP Dev
03/18/2025, 9:46 PM
In 32.0.0, it seems that the filtered doubleSum aggregator now returns NULL instead of 0 when no rows match the filter, because of SQL-compliant mode. For example:
{
  "type": "filtered",
  "name": "FilteredAggregator",
  "filter": {
    "type": "selector",
    "dimension": "event_id",
    "value": "aaaaaa"
  },
  "aggregator": {
    "name": "FilteredAggregator",
    "type": "doubleSum",
    "fieldName": "event_value"
  }
}
And it seems that the option druid.generic.useDefaultValueForNull, which could help in previous versions, is no longer available. For SQL queries we can use NVL. But how can I fix it for native queries?
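(One possible workaround sketch for native queries, assuming your Druid version supports the expression post-aggregator; the output name FilteredAggregatorOrZero is made up, and the expression coerces the NULL back to 0 after aggregation:)
"postAggregations": [
  {
    "type": "expression",
    "name": "FilteredAggregatorOrZero",
    "expression": "nvl(\"FilteredAggregator\", 0)"
  }
]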
Please help
Mahesha Subrahamanya
03/19/2025, 4:16 PM
PHP Dev
03/20/2025, 2:14 PM
Noor
03/21/2025, 6:47 AM
Sivakumar Karthikesan
03/23/2025, 9:26 AM
Team, in one of our prod clusters we are seeing a latency issue; it takes 4 to 5 seconds to get the result. Other datasources work fine and don't have any latency. Any suggestions, please?
select tenantId, systemId, TIMESTAMP_TO_MILLIS(__time) as "timestamp", sum(iops_pref_pct) as iops_pref_pct
from (select DISTINCT(__time), * from "xyzdatasource"
      where systemId = 'aaajjjjccccc'
        and __time >= MILLIS_TO_TIMESTAMP(1742252400000)
        and __time <= MILLIS_TO_TIMESTAMP(1742338800000))
group by __time, tenantId, systemId
order by __time asc
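(If the DISTINCT subquery is only there to drop exact duplicate rows, it may be worth comparing against a flat GROUP BY, which avoids materializing the whole subquery on the broker; note this sketch is not semantically identical when duplicate rows exist:)
select tenantId, systemId, TIMESTAMP_TO_MILLIS(__time) as "timestamp", sum(iops_pref_pct) as iops_pref_pct
from "xyzdatasource"
where systemId = 'aaajjjjccccc'
  and __time >= MILLIS_TO_TIMESTAMP(1742252400000)
  and __time <= MILLIS_TO_TIMESTAMP(1742338800000)
group by __time, tenantId, systemId
order by __time asc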
Utkarsh Chaturvedi
03/24/2025, 8:44 AM
Julian Reyes
03/25/2025, 1:22 PM
strikers
03/27/2025, 6:27 AM
Caused by: java.lang.RuntimeException: java.io.IOException: com.amazonaws.services.s3.model.AmazonS3Exception: A header you provided implies functionality that is not implemented. (Service: Amazon S3; Status Code: 501; Error Code: NotImplemented; Request ID: null; S3 Extended Request ID: null; Proxy: null), S3 Extended Request ID: null
When Pure Storage is used as storage, Druid can read segments from S3 (get) and save them to the local directory, but it cannot write segments to Pure S3 (put).
When I use MinIO (S3) instead of Pure Storage without changing the configs, Druid can write logs and segments into the bucket. But with the same configs, it cannot write segments and logs into Pure Storage.
S3 Configs:
druid_storage_type: s3
druid_storage_baseKey: warehouse
druid_storage_bucket: druid
druid_storage_storageDirectory: s3a://druid/warehouse/
druid_indexer_logs_type: s3
druid_indexer_logs_directory: s3a://druid/logs/
druid_indexer_logs_s3Bucket: druid
druid_indexer_logs_s3Prefix: logs
druid_storage_useS3aSchema: "true"
druid_s3_disableChunkedEncoding: "true"
druid_s3_accessKey: "xxxx"
druid_s3_secretKey: "yyyy"
druid_s3_protocol: http
druid_s3_enablePathStyleAccess: "true"
druid_s3_endpoint_signingRegion: us-east-1
druid_s3_endpoint_url: http://zzz.com
druid_s3_forceGlobalBucketAccessEnabled: "true"
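(Not certain this is the cause, but a 501 NotImplemented on PUT from non-AWS object stores is often triggered by the x-amz-acl header; a sketch of the flags worth trying if Pure Storage does not implement object ACLs:)
# skip sending ACL headers on segment and task-log uploads
druid_storage_disableAcl: "true"
druid_indexer_logs_disableAcl: "true"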
Can you help me write Druid data to Pure Storage (S3) using the same S3 protocol as MinIO?
Utkarsh Chaturvedi
04/01/2025, 6:24 AM
Krishna
04/02/2025, 4:16 AM
QueryInterruptedException{msg=java.lang.RuntimeException: java.io.FileNotFoundException: /tmp/druid-groupBy-ef4e7e51-ea6b-48be-8d40-08fd92fb64c6_f78e666b-0d64-4c01-8b21-5f16487d58bd/00271801.tmp (Too many open files) Apache druid
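(A sketch of how to confirm whether the process file-descriptor limit is the bottleneck — the pid is a placeholder; the usual fix is raising the nofile limit for the Druid service user or container:)
# effective limit of the running Druid process
grep 'open files' /proc/<druid-pid>/limits
# limit of the current shell / service user
ulimit -n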
Vincent Lao
04/02/2025, 9:40 AM
I'm trying the auto kill segment feature (apache/druid:30.0.1), but it doesn't seem to be working for me:
1. I have uploaded duplicated data (the wikipedia sample data), where only the latest one has active = true
2. Enabled the auto kill segment feature by manually adding the following variables to the coordinator’s runtime.properties and restarting the container
-- Directly Adding into runtime.properties & restart container
echo "druid.coordinator.kill.on=true" >> /opt/druid/conf/druid/cluster/master/coordinator-overlord/runtime.properties
echo "druid.coordinator.killAllDataSources=true" >> /opt/druid/conf/druid/cluster/master/coordinator-overlord/runtime.properties
echo "druid.coordinator.kill.ignoreDurationToRetain=true" >> /opt/druid/conf/druid/cluster/master/coordinator-overlord/runtime.properties
3. Check log for kill tasks
-- LOG to verify autokill config
2025-03-31 12:16:13 2025-03-31T11:16:13,396 INFO [main] org.apache.druid.cli.CliCoordinator - * druid.coordinator.kill.bufferPeriod: PT0S
2025-03-31 12:16:13 2025-03-31T11:16:13,396 INFO [main] org.apache.druid.cli.CliCoordinator - * druid.coordinator.kill.ignoreDurationToRetain: true
2025-03-31 12:16:13 2025-03-31T11:16:13,396 INFO [main] org.apache.druid.cli.CliCoordinator - * druid.coordinator.kill.on: true
2025-03-31 12:16:13 2025-03-31T11:16:13,397 INFO [main] org.apache.druid.cli.CliCoordinator - * druid.coordinator.killAllDataSources: true
-- LOG to verify Kill task is scheduled
2025-03-31 12:16:20 2025-03-31T11:16:20,099 INFO [LeaderSelector[/druid/coordinator/_COORDINATOR]] org.apache.druid.server.coordinator.duty.KillUnusedSegments - druid.coordinator.kill.durationToRetain[PT7776000S] will be ignored when discovering segments to kill because druid.coordinator.kill.ignoreDurationToRetain is set to true.
2025-03-31 12:16:20 2025-03-31T11:16:20,100 INFO [LeaderSelector[/druid/coordinator/_COORDINATOR]] org.apache.druid.server.coordinator.duty.KillUnusedSegments - Kill task scheduling enabled with period[PT1800S], durationToRetain[IGNORING], bufferPeriod[PT0S], maxSegmentsToKill[100]
4. It appears Druid is clearing the metadata (almost instantly), as it shows “No Segments to load/drop”, while sys.segments also only shows the latest uploaded segment now
5. But files are not removed from deepstorage (currently at local directory)
6. However, manually triggering the kill task from the UI removes the files successfully, and I can see a kill task in the “Tasks” tab
Vincent Lao
04/02/2025, 9:45 AM
Subin C Mohan
04/07/2025, 12:23 PM
이세찬
04/09/2025, 8:12 AM
Dinesh
04/10/2025, 6:53 AM
Dinesh
04/11/2025, 12:33 PM
"druid-kubernetes-overlord-extensions"
• configs in overlord
config:
druid_indexer_runner_namespace: <namespace>
druid_indexer_queue_maxSize: 10
druid_processing_intermediaryData_storage_type: deepstore
#druid_indexer_runner_capacity: 2147483647
druid_indexer_runner_type: k8s
druid_indexer_task_encapsulatedTask: true
druid_peon_mode: remote
druid_service: druid/peon
druid_indexer_runner_k8s_adapter_type: overlordSingleContainer
druid_indexer_runner_javaOptsArray: '["-server", "-Xms1g", "-Xmx2g", "-XX:MaxDirectMemorySize=5g", "-Duser.timezone=UTC", "-Dfile.encoding=UTF-8", "-XX:+ExitOnOutOfMemoryError", "-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager"]'
#druid_indexer_fork_property_druid_processing_buffer_sizeBytes: '104857600'
druid_emitter_prometheus_port: 9090
druid_indexer_runner.k8s_overlordUrl: "http://druid-overlord:8081"
• Disabled the MiddleManager deployment in the Druid deployment.
• Created RBAC with the below config, as mentioned in the Druid documentation:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: sparknet-applications
  name: druid-k8s-task-scheduler
rules:
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["get", "watch", "list", "delete", "create"]
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "watch", "list", "delete", "create"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: druid-k8s-binding
  namespace: sparknet-applications
subjects:
  - kind: ServiceAccount
    name: druid-overlord
    namespace: sparknet-applications
roleRef:
  kind: Role
  name: druid-k8s-task-scheduler
  apiGroup: rbac.authorization.k8s.io
Our ingestion type is Kafka ingestion. The ingestion jobs are getting spawned on the k8s cluster, and the jobs/tasks load lookups and start the task lifecycle, but the tasks get stuck after starting and eventually fail with
"errorMsg": "Peon did not report status successfully."
Can someone please help figure out what is causing this problem?
Abdullah Ömer Yamaç
04/20/2025, 11:03 PM
2025-04-20T22:58:55,359 INFO [[compact_mobility_nclokegf_2025-04-20T22:56:56.871Z]-batch-appenderator-push] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Push started, processsing[1] sinks
Terminating due to java.lang.OutOfMemoryError: Java heap space
The size of the data being compacted is 1.2 GB.
Mahesha Subrahamanya
04/23/2025, 12:40 AM
Sachit Swaroop NB
04/23/2025, 4:35 PM
Andrew Ho
04/25/2025, 7:06 PM
Mahesha Subrahamanya
04/26/2025, 7:57 PM
middlemanager:
  replicas: 5
  minReplicas: 5
  maxReplicas: 18
  numMergeBuffers: 2
  bufferSizeBytes: 120MiB
  numThreadsProcessing: 2
  numThreadsHttp: 32
  workerCapacity: 2
  runnerJavaOpts:
    xms: 2g
    xmx: 12g
    MaxDirectMemorySize: 2g
  cpuRequest: 5000m
  memoryRequest: 30Gi
  memoryLimit: 30Gi
  ephemeralStorageLimit: "32Gi"
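(For sizing reference, and assuming these values map onto druid.processing.numThreads / numMergeBuffers / buffer.sizeBytes per peon, Druid's rule of thumb for direct memory is (numThreads + numMergeBuffers + 1) * buffer.sizeBytes, so roughly:)
# (numThreadsProcessing + numMergeBuffers + 1) * bufferSizeBytes
# = (2 + 2 + 1) * 120 MiB
# = 600 MiB of direct memory needed per peon
# MaxDirectMemorySize: 2g per peon therefore leaves ample headroom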
Lis Shimoni
04/28/2025, 3:03 PM
Invalid value for the field [inputSource]. Reason: [Cannot construct instance of `org.apache.druid.iceberg.input.GlueIcebergCatalog`, problem: Cannot initialize Catalog implementation org.apache.iceberg.aws.glue.GlueCatalog: Cannot find constructor for interface org.apache.iceberg.catalog.Catalog Missing org.apache.iceberg.aws.glue.GlueCatalog [java.lang.ClassNotFoundException: org.apache.iceberg.aws.glue.GlueCatalog]
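(The ClassNotFoundException suggests the Iceberg Glue classes are simply not on the extension's classpath; a sketch of one possible fix, where the jar name/version and the Druid home path are assumptions:)
# add the Iceberg AWS bundle (which contains org.apache.iceberg.aws.glue.GlueCatalog)
# to the iceberg extension directory on the services that run the ingestion
cp iceberg-aws-bundle-<version>.jar /opt/druid/extensions/druid-iceberg-extensions/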
Abhishek Balaji Radhakrishnan
04/28/2025, 10:08 PM
Luke Foskey
05/01/2025, 3:21 AM