raghav
10/13/2025, 3:19 PM
2025/10/13 07:46:07.467 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zkclient 3, zookeeper state changed ( Disconnected )
2025/10/13 07:46:07.472 WARN [ZKHelixManager] [ZkClient-EventThread-125-pinot-zookeeper:2181] KeeperState:Disconnected, SessionId: 10000184ff502de, instance: Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098, type: PARTICIPANT
2025/10/13 07:46:09.059 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zkclient 3, zookeeper state changed ( SyncConnected )
2025/10/13 07:46:09.059 INFO [ZKHelixManager] [ZkClient-EventThread-125-pinot-zookeeper:2181] KeeperState: SyncConnected, instance: Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098, type: PARTICIPANT
2025/10/13 07:46:21.387 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zkclient 3, zookeeper state changed ( Disconnected )
2025/10/13 07:46:21.387 WARN [ZKHelixManager] [ZkClient-EventThread-125-pinot-zookeeper:2181] KeeperState:Disconnected, SessionId: 10000184ff502de, instance: Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098, type: PARTICIPANT
2025/10/13 07:46:22.025 WARN [ZKHelixManager] [message-count-scheduler-0] zkClient to pinot-zookeeper:2181 is not connected, wait for 10000ms.
2025/10/13 07:46:32.028 ERROR [ZKHelixManager] [message-count-scheduler-0] zkClient is not connected after waiting 10000ms., clusterName: d3-pinot-cluster, zkAddress: pinot-zookeeper:2181
2025/10/13 07:46:34.790 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zkclient 3, zookeeper state changed ( SyncConnected )
2025/10/13 07:46:34.790 INFO [ZKHelixManager] [ZkClient-EventThread-125-pinot-zookeeper:2181] KeeperState: SyncConnected, instance: Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098, type: PARTICIPANT
2025/10/13 12:34:34.225 INFO [CallbackHandler] [ZkClient-EventThread-125-pinot-zookeeper:2181] 125 START: CallbackHandler 0, INVOKE /d3-pinot-cluster/INSTANCES/Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098/MESSAGES listener: org.apache.helix.messaging.handling.HelixTaskExecutor@1b9d313c type: CALLBACK
2025/10/13 12:34:34.226 INFO [CallbackHandler] [ZkClient-EventThread-125-pinot-zookeeper:2181] CallbackHandler 0 subscribing changes listener to path: /d3-pinot-cluster/INSTANCES/Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098/MESSAGES, callback type: CALLBACK, event types: [NodeChildrenChanged], listener: org.apache.helix.messaging.handling.HelixTaskExecutor@1b9d313c, watchChild: false
2025/10/13 12:34:34.227 INFO [CallbackHandler] [ZkClient-EventThread-125-pinot-zookeeper:2181] CallbackHandler0, Subscribing to path: /d3-pinot-cluster/INSTANCES/Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098/MESSAGES took: 1
2025/10/13 12:34:34.231 INFO [MessageLatencyMonitor] [ZkClient-EventThread-125-pinot-zookeeper:2181] The latency of message 89f57203-2271-4d7a-abc3-1087222fc439 is 853 ms
2025/10/13 12:34:34.246 INFO [HelixTaskExecutor] [ZkClient-EventThread-125-pinot-zookeeper:2181] Scheduling message 89f57203-2271-4d7a-abc3-1087222fc439: metric_numerical_agg_1H_REALTIME:, null->null
Андрей Морозов
10/14/2025, 6:53 AM
executionFrameworkSpec:
  name: standalone
  segmentGenerationJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
  segmentTarPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner
  segmentUriPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner
jobType: SegmentCreationAndTarPush
inputDirURI: '/var/imports/insights_ch1_fff_seg/'
includeFileNamePattern: "glob:**/*.parquet"
outputDirURI: '/tmp/pinot-segments/insights_ch1_fff_sm'
overwriteOutput: true
pushJobSpec:
  pushFileNamePattern: 'glob:**/*.tar.gz'
  pushParallelism: 2
  pushAttempts: 2
recordReaderSpec:
  dataFormat: parquet
  className: org.apache.pinot.plugin.inputformat.parquet.ParquetRecordReader
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS
tableSpec:
  tableName: insights_ch1_4
  schemaURI: 'http://pinot-controller:9000/tables/insights_ch1_4/schema'
  tableConfigURI: 'http://pinot-controller:9000/tables/insights_ch1_4'
pinotClusterSpecs:
  - controllerURI: 'http://pinot-controller:9000'
Segments were created in the mounted directory after the job ran:
(screenshot)
Command used to run the job:
docker exec -e JAVA_OPTS="-Xms16g -Xmx40g" -it pinot-controller \
bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile /config/insights_ch1_4_job.yaml
I don't see any logs on stdout, only output when the job fails.
Xmx is now 40g (when it was 24g, the job failed with an out-of-heap-space error).
What is wrong?
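One knob that may help, offered as a sketch rather than a confirmed fix: the standalone runner builds segments for multiple input files in parallel, so peak heap scales with the number of concurrent segment builds. The job spec supports a segmentCreationJobParallelism field; capping it (the value below is an assumption to experiment with) can bound memory instead of raising Xmx further:

jobType: SegmentCreationAndTarPush
segmentCreationJobParallelism: 1   # build one segment at a time to bound heap usage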
madhulika
10/14/2025, 4:07 PM
Sonit Rathi
10/15/2025, 4:37 AM
madhulika
10/15/2025, 3:28 PM
mg
10/16/2025, 9:00 AM
ConsumerConfig is flagging Pinot-specific properties as unknown, likely because they are wrappers around the core Kafka properties.
Are these warnings benign and expected, or does this indicate a potential issue with our configuration style?
I'm seeking recommendations on whether we can suppress these warnings or if there's an updated configuration pattern we should use to avoid passing these metadata properties to the Kafka client.
1. Controller WARN Logs (Example)
2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.decoder.class.name' was supplied but isn't a known config.
2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'streamType' was supplied but isn't a known config.
2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.consumer.type' was supplied but isn't a known config.
2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.broker.list' was supplied but isn't a known config.
2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.consumer.factory.class.name' was supplied but isn't a known config.
2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.topic.name' was supplied but isn't a known config.
2. Relevant Table Config (streamConfigs)
{
  "REALTIME": {
    "tableName": "XYZ",
    "tableType": "REALTIME",
    "segmentsConfig": {...},
    "tenants": {...},
    "tableIndexConfig": {
      "streamConfigs": {
        "streamType": "kafka",
        "stream.kafka.consumer.type": "LowLevel",
        "stream.kafka.topic.name": "test.airlineStats",
        "stream.kafka.broker.list": "kafka-bootstrap.kafka.svc:9093",
        "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.json.JSONMessageDecoder",
        "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka30.KafkaConsumerFactory",
        "security.protocol": "SSL"
        // SSL config continues...
      },
      "other-configs": ...
    },
    "metadata": {},
    "other-configs": ...
  }
}
Any guidance on best practices for stream config in recent Pinot versions, or a way to silence these specific ConsumerConfig warnings, would be highly appreciated!
Thanks!
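For what it's worth, these warnings are generally benign: Pinot passes the entire streamConfigs map through to the Kafka client, and ConsumerConfig warns about every key it does not recognize. If you want to silence them, one option (a sketch against a stock log4j2 setup; the file location and appender name are assumptions to adapt) is a dedicated logger entry in conf/log4j2.xml:

<!-- drop the unknown-config WARNs emitted by the Kafka consumer -->
<Logger name="org.apache.kafka.clients.consumer.ConsumerConfig" level="error" additivity="false">
  <AppenderRef ref="console"/>
</Logger>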
Tommaso Peresson
10/16/2025, 10:55 AM
Андрей Морозов
10/17/2025, 11:43 AM
Mustafa Shams
10/20/2025, 7:02 PM
Alaa Halawani
10/22/2025, 5:47 AM
Since restarting the cluster, queries have become much slower, with most of the latency showing up as schedulerWaitMs.
Additional details:
• Ingestion is stopped (so no extra Kafka load)
• Increasing pinot.query.scheduler.query_runner_threads helped slightly, but performance is still slower than before the restart
• Tried both MMAP and HEAP loading modes with similar results
• I am running Pinot cluster on k8s nodes
Has anyone run into similar behavior after a restart? Any idea why it happens?
Any recommendations or configuration tips to improve performance would be much appreciated.
Rahul Sharma
10/22/2025, 7:56 PM
upsertCompactionTask is visible, but its task configuration is empty. As a result, compaction is not working, and the number of records in my table remains the same.
Can anyone please help?
Conf:
"task": {
"taskTypeConfigsMap": {
"UpsertCompactionTask": {
"schedule": "0 */5 * ? * *",
"bufferTimePeriod": "0d",
"invalidRecordsThresholdPercent": "10",
"invalidRecordsThresholdCount": "1000"
}
}
},Krupa
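One prerequisite worth verifying, since it is a documented requirement for UpsertCompactionTask (the exact snippet below is a sketch; field names may differ by version): the task identifies invalid records via the upsert metadata snapshot, so the table's upsertConfig must have snapshots enabled, e.g.:

"upsertConfig": {
  "mode": "FULL",
  "enableSnapshot": true
}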
Krupa
10/24/2025, 11:19 AM
Utsav Jain
10/29/2025, 5:15 AM
Rajat
10/29/2025, 10:16 AM
SELECT s_id, count(*)
FROM shipmentMerged_final
GROUP BY s_id
HAVING COUNT(*) > 1
Sometimes it shows no records, but sometimes it shows data with a count of 2.
Rajat
10/29/2025, 10:49 AM
SELECT COUNT(*) AS aggregate,
s_id
FROM shipmentMerged_final
WHERE o_company_id = 2449226
AND o_created_at BETWEEN TIMESTAMP '2025-10-10 00:00:00' AND TIMESTAMP '2025-10-26 23:59:59'
AND o_shipping_method IN ('SR', 'SRE', 'AC')
AND o_is_return = 0
AND o_state = 0
group by 2
limit 1500
Above Query is showing:
1150 total records
But When running:
SELECT COUNT(*) AS aggregate
FROM shipmentMerged_final
WHERE o_company_id = 2449226
AND o_created_at BETWEEN TIMESTAMP '2025-10-10 00:00:00' AND TIMESTAMP '2025-10-26 23:59:59'
AND o_shipping_method IN ('SR', 'SRE', 'AC')
AND o_is_return = 0
AND o_state = 0
The count is coming as:
1162
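If shipmentMerged_final is an upsert table (an assumption based on the symptoms), both observations are consistent with duplicate primary keys that are masked inconsistently at query time, for example while replicas disagree on upsert metadata. The documented skipUpsert query option exposes the raw rows for comparison:

SET skipUpsert=true;
SELECT s_id, COUNT(*) AS cnt
FROM shipmentMerged_final
GROUP BY s_id
HAVING COUNT(*) > 1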
Rajat
10/29/2025, 10:49 AM
Rashpal Singh
10/29/2025, 11:24 PM
I have set:
nullHandlingEnabled=true at the table config level
"enableColumnBasedNullHandling": true at the schema level
{
  "name": "notNullColumn",
  "dataType": "DOUBLE",
  "notNull": false
}
Still, when I query, I am getting "0" instead of null.
How can I fix this so that I see null (the original value) instead of 0 in the query response, without adding "SET enableNullHandling=true" to every query?
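A minimal sketch of the column-based setup, assuming Pinot 1.x semantics: enableColumnBasedNullHandling sits at the top level of the schema, the column must not be declared notNull, and JSON requires lowercase false (Python-style False will not parse). A segment reload is typically needed after changing these settings.

{
  "schemaName": "mySchema",  // hypothetical schema name
  "enableColumnBasedNullHandling": true,
  "metricFieldSpecs": [
    {
      "name": "notNullColumn",
      "dataType": "DOUBLE",
      "notNull": false
    }
  ]
}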
Rahul Sharma
10/30/2025, 4:23 AM
I'm using the metrics pinot_controller_numMinionSubtasksWaiting_Value and pinot_controller_numMinionSubtasksRunning_Value. However, for each task type, they always show a value of 0 even when tasks are running. Am I using the wrong metrics? Which metrics should I use to build a custom autoscaler for minions?
francoisa
10/30/2025, 8:49 AM
Badhusha Muhammed
10/30/2025, 4:17 PM
Victor Bivolaru
10/31/2025, 1:31 PM
Mannoj
11/03/2025, 4:39 PM
Is there a way to audit who did what, how, from which source, and at what time?
It seems the code base logs only the response and its type, not the request.
It would be great if the request were also logged, so the audit information is complete.
In the code base: ControllerResponseFilter.java
> LOGGER.info("Handled request from {} {} {}, content-type {} status code {} {}", srcIpAddr, method, uri, contentType,
> respStatus, reasonPhrase);
If the requestContext were also added here, I believe it would log the request details, including the payload originally sent by the user. Or, if this is disabled on purpose, would you mind exposing that control through log4j so the end user can choose whether to enable it?
I'm no developer 🥺, I'm just trying to make sense of the code and see if this can be added.
Where I'm coming from:
I just added a user via the controller to grant a particular user read/write permissions on all tables. All I get is the output below.
2025/11/03 20:30:59.922 INFO [ControllerResponseFilter] [grizzly-http-server-15] Handled request from 192.168.13.1 PUT http://test-phaseroundtoaudit11.ori.com:9000/users/dedactid_rw?component=BROKER&passwordChanged=false, content-type text/plain;charset=UTF-8 status code 200 OK
2025/11/03 20:30:59.957 INFO [ControllerResponseFilter] [grizzly-http-server-14] Handled request from 192.168.13.1 GET http://test-phaseroundtoaudit11.ori.com:9000/tables, content-type null status code 200 OK
2025/11/03 20:30:59.980 INFO [ControllerResponseFilter] [grizzly-http-server-12] Handled request from 192.168.13.1 GET http://test-phaseroundtoaudit11.ori.com:9000/users, content-type null status code 200 OK
But it is missing the fact that read/write access was granted by my admin user to ALL (or particular) tables. That further granularity is missing, and I believe it's crucial.
Let me know your views. Thanks!!
Alexander Maniates
11/03/2025, 7:10 PM
Rahul Sharma
11/04/2025, 10:02 AM
Mariusz
11/04/2025, 2:42 PM
I have the following accounting configs set:
pinot.broker.instance.enableThreadCpuTimeMeasurement=true
pinot.broker.instance.enableThreadAllocatedBytesMeasurement=true
pinot.server.instance.enableThreadAllocatedBytesMeasurement=true
pinot.server.instance.enableThreadCpuTimeMeasurement=true
pinot.query.scheduler.accounting.enable.thread.memory.sampling=true
pinot.query.scheduler.accounting.enable.thread.cpu.sampling=true
pinot.query.scheduler.accounting.oom.enable.killing.query=true
pinot.query.scheduler.accounting.query.killed.metric.enabled=true
pinot.query.scheduler.accounting.oom.critical.heap.usage.ratio=0.3
pinot.query.scheduler.accounting.oom.panic.heap.usage.ratio=0.3
pinot.query.scheduler.accounting.sleep.ms=30
pinot.query.scheduler.accounting.oom.alarming.usage.ratio=0.3
pinot.query.scheduler.accounting.sleep.time.denominator=3
pinot.query.scheduler.accounting.min.memory.footprint.to.kill.ratio=0.01
pinot.query.scheduler.accounting.factory.name=org.apache.pinot.core.accounting.PerQueryCPUMemAccountantFactory
pinot.query.scheduler.accounting.cpu.time.based.killing.enabled=true
pinot.query.scheduler.accounting.publishing.jvm.heap.usage=true
pinot.query.scheduler.accounting.cpu.time.based.killing.threshold.ms=1000
I have run some heavy queries to test the OOM killing feature, but I don't see any killed queries in the broker/server metrics.
SELECT accountId,countryCode,direction,day,hour,msgType,currency,topic,finalStatus,year,month,
SUM(CASE WHEN finalStatus = 'Failed' THEN 1 ELSE 0 END) AS failed_count,
SUM(CASE WHEN finalStatus = 'Delivered' THEN 1 ELSE 0 END) AS success_count,
COUNT(*) AS total_records,
COUNT(DISTINCT udrId) AS unique_udrs,
SUM(price) AS total_revenue,
AVG(price) AS avg_price,
MAX(price) AS max_price,
MIN(price) AS min_price,
SUM(CASE WHEN errorCode > 0 THEN 1 ELSE 0 END) AS error_count,
SUM(price * (CASE WHEN direction = 'Unknown' THEN 1 ELSE -1 END)) AS net_revenue
FROM
dummy_table
GROUP BY
accountId,countryCode,direction,msgType,currency,topic,finalStatus,year,month,day,hour
ORDER BY
total_revenue DESC,
avg_price DESC
LIMIT 1000000
Whenever I run this query, the server goes down, but no queries are terminated automatically.
Can you please help me to understand if I am missing any configurations or steps to enable this feature?
I tested on apachepinot/pinot:1.5.0-SNAPSHOT-9d32f376d8-20251016, with heap size -Xms2G -Xmx2G for both server and broker.
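For reference, plain arithmetic on the values pasted above: with -Xmx2G, oom.critical.heap.usage.ratio=0.3 arms the kill path only once used heap crosses about 0.3 x 2048 MB ≈ 614 MB, and min.memory.footprint.to.kill.ratio=0.01 means a query must itself account for roughly 0.01 x 2048 MB ≈ 20 MB to be eligible. With sampling every 30 ms, a fast allocation spike from the LIMIT 1000000 ORDER BY could plausibly take the JVM down between samples, so no kill is ever recorded; that timing interpretation is an assumption, the ratio math is not.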
Naveen
11/05/2025, 9:20 AM
Rajasekharan A P
11/06/2025, 7:04 AM
I ran into a segment where the Ideal State and External View disagree:
• Ideal State:
"load_chat_messages_core_1756318894786_1758914214102_1758919671601": {
  "Server_172.18.0.6_8098": "ONLINE"
}
• External View:
"load_chat_messages_core_1756318894786_1758914214102_1758919671601": {
  "Server_172.18.0.6_8098": "ERROR"
}
To resolve this, I performed a reload and reset operation on the affected segments. After the reset, the segment state transitioned from ERROR to OFFLINE, allowing it to be properly reloaded.
Setup details:
• Running Pinot in Docker
• Using local storage for segment files
• Segment data is volume-mounted
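For reference, the per-segment reset can also be driven through the controller REST API (this endpoint exists in recent Pinot releases; the host and the table-name-with-type below are placeholders for this setup):

curl -X POST "http://pinot-controller:9000/segments/load_chat_messages_core_REALTIME/load_chat_messages_core_1756318894786_1758914214102_1758919671601/reset"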
francoisa
11/06/2025, 10:41 AM
Victor Bivolaru
11/07/2025, 1:09 PM
Our stream config sets the following flush thresholds:
"realtime.segment.flush.threshold.rows": "0",
"realtime.segment.flush.threshold.segment.size": "500M",
"realtime.segment.flush.threshold.time": "4h"
However, when inspecting the metadata of any of the realtime segments we can see for example:
"segment.realtime.endOffset": "67399447",
"segment.start.time": "1762424217000",
"segment.time.unit": "MILLISECONDS",
"segment.flush.threshold.size": "100000",
"segment.realtime.startOffset": "66512835",
"segment.size.in.bytes": "14018213", <====== 14MB instead of 500M
"segment.end.time": "1762426143000", <====== subtracting segment.start.time from this we get roughly 35 min
"segment.total.docs": "100000",
"segment.realtime.numReplicas": "1",
"segment.creation.time": "1762511599197",
"segment.index.version": "v3",
"segment.crc": "3704033136",
"segment.realtime.status": "DONE",Rajasekharan A P
Rajasekharan A P
11/10/2025, 4:44 AM