Utsav Jain
10/29/2025, 5:15 AM
Rajat
10/29/2025, 10:16 AM
SELECT s_id, COUNT(*)
FROM shipmentMerged_final
GROUP BY s_id
HAVING COUNT(*) > 1
Sometimes it shows no records, but sometimes it shows data with a count of 2.
Rajat
10/29/2025, 10:49 AM
SELECT COUNT(*) AS aggregate,
s_id
FROM shipmentMerged_final
WHERE o_company_id = 2449226
AND o_created_at BETWEEN TIMESTAMP '2025-10-10 00:00:00' AND TIMESTAMP '2025-10-26 23:59:59'
AND o_shipping_method IN ('SR', 'SRE', 'AC')
AND o_is_return = 0
AND o_state = 0
GROUP BY 2
LIMIT 1500
The above query is showing:
1150 total records
But when running:
SELECT COUNT(*) AS aggregate
FROM shipmentMerged_final
WHERE o_company_id = 2449226
AND o_created_at BETWEEN TIMESTAMP '2025-10-10 00:00:00' AND TIMESTAMP '2025-10-26 23:59:59'
AND o_shipping_method IN ('SR', 'SRE', 'AC')
AND o_is_return = 0
AND o_state = 0
The count is coming as:
1162
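A possible way to reconcile the two numbers (a sketch only; it combines the duplicate check with the same filters, since the 12-row gap between 1162 and 1150 suggests some s_id values repeat within this filter):
-- Sketch: list s_id values that occur more than once under the same WHERE clause
SELECT s_id, COUNT(*) AS cnt
FROM shipmentMerged_final
WHERE o_company_id = 2449226
  AND o_created_at BETWEEN TIMESTAMP '2025-10-10 00:00:00' AND TIMESTAMP '2025-10-26 23:59:59'
  AND o_shipping_method IN ('SR', 'SRE', 'AC')
  AND o_is_return = 0
  AND o_state = 0
GROUP BY s_id
HAVING COUNT(*) > 1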
Rajat
10/29/2025, 10:49 AM
Rashpal Singh
10/29/2025, 11:24 PM
nullHandlingEnabled=true at table config level
"enableColumnBasedNullHandling": true at schema level
{
  "name": "notNullColumn",
  "dataType": "DOUBLE",
  "notNull": false
}
Still, when I query, I am getting "0" instead of null.
How can I fix this issue where I want to see null (the original value) instead of 0 in the query response, without adding "SET enableNullHandling=true" to my query?
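For reference, a minimal schema sketch with column-based null handling enabled (placing the column under metricFieldSpecs is an assumption here; note that JSON booleans must be lowercase false, not False):
{
  "schemaName": "mySchema",
  "enableColumnBasedNullHandling": true,
  "metricFieldSpecs": [
    {
      "name": "notNullColumn",
      "dataType": "DOUBLE",
      "notNull": false
    }
  ]
}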
Rahul Sharma
10/30/2025, 4:23 AM
pinot_controller_numMinionSubtasksWaiting_Value and pinot_controller_numMinionSubtasksRunning_Value. However, for each task type, they always show a value of 0 even when tasks are running. Am I using the wrong metrics? Which metrics should I use to build a custom autoscaler for minions?
francoisa
10/30/2025, 8:49 AM
Badhusha Muhammed
10/30/2025, 4:17 PM
Victor Bivolaru
10/31/2025, 1:31 PM
Mannoj
11/03/2025, 4:39 PM
who did what, how, from which source, and at what time?
It seems the code base logs only the response and its type, not the request.
It would be great if the request were also logged, so the audit info is fully available.
In the code base: ControllerResponseFilter.java
> LOGGER.info("Handled request from {} {} {}, content-type {} status code {} {}", srcIpAddr, method, uri, contentType,
> respStatus, reasonPhrase);
If the requestContext were also added, I believe it would include the request details with the payload initially sent by the user; or, if it's disabled on purpose, would you mind giving that control to log4j so the end user can choose whether to enable it?
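For illustration only (not Pinot's actual code; the class name and wiring are hypothetical), a JAX-RS request-side filter of the kind being suggested could look roughly like this:
import java.io.IOException;
import javax.ws.rs.container.ContainerRequestContext;
import javax.ws.rs.container.ContainerRequestFilter;
import javax.ws.rs.ext.Provider;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical sketch: log the incoming request before it is handled,
// complementing ControllerResponseFilter's response-side log line.
@Provider
public class ControllerRequestLogFilter implements ContainerRequestFilter {
  private static final Logger LOGGER = LoggerFactory.getLogger(ControllerRequestLogFilter.class);

  @Override
  public void filter(ContainerRequestContext requestContext) throws IOException {
    // Logging the payload itself would require buffering the entity stream
    // so that downstream resource methods can still read it.
    LOGGER.info("Received request {} {}, content-type {}", requestContext.getMethod(),
        requestContext.getUriInfo().getRequestUri(), requestContext.getHeaderString("Content-Type"));
  }
}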
I'm no developer 🥺; I'm trying to make sense of the code and see if it can be added.
Where I'm coming from is:
I just added a user via the controller to grant a particular user read/write permissions on all tables. All I get is below.
2025/11/03 20:30:59.922 INFO [ControllerResponseFilter] [grizzly-http-server-15] Handled request from 192.168.13.1 PUT http://test-phaseroundtoaudit11.ori.com:9000/users/dedactid_rw?component=BROKER&passwordChanged=false, content-type text/plain;charset=UTF-8 status code 200 OK
2025/11/03 20:30:59.957 INFO [ControllerResponseFilter] [grizzly-http-server-14] Handled request from 192.168.13.1 GET http://test-phaseroundtoaudit11.ori.com:9000/tables, content-type null status code 200 OK
2025/11/03 20:30:59.980 INFO [ControllerResponseFilter] [grizzly-http-server-12] Handled request from 192.168.13.1 GET http://test-phaseroundtoaudit11.ori.com:9000/users, content-type null status code 200 OK
But it's missing that read/write was granted by my admin user to ALL/particular tables. There is further granularity missing, which I believe is crucial.
Let me know your views. Thanks!!
Alexander Maniates
11/03/2025, 7:10 PM
Rahul Sharma
11/04/2025, 10:02 AM
Mariusz
11/04/2025, 2:42 PM
pinot.broker.instance.enableThreadCpuTimeMeasurement=true
pinot.broker.instance.enableThreadAllocatedBytesMeasurement=true
pinot.server.instance.enableThreadAllocatedBytesMeasurement=true
pinot.server.instance.enableThreadCpuTimeMeasurement=true
pinot.query.scheduler.accounting.enable.thread.memory.sampling=true
pinot.query.scheduler.accounting.enable.thread.cpu.sampling=true
pinot.query.scheduler.accounting.oom.enable.killing.query=true
pinot.query.scheduler.accounting.query.killed.metric.enabled=true
pinot.query.scheduler.accounting.oom.critical.heap.usage.ratio=0.3
pinot.query.scheduler.accounting.oom.panic.heap.usage.ratio=0.3
pinot.query.scheduler.accounting.sleep.ms=30
pinot.query.scheduler.accounting.oom.alarming.usage.ratio=0.3
pinot.query.scheduler.accounting.sleep.time.denominator=3
pinot.query.scheduler.accounting.min.memory.footprint.to.kill.ratio=0.01
pinot.query.scheduler.accounting.factory.name=org.apache.pinot.core.accounting.PerQueryCPUMemAccountantFactory
pinot.query.scheduler.accounting.cpu.time.based.killing.enabled=true
pinot.query.scheduler.accounting.publishing.jvm.heap.usage=true
pinot.query.scheduler.accounting.cpu.time.based.killing.threshold.ms=1000
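For comparison, a sketch of how the three heap-usage ratios are usually staggered (the values below are illustrative placeholders, not verified defaults): setting alarming, critical, and panic all to 0.3 flattens the escalation these names imply.
# Illustrative staggering only: alarming < critical < panic; tune for your heap
pinot.query.scheduler.accounting.oom.alarming.usage.ratio=0.75
pinot.query.scheduler.accounting.oom.critical.heap.usage.ratio=0.96
pinot.query.scheduler.accounting.oom.panic.heap.usage.ratio=0.99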
I have run some heavy queries to test the OOM killing feature, but I don't see any killed queries in the broker/server metrics.
SELECT accountId,countryCode,direction,day,hour,msgType,currency,topic,finalStatus,year,month,
SUM(CASE WHEN finalStatus = 'Failed' THEN 1 ELSE 0 END) AS failed_count,
SUM(CASE WHEN finalStatus = 'Delivered' THEN 1 ELSE 0 END) AS success_count,
COUNT(*) AS total_records,
COUNT(DISTINCT udrId) AS unique_udrs,
SUM(price) AS total_revenue,
AVG(price) AS avg_price,
MAX(price) AS max_price,
MIN(price) AS min_price,
SUM(CASE WHEN errorCode > 0 THEN 1 ELSE 0 END) AS error_count,
SUM(price * (CASE WHEN direction = 'Unknown' THEN 1 ELSE -1 END)) AS net_revenue
FROM
dummy_table
GROUP BY
accountId,countryCode,direction,msgType,currency,topic,finalStatus,year,month,day,hour
ORDER BY
total_revenue DESC,
avg_price DESC
LIMIT 1000000
Whenever I run this query, the server goes down, but no queries are terminated automatically.
Can you please help me understand whether I am missing any configuration or steps to enable this feature?
I tested on apachepinot/pinot:1.5.0-SNAPSHOT-9d32f376d8-20251016, with a heap size of -Xms2G -Xmx2G for both server and broker.
Naveen
11/05/2025, 9:20 AM
Rajasekharan A P
11/06/2025, 7:04 AM
• Ideal State:
"load_chat_messages_core_1756318894786_1758914214102_1758919671601": {
"Server_172.18.0.6_8098": "ONLINE"
}
• External View:
"load_chat_messages_core_1756318894786_1758914214102_1758919671601": {
"Server_172.18.0.6_8098": "ERROR"
}
To resolve this, I performed a reload and reset operation on the affected segments. After the reset, the segment state transitioned from ERROR to OFFLINE, allowing it to be properly reloaded.
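For reference, a sketch of the controller calls involved (host and placeholder names are assumptions; both endpoints take the table name with type):
# Reset the ERROR-state segment, then reload it
curl -X POST "http://localhost:9000/segments/<tableNameWithType>/<segmentName>/reset"
curl -X POST "http://localhost:9000/segments/<tableNameWithType>/<segmentName>/reload"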
Setup details:
• Running Pinot in Docker
• Using local storage for segment files
• Segment data is volume-mounted
francoisa
11/06/2025, 10:41 AM
Victor Bivolaru
11/07/2025, 1:09 PM
"realtime.segment.flush.threshold.rows": "0",
"realtime.segment.flush.threshold.segment.size": "500M",
"realtime.segment.flush.threshold.time": "4h"
However, when inspecting the metadata of any of the realtime segments we can see for example:
"segment.realtime.endOffset": "67399447",
"segment.start.time": "1762424217000",
"segment.time.unit": "MILLISECONDS",
"segment.flush.threshold.size": "100000",
"segment.realtime.startOffset": "66512835",
"segment.size.in.bytes": "14018213", <====== 14MB instead of 500M
"segment.end.time": "1762426143000", <====== subtracting segment.start.time from this we get roughly 35 min
"segment.total.docs": "100000",
"segment.realtime.numReplicas": "1",
"segment.creation.time": "1762511599197",
"segment.index.version": "v3",
"segment.crc": "3704033136",
"segment.realtime.status": "DONE",Rajasekharan A P
Rajasekharan A P
11/10/2025, 4:44 AM
Rajasekharan A P
11/11/2025, 12:32 PM
cluster.tenant.isolation.enable=false
Should it go inside controller.conf or pinot-controller.conf?
I’m running Pinot in Docker, and for the controller service, my command looks like this:
command: "StartController -zkAddress pinot-zookeeper:2181 -configFileName /opt/pinot/conf/pinot-controller.conf"
I added the configuration to pinot-controller.conf, but the controller container is failing to start.
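For what it's worth, a minimal sketch of what such a config file might contain (property names as in standard controller configs; values are examples, and the file name is simply whatever -configFileName points at):
# /opt/pinot/conf/pinot-controller.conf (example contents)
controller.zk.str=pinot-zookeeper:2181
controller.helix.cluster.name=PinotCluster
controller.port=9000
cluster.tenant.isolation.enable=false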
mathew
11/12/2025, 4:55 AM
Satya Mahesh
11/12/2025, 10:34 AM
Rashpal Singh
11/12/2025, 5:26 PM
Srinivasan Duraiswamy
11/13/2025, 2:33 AM
Rajat
11/13/2025, 6:10 AM
Milind Chaudhary
11/13/2025, 6:27 AM
Aashiq PS
11/13/2025, 8:02 AM
pinotAuth:
  enabled: true
  controllerFactoryClass: org.apache.pinot.controller.api.access.BasicAuthAccessControlFactory
  brokerFactoryClass: org.apache.pinot.broker.broker.BasicAuthAccessControlFactory
  configs:
    - access.control.principals=admin
    - access.control.principals.admin.password=<password>
Error:
org.apache.pinot.common.exception.HttpErrorStatusException: Got error status code: 401 (Unauthorized) with reason: "HTTP 401 Unauthorized" while sending request: /segmentConsumed?reason=forceCommitMessageReceived&streamPartitionMsgOffset=214683&instance=Server_prod-pinot-server-0.prod-pinot-server-headless.prod-pinot.svc.cluster.local_8098&name=views__0__0__20251113T0646Z&rowCount=21&memoryUsedBytes=84567768 to controller: prod-pinot-controller-2.prod-pinot-controller-headless.prod-pinot.svc.cluster.local, version: Unknown
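A sketch of the server-side settings that typically accompany controller basic auth (property names as in the Pinot basic-auth docs; verify against your version), since internal calls such as /segmentConsumed also need credentials:
# <token> = base64 of user:password for a principal the controller accepts
pinot.server.instance.auth.token=Basic <token>
pinot.server.segment.fetcher.auth.token=Basic <token>
pinot.server.segment.uploader.auth.token=Basic <token>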
Veerendra
11/13/2025, 9:03 AM
2025-11-12 16:11:54.373 ERROR [sample__121__287__20251105T1324Z] LLRealtimeSegmentDataManager_sample__66__339__20251105T1654Z - Could not send request http://pinot-controller-01.local:9000/segmentUpload?segmentSizeBytes=1073117101&buildTimeMillis=91591&streamPartitionMsgOffset=10967889843&instance=pinot-server-01.local_8098&offset=-1&name=sample__121__287__20251105T1324Z&rowCount=73431535&memoryUsedBytes=2015467107
org.apache.pinot.shaded.org.apache.http.conn.HttpHostConnectException: Connect to pinot-controller-01.local:9000 [pinot-controller-01.local/10.10.10.56] failed: Connection refused (Connection refused)
at org.apache.pinot.shaded.org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:156) ~[pinot-all-1.0.0-jar-with-dependencies.jar:1.0.0-b6bdf6c9686b286a149d2d1aea4a385ee98f3e79]
at org.apache.pinot.shaded.org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376) ~[pinot-all-1.0.0-jar-with-dependencies.jar:1.0.0-b6bdf6c9686b286a149d2d1aea4a385ee98f3e79]
at org.apache.pinot.shaded.org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393) ~[pinot-all-1.0.0-jar-with-dependencies.jar:1.0.0-b6bdf6c9686b286a149d2d1aea4a385ee98f3e79]
at org.apache.pinot.shaded.org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236) ~[pinot-all-1.0.0-jar-with-dependencies.jar:1.0.0-b6bdf6c9686b286a149d2d1aea4a385ee98f3e79]
at org.apache.pinot.shaded.org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) ~[pinot-all-1.0.0-jar-with-dependencies.jar:1.0.0-b6bdf6c9686b286a149d2d1aea4a385ee98f3e79]
at org.apache.pinot.shaded.org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) ~[pinot-all-1.0.0-jar-with-dependencies.jar:1.0.0-b6bdf6c9686b286a149d2d1aea4a385ee98f3e79]
at org.apache.pinot.shaded.org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[pinot-all-1.0.0-jar-with-dependencies.jar:1.0.0-b6bdf6c9686b286a149d2d1aea4a385ee98f3e79]
at org.apache.pinot.shaded.org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[pinot-all-1.0.0-jar-with-dependencies.jar:1.0.0-b6bdf6c9686b286a149d2d1aea4a385ee98f3e79]
at org.apache.pinot.shaded.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[pinot-all-1.0.0-jar-with-dependencies.jar:1.0.0-b6bdf6c9686b286a149d2d1aea4a385ee98f3e79]
at org.apache.pinot.shaded.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108) ~[pinot-all-1.0.0-jar-with-dependencies.jar:1.0.0-b6bdf6c9686b286a149d2d1aea4a385ee98f3e79]
Victor Bivolaru
11/14/2025, 10:20 AM
RealtimeToOfflineSegmentsTask (as opposed to MergeRollupTask, where this is not mentioned) keeps the data sorted in the segment it builds.
If this were the case, I would change the configs to seal smaller segments more frequently and let the minion bunch all these small segments up and create the larger, sorted segment for the offline table.
My question is whether there is any way of validating that the data inside the offline segments is indeed sorted by the column.
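One possible check (a sketch; host, table, segment, and column names are placeholders): the controller's per-segment metadata endpoint reports per-column details, including a sorted flag.
# Inspect column metadata for an offline segment; look for "sorted": true in the response
curl "http://localhost:9000/segments/<tableNameWithType>/<segmentName>/metadata?columns=<sortedColumn>"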
suraj sheshadri
11/20/2025, 2:19 AM
Neeraja Sridharan
11/20/2025, 4:49 AM
offline tables.
Appreciate any help in confirming whether the same applies to real-time tables as well 🙇‍♀️ Here is the associated reference for the Kafka stream, but it doesn't explicitly mention whether partition-based segment pruning can be set up for multiple columns in the corresponding Pinot table config: https://docs.pinot.apache.org/basics/getting-started/frequent-questions/ingestion-faq#how-do-i-enable-partitioning-in-pinot-when-using-kafka-stream
I guess the prerequisite is: the input Kafka stream needs to be configured with a custom partitioner to match the partition column(s), partition function, and number of partitions set up in the Pinot table config.
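For context, the table-config shape in question looks like the sketch below (column names and values are placeholders); whether a real-time table honors more than one entry in columnPartitionMap is exactly the open question:
"tableIndexConfig": {
  "segmentPartitionConfig": {
    "columnPartitionMap": {
      "memberId": {
        "functionName": "Murmur",
        "numPartitions": 32
      },
      "orderId": {
        "functionName": "Murmur",
        "numPartitions": 32
      }
    }
  }
}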