magax90515
10/05/2025, 11:08 AM
Will org.apache.pinot:pinot-common:1.4.0 be published to Maven? org.apache.pinot:pinot-java-client:1.4.0 has been published, but it depends on pinot-common, which has not been published.

Yeshwanth
10/07/2025, 7:10 AM
… jute.maxbuffer) due to the large segment metadata.
We have reviewed the official troubleshooting documentation, which suggests two primary solutions:
1. Decrease the number of segments: We cannot use rollups or further merge segments, as our current segment size is already optimized at ~300MB, and we need to maintain data granularity for our query performance.
2. Increase `jute.maxbuffer`: We view this as a last resort, as we are concerned about the potential downstream performance impacts on the ZooKeeper cluster.
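For reference, when people do resort to option 2, the same JVM system property has to be raised on both the ZooKeeper side and every Pinot component that talks to ZooKeeper. A sketch — the 4 MB value is purely illustrative, and the exact env files depend on how your processes are launched:

```sh
# ZooKeeper side (e.g. conf/zookeeper-env.sh) -- 4 MB is an illustrative value
export SERVER_JVMFLAGS="-Djute.maxbuffer=4194304"

# Pinot side: controller, broker, server, and minion all need the same
# system property, e.g. via JAVA_OPTS picked up by the start scripts
export JAVA_OPTS="$JAVA_OPTS -Djute.maxbuffer=4194304"
```

The limit is enforced on both client and server ends, so mismatched values between the two is a common pitfall.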
Given these constraints, we have a few questions:
• What are the recommended strategies for managing ZNode size in a table with a very high segment count, beyond the two options mentioned above?
• Is there a practical or theoretical upper limit on the number of segments a single Pinot table can efficiently handle before ZK performance degrades?
• Are there alternative configurations or architectural approaches we should consider for this scenario?

Gerald Bonfiglio
10/07/2025, 6:59 PM
Failed to collect dependencies at org.apache.pinot:pinot-jdbc-client:jar:1.4.0:
Failed to read artifact descriptor for org.apache.pinot:pinot-jdbc-client:jar:1.4.0: The following artifacts could not be resolved: org.apache.pinot:pinot:pom:1.4.0 (absent)
Checking in Maven Central, pinot-1.4.0 doesn't seem to be there. Are there plans to push the remaining 1.4.0 jars to Maven Central? Or are we missing something else?

robert zych
10/09/2025, 4:02 PM

mg
10/10/2025, 8:13 AM

Shubham Kumar
10/10/2025, 9:59 AM

Arnav
10/13/2025, 9:04 AM

RANJITH KUMAR
10/13/2025, 3:26 PM

RANJITH KUMAR
10/14/2025, 1:58 PM

Paulc
10/21/2025, 8:57 AM

Shubham Kumar
10/22/2025, 4:19 AM
{
  "tableName": "logicalTable",
  "physicalTableConfigMap": {
    "user_stream_REALTIME": {},
    "user_batch_OFFLINE": {}
  },
  "refOfflineTableName": "user_batch_OFFLINE",
  "refRealtimeTableName": "user_stream_REALTIME",
  "brokerTenant": "DefaultTenant",
  "timeBoundaryConfig": {
    "boundaryStrategy": "min",
    "parameters": {
      "function": "min"
    }
  }
}

Xiang Fu
Arnav
10/28/2025, 9:46 AM
Query 1:
SELECT * FROM table
WHERE customer_id = 1234
AND msisdn IN ( ..1000 msisdns)
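A minimal sketch (an illustration, not code from the thread) of how the batched shape of Query 2 can be generated client-side; the batch size of 350, the string quoting of msisdns, and the table name are assumptions:

```python
# Illustrative client-side helper: split a large msisdn list into fixed-size
# batches and emit a single UNION ALL query, mirroring the shape of Query 2.
def batched_in_query(table, customer_id, msisdns, batch_size=350):
    batches = [msisdns[i:i + batch_size] for i in range(0, len(msisdns), batch_size)]
    selects = [
        "SELECT * FROM {t} WHERE customer_id = {c} AND msisdn IN ({ids})".format(
            t=table, c=customer_id, ids=", ".join("'{}'".format(m) for m in batch)
        )
        for batch in batches
    ]
    return "\nUNION ALL\n".join(selects)

# 1000 ids with batch_size=350 -> batches of 350, 350, 300 -> 3 SELECTs
sql = batched_in_query("append_iot_session_events", 1234, [str(n) for n in range(1000)])
```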
Query 2:
SELECT * FROM table
WHERE customer_id = 1234
AND msisdn IN ( ..350 msisdns)
UNION ALL
SELECT * FROM append_iot_session_events
WHERE customer_id = 1234
AND msisdn IN (..350 msisdns)
UNION ALL
SELECT * FROM append_iot_session_events
WHERE customer_id = 1234
AND msisdn IN (..300 msisdns)

robert zych
10/29/2025, 2:44 PM

Matt Nawara
10/30/2025, 12:26 PM
"All metrics must have aggregation configs." I feel like this is at the heart of what we are seeing now; in essence, you can't update the schema with a new metric, as the API says:
PUT schema response: {'code': 400, 'error': 'Invalid schema: staging_stream_st_mknaw_idle_worker_test14_sg_12. Reason: Schema is incompatible with tableConfig with name: staging_stream_st_mknaw_idle_worker_test14_sg_12_REALTIME and type: REALTIME'}
and, probably correctly, it fails the other way around too: trying to get the table update in before the schema update also does not work:
PUT table response: {'code': 400, 'error': "Invalid table config: staging_stream_st_mknaw_idle_worker_test14_sg_12 with error: The destination column 'mtr_clicks_sum' of the aggregation function must be present in the schema"}
so... is the implication that a Pinot schema/table pair that has ingestion aggregation can never evolve? That would be unfortunate.

Gerald Bonfiglio
10/30/2025, 5:08 PM

mg
11/04/2025, 10:44 AM

Arnav
11/11/2025, 4:25 AM
"task": {
  "taskTypeConfigsMap": {
    "UpsertCompactionTask": {
      "schedule": "0 0 */4 ? * *",
      "bufferTimePeriod": "1h",
      "invalidRecordsThresholdPercent": "0",
      "invalidRecordsThresholdCount": "1",
      "validDocIdsType": "SNAPSHOT"
    }
  }
},
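For reference, a later config in this same thread adds tableMaxNumTasks to this task, which caps how many compaction subtasks are generated per scheduled run (so with enough minions, a higher value allows more parallel work). A sketch of that shape, with an illustrative value:

```json
"UpsertCompactionTask": {
  "schedule": "0 0 */4 ? * *",
  "bufferTimePeriod": "1h",
  "invalidRecordsThresholdCount": "1",
  "validDocIdsType": "SNAPSHOT",
  "tableMaxNumTasks": "40"
}
```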
It's taking too much time; how can I optimize it?

Satya Mahesh
11/12/2025, 10:22 AM

RANJITH KUMAR
11/14/2025, 11:05 AM

Suresh PERUML
11/14/2025, 3:57 PM

Xiang Fu
Arnav
11/17/2025, 6:10 AM

Qosimjon Mamatqulov
11/18/2025, 10:51 AM

San Kumar
11/20/2025, 11:20 AM

Eric Wohlstadter
11/20/2025, 9:43 PM

Arnav
11/24/2025, 6:26 AM
"task": {
  "taskTypeConfigsMap": {
    "UpsertCompactionTask": {
      "schedule": "0 */5 * ? * *",
      "bufferTimePeriod": "1h",
      "invalidRecordsThresholdCount": "1",
      "tableMaxNumTasks": "40",
      "validDocIdsType": "SNAPSHOT"
    },
    "UpsertCompactMergeTask": {
      "schedule": "0 0 */1 ? * *",
      "bufferTimePeriod": "1m",
      "maxNumSegmentsPerTask": "100",
      "maxNumRecordsPerSegment": "50000000"
    }
  }
}

Prateek Garg
11/24/2025, 12:01 PM
… pinot.grpc.port
I'd appreciate clarification on the following points:
• Does the gRPC port mentioned in Trino documentation refer to Pinot Server's gRPC port, or Broker's gRPC API port?
• If it refers to Pinot Server's port, do we have any mechanism to make it work for our configuration, where there are multiple server instances per machine having different gRPC ports?
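For reference, the Trino-side knobs in question live in the Pinot catalog properties and look roughly like this (values illustrative); note that a single cluster-wide pinot.grpc.port property does appear to assume one gRPC port across all servers, which is exactly the crux of the multi-instance-per-machine question:

```properties
# etc/catalog/pinot.properties on the Trino side -- values illustrative
connector.name=pinot
pinot.controller-urls=controller1:9000
pinot.grpc.enabled=true
pinot.grpc.port=8090
```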
Documents for Reference:
https://trino.io/docs/current/connector/pinot.html#grpc-configuration-properties
https://docs.pinot.apache.org/users/api/broker-grpc-api

RANJITH KUMAR
11/26/2025, 8:52 AM

RANJITH KUMAR
11/26/2025, 8:54 AM