Raghavendra M
07/31/2025, 4:46 AM
Aman Satya
07/31/2025, 9:46 AM
Shubham Kumar
07/31/2025, 5:35 PM
.txt?
• columns.psf
• creation.meta
• validdocids.bitmap.snapshot
• ttl.watermark.partition.0
Additionally, I would appreciate it if you could explain the purpose of each of these files.
Shivam Sharma
08/01/2025, 11:12 AM
San Kumar
08/02/2025, 3:43 AM
Xiang Fu
San Kumar
08/05/2025, 4:22 PM
Xiang Fu
Mohemmad Zaid
08/06/2025, 6:30 AM
spaces is a multi-value column.
{
"dimensionsSplitOrder": [
"pdate"
],
"functionColumnPairs": [
"DISTINCTCOUNTHLLMV__spaces"
]
}
https://github.com/apache/pinot/blob/master/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java#L1309
IMO, we can skip this check for the aggregation column.
Raghavendra M
08/06/2025, 7:52 AM
Shubham Kumar
08/07/2025, 9:33 AM
Zaeem Arshad
08/08/2025, 1:01 PM
Prathamesh
08/09/2025, 9:52 AM
San Kumar
08/12/2025, 3:25 AM
Zaeem Arshad
08/12/2025, 3:47 AM
Arnav
08/12/2025, 4:23 AM
arnavshi
08/12/2025, 7:05 AM
Forbidden: updates to statefulset spec for fields other than 'replicas', 'ordinals', 'template', 'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden
While I understand that this is a Kubernetes issue/limitation, I wanted your guidance on what can be done to resolve this.
San Kumar
08/12/2025, 11:09 AM
am_developer
08/12/2025, 11:31 AM
Abdulaziz Alqahtani
08/14/2025, 11:02 AM
• availabilityLagMsMap from /consumingSegmentsInfo → reports ~200–400 ms for me.
• endToEndRealtimeIngestionDelayMs from Prometheus → shows a “saw-tooth” pattern, peaking around 5 seconds.
Can someone explain the difference between these two metrics, why they report different values, and whether the saw-tooth pattern is expected?
Idlan Amran
08/18/2025, 2:38 AM
profile, each JSON will have around 5M rows, so the JSON and segment sizes should stay consistent:
SELECT shop, svid, spid, type, profile, "key", message, product,
CAST(MAX(created_at) AS TIMESTAMP) AS created_at,
ARRAY_AGG(product_log, 'STRING', TRUE) AS product_log
FROM product_tracking
WHERE profile = {profile}
AND created_at >= CAST(DATE_TRUNC('DAY', timestampAdd(DAY,{-lookback_days},NOW()), 'MILLISECONDS','GMT-04:00') AS TIMESTAMP)
AND created_at < CAST(DATE_TRUNC('DAY', timestampAdd(DAY,0,NOW()), 'MILLISECONDS','GMT-04:00') AS TIMESTAMP)
GROUP BY shop, svid, spid, type, profile, "key", message, product
LIMIT 999999999
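For anyone running the query above once per profile, here is a minimal sketch of rendering it from a script. The helper name `render_query`, the `$profile` / `$offset` placeholders, and the assumption that `profile` is an integer are all hypothetical and not from the original message; the original's `{profile}` and `{-lookback_days}` placeholders are swapped for Python's `string.Template` syntax.

```python
from string import Template

# Same query as above; $profile and $offset replace the original
# {profile} and {-lookback_days} placeholders (hypothetical names).
QUERY_TEMPLATE = Template("""\
SELECT shop, svid, spid, type, profile, "key", message, product,
       CAST(MAX(created_at) AS TIMESTAMP) AS created_at,
       ARRAY_AGG(product_log, 'STRING', TRUE) AS product_log
FROM product_tracking
WHERE profile = $profile
  AND created_at >= CAST(DATE_TRUNC('DAY', timestampAdd(DAY,$offset,NOW()), 'MILLISECONDS','GMT-04:00') AS TIMESTAMP)
  AND created_at < CAST(DATE_TRUNC('DAY', timestampAdd(DAY,0,NOW()), 'MILLISECONDS','GMT-04:00') AS TIMESTAMP)
GROUP BY shop, svid, spid, type, profile, "key", message, product
LIMIT 999999999""")


def render_query(profile: int, lookback_days: int) -> str:
    """Fill in the template for one profile.

    Assumes profile is an integer ID (an assumption, the original does
    not say). Both values are coerced to int so that no arbitrary text
    can end up inside the SQL string.
    """
    lookback_days = int(lookback_days)
    if lookback_days <= 0:
        raise ValueError("lookback_days must be a positive number of days")
    # The query expects a negative day offset relative to NOW().
    return QUERY_TEMPLATE.substitute(profile=int(profile), offset=-lookback_days)


if __name__ == "__main__":
    print(render_query(profile=42, lookback_days=7))
```

Coercing both parameters to `int` is a deliberate choice here: it keeps the rendered SQL safe without a quoting layer, but it would need to change if `profile` is actually a string column.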
I'd appreciate any insights or feedback from other Pinot OSS users, thanks.
Rishabh Sharma
08/18/2025, 12:37 PM
San Kumar
08/19/2025, 5:28 AM
San Kumar
08/19/2025, 5:54 AM
kranthi kumar
08/19/2025, 1:29 PM
Milind Chaudhary
08/20/2025, 5:49 AM
Indira Vashisth
08/21/2025, 12:52 PM
Shubham Kumar
08/21/2025, 1:00 PM
tar.gz, such as zstd or Snappy?
2. I created an index on a column (col1) and ingested data. Suppose a segment contains 50 records, and I run a query with the condition col1 = 'xyz'. In this case, does Pinot load the entire segment into memory and then filter the records, or does it directly fetch only the matching data from the segment?
Sandeep R
08/25/2025, 11:36 PM
Jan Siekierski
08/27/2025, 11:33 AM