Hi,
I was performing stream ingestion through Kafka on a standalone machine. I had 5 Kafka partitions and hence 5 consuming segments in Pinot. The parameter "segment.flush.threshold.size" is set to 10000. When I try ingesting 100k records, only 50k records are visible. Does flushing of the consuming segment take time to reflect, or is 50k an upper bound imposed by that configuration?
Mark Needham
02/23/2022, 2:42 PM
Do you mean you only see 50k records in Pinot?
Mayank
02/23/2022, 2:50 PM
Size is in bytes. If you want to control the number of rows, there is a different setting.
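For context, a rough sketch of where these thresholds live in a Pinot realtime table config. Property names here follow the Pinot documentation (`realtime.segment.flush.threshold.rows` for a row-count limit, `realtime.segment.flush.threshold.segment.size` for a byte-size target); exact names and behavior can vary by Pinot version, so treat this as an illustrative fragment, not a definitive config:

```json
{
  "tableIndexConfig": {
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.topic.name": "myTopic",

      "realtime.segment.flush.threshold.rows": "10000",

      "realtime.segment.flush.threshold.segment.size": "100M",

      "realtime.segment.flush.threshold.time": "6h"
    }
  }
}
```

Whichever threshold is hit first triggers the segment commit; none of them caps how many records are queryable overall, since a new consuming segment takes over after each commit.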
KISHORE B R
02/23/2022, 2:57 PM
Yes, I am able to view only 50k records.
Mark Needham
02/23/2022, 2:58 PM
Hmmm. Flushing doesn't affect whether you can view records; it only controls when a new segment gets created. You should be able to see the records as soon as they are ingested from Kafka.
KISHORE B R
02/23/2022, 2:58 PM
What actually happens if a segment reaches the max limit of any configured threshold, whether rows, size, or time?
Mark Needham
02/23/2022, 2:59 PM
The segment will be committed to the deep store and a new one will be created. The new one is where any new messages are ingested.
KISHORE B R
02/23/2022, 3:00 PM
Okay, then why am I not able to view all of the records?