# troubleshooting
d
Hello everyone, I am ingesting realtime data via Kafka with 55 records and have set “maxNumRecordsPerSegment” to 10. I expected 5 segments to be generated, but only 1 segment is showing as generated.
n
During each run, the RealtimeToOffline job will create only 1 segment. The next time it runs, it will generate a segment for the next time period.
The default frequency of the job is 1 hour.
This is described in the "How this works" section of the docs: https://docs.pinot.apache.org/operators/operating-pinot/pinot-managed-offline-flows#how-this-works
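For context, the RealtimeToOfflineSegmentsTask is configured under `taskTypeConfigsMap` in the realtime table config. A minimal sketch (table name and values are illustrative, not taken from this thread) might look like this:

```json
{
  "tableName": "myTable_REALTIME",
  "tableType": "REALTIME",
  "task": {
    "taskTypeConfigsMap": {
      "RealtimeToOfflineSegmentsTask": {
        "bucketTimePeriod": "1d",
        "bufferTimePeriod": "1d",
        "mergeType": "concat",
        "maxNumRecordsPerSegment": "10"
      }
    }
  }
}
```

Each scheduled run picks up at most one `bucketTimePeriod` window past the current watermark, which is why only one segment shows up per run; later runs advance the window to the next time period.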
d
@Neha Pawar, I have created a segment and data is ingested into the realtime table, and another segment is created after running the stream ingestion job, but the data is not moved from the realtime to the offline table. The first realtime segment's status is ‘Done’.
n
The window is moved 1 day at a time. It's going to wait until it is certain no more events will be coming for that time window. So if your first segment has data from, say, Jul 15 18:00:00 to Jul 15 22:00:00, that means all data for Jul 15th has not yet been received. Your next consuming segment has some data for Jul 15th 22:00:00 onwards. The RealtimeToOffline task will create the segment for Jul 15th only after it sees that the completed segment has crossed that day.
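To make that concrete, here is an illustrative snapshot (segment names, timestamps, and field names are invented to match the example above; this is not real Pinot API output) showing why the Jul 15 window is not yet eligible: the only completed segment ends at 22:00, before the window end of Jul 16 00:00:00.

```json
{
  "completedSegments": [
    {
      "segmentName": "myTable__0__0__20210715T1800Z",
      "startTime": "2021-07-15 18:00:00",
      "endTime": "2021-07-15 22:00:00"
    }
  ],
  "consumingSegment": {
    "segmentName": "myTable__0__1__20210715T2200Z",
    "startTime": "2021-07-15 22:00:00"
  },
  "realtimeToOfflineWindow": {
    "windowStart": "2021-07-15 00:00:00",
    "windowEnd": "2021-07-16 00:00:00",
    "eligible": false
  }
}
```

Once a completed (non-consuming) segment's end time crosses Jul 16 00:00:00, the next task run can safely materialize the offline segment for Jul 15th.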
d
@Neha Pawar, I have created segments using Kafka from 17th July 16:00:00 onwards, one per hour; each segment is consumed hourly and the raw data is stored. But when I checked the next date, 18th July, there is no data in the offline table, and the old segment has also been deleted, because ‘bufferTimePeriod’ in the realtime config is 1 day.
I have attached the realtime as well as the offline table config. Please let me know what I am missing in the config files.
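For reference (the attached configs are not reproduced in this thread), the two settings mentioned above live in different places: segment deletion on the realtime table is governed by the retention settings in `segmentsConfig`, while `bufferTimePeriod` belongs to the RealtimeToOfflineSegmentsTask config. An illustrative fragment with assumed values:

```json
{
  "tableName": "myTable_REALTIME",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "timeColumnName": "timestampColumn",
    "retentionTimeUnit": "DAYS",
    "retentionTimeValue": "5"
  },
  "task": {
    "taskTypeConfigsMap": {
      "RealtimeToOfflineSegmentsTask": {
        "bucketTimePeriod": "1d",
        "bufferTimePeriod": "1d"
      }
    }
  }
}
```

If the realtime retention is shorter than the time the task needs to move a day's data, completed segments can be deleted before they are converted, so the retention generally needs to cover at least `bucketTimePeriod` plus `bufferTimePeriod` with some slack.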