Hi, I'm experiencing a strange issue today. My realtime table has a bucketTimePeriod of 1d and a bufferTimePeriod of 2d. My offline workflows are not running, and I can see a message in the logs that says "Window data overflows into CONSUMING segments for partition of segments..." followed by "Found no eligible segments for task: RealtimeToOfflineSegmentsTask with window [1555200000 - 1641600000]. Skipping task generation...". Note that these timestamps appear to be in seconds instead of milliseconds. I can see the segment.end.time and segment.start.time values are in seconds, and I'm not sure whether that was the case before. Looking through the code, I can see TimeUtils computes the window in milliseconds, which is why the window appears to span roughly two years instead of two days. I'm trying to figure out why this is happening now; any help is appreciated.
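(For anyone following along, here is a rough sketch of the arithmetic behind the symptoms. The event time `1_633_363_200_000` is an illustrative value, and the division by 1000 assumes the standard micros-to-millis unit conversion, which is what turned out to be happening here.)

```python
from datetime import datetime, timezone

# A real event time in epoch MILLISECONDS (illustrative value).
event_time_ms = 1_633_363_200_000  # 2021-10-04T00:00:00Z

# If the schema declares the column as MICROSECONDS, the micros -> millis
# conversion divides by 1000, so the stored segment start/end times come
# out a factor of 1000 too small -- i.e. they look like epoch SECONDS.
stored_time = event_time_ms // 1000
print(stored_time)  # 1633363200 -- looks like epoch seconds

# Interpreted back as millis, that value lands in January 1970:
print(datetime.fromtimestamp(stored_time / 1000, tz=timezone.utc).year)

# Meanwhile the task window really is computed in millis: the window from
# the log message above spans exactly one day (the bucketTimePeriod) --
# it only looks like ~2.7 years if you read its bounds as seconds.
one_day_ms = 24 * 60 * 60 * 1000
print(1_641_600_000 - 1_555_200_000 == one_day_ms)  # True
```

So the window itself is correct; it just starts at a watermark derived from second-scale segment times, which is why it never reaches the real data.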
10/04/2021, 4:11 PM
Do you mind pasting your table and task configs? I'm guessing it's an issue with the time unit. cc: @User
10/04/2021, 4:28 PM
It seems the schema was defined in microseconds and the table in milliseconds. Is there an easy way to fix this issue?
10/04/2021, 4:59 PM
we only read what is set in the schema; the “timeColumnUnit” in the table config is ignored (it's also deprecated). So the schema was MICROSECONDS and the data had values in millis?
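For reference, a schema fragment declaring the time column correctly might look like the following (`eventTime` is a placeholder column name; the `format` string is what matters, and it must match the unit of the actual data):

```json
{
  "dateTimeFieldSpecs": [
    {
      "name": "eventTime",
      "dataType": "LONG",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:MILLISECONDS"
    }
  ]
}
```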
it would be cleanest to just restart the realtime table from scratch, because any segments that have completed so far will have incorrect start/end times
10/04/2021, 5:11 PM
Yup, the data contains milliseconds
I wouldn't like to lose the test data I've got, because I still don't have a proper backfilling process in place. I could write a script to update the znodes, but that's effort I'd rather spend on backfilling
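(If anyone does go the znode route, here is a hypothetical sketch of the transformation such a script would apply. It assumes the stored times are exactly a factor of 1000 too small, per the micros-to-millis conversion discussed above; the metadata field names match the `segment.start.time` / `segment.end.time` keys mentioned earlier, but how you read and write the znodes themselves, e.g. via a ZooKeeper client, is up to you, and you should verify the segment's time unit field as well before touching anything.)

```python
def fix_segment_times(metadata: dict) -> dict:
    """Return a copy of segment ZK metadata with times scaled back to epoch millis.

    Assumes the stored values are millis that were wrongly divided by 1000
    (declared-MICROSECONDS schema applied to millisecond data).
    """
    fixed = dict(metadata)
    for key in ("segment.start.time", "segment.end.time"):
        if key in fixed:
            fixed[key] = str(int(fixed[key]) * 1000)
    return fixed

broken = {"segment.start.time": "1633363200", "segment.end.time": "1633449600"}
print(fix_segment_times(broken))
# {'segment.start.time': '1633363200000', 'segment.end.time': '1633449600000'}
```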