Hello, is there a document that explains the “cuto...
# troubleshooting
m
Hello, is there a document that explains the “cutoff” time in detail for data handled by time (and by primary key, if relevant)? I am asking because it seems that I have a record that is present in both OFFLINE and REALTIME (with the same “primary key”), but when I look for it in the final table I do not find it at all.
OFFLINE: record time 1615891108000 (ms), max 1615939199415, min 1612137600000
REALTIME: record time 1615723114000 (ms), max 1615981903000, min 1515496517000
FINAL: record not present, max 1615981930000, min 1612137600000
m
@Jackie are there any limitations/issues with time boundary computation when time unit is millis?
m
Thank you for the doc
j
There are no limitations. @Matteo Satnero Does the OFFLINE table have the same records as the REALTIME table for the overlapping time? They should be the same in order to return the correct result
m
Offline and realtime have a set of records that overlap in time; within that overlap, some records can have the same primary key but different data and a different time.
In this case the pk was the same in both, the data and the time were different, and the result in the “final” one was empty
j
I don't fully follow here. Because you mentioned primary key here, I assume you enabled upsert for the table? Upsert only works on realtime-only tables, not on hybrid tables. Also, why do the real-time records have a much wider span than the offline records? FYI, here is the definition of the hybrid table: https://docs.pinot.apache.org/basics/components/table#hybrid-table
m
To clarify, I have the key just in case of behaviours related to upsert; I assumed upsert is not active at the moment on the offline part. The realtime set is refreshed more frequently than the offline one (which is appended on a daily basis). The realtime part has a span that can partly overlap the offline one (offline is historical data that is filled day by day, while realtime starts from a day X older than the first time in offline). Thank you very much for the docs, I'll read them later. Have a great day
Just to be sure, the time boundary cut is at 86400000 ms (24h) before the last time I have on OFFLINE, with OFFLINE serving the records before it (OL <) and REALTIME serving the records after it (RT >=), right?
m
@Matteo Satnero From the code I see that for hourly tables, we go back 1 hour. For other time units (including millis) we go back 24h.
// For HOURLY table with time unit other than DAYS, use (maxEndTime - 1 HOUR) as the time boundary; otherwise, use
    // (maxEndTime - 1 DAY)
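As a rough illustration of the rule in that comment, here is a minimal Python sketch that applies it to the numbers posted earlier in this thread. The function names (`time_boundary`, `is_served`) are made up for this example and are not Pinot APIs. It also assumes that the offline side serves `time <= boundary` and the realtime side serves `time > boundary`; the exact inclusive/exclusive comparison was not confirmed in this thread, so treat that as an assumption.

```python
DAY_MS = 24 * 60 * 60 * 1000   # 86400000 ms
HOUR_MS = 60 * 60 * 1000       # 3600000 ms


def time_boundary(max_offline_end_ms: int, hourly_push: bool = False) -> int:
    """Hypothetical helper: go back 1 hour for hourly tables, otherwise 1 day."""
    return max_offline_end_ms - (HOUR_MS if hourly_push else DAY_MS)


def is_served(record_time_ms: int, table_type: str, boundary_ms: int) -> bool:
    """Hypothetical helper: would this side of the hybrid table return the record?

    Assumption: OFFLINE serves time <= boundary, REALTIME serves time > boundary.
    """
    if table_type == "OFFLINE":
        return record_time_ms <= boundary_ms
    return record_time_ms > boundary_ms


# Values taken from the first message in this thread (all in milliseconds).
offline_max = 1615939199415
offline_record = 1615891108000
realtime_record = 1615723114000

boundary = time_boundary(offline_max)  # millis time unit, so go back 24h

offline_hit = is_served(offline_record, "OFFLINE", boundary)
realtime_hit = is_served(realtime_record, "REALTIME", boundary)

print("boundary:", boundary)
print("served from OFFLINE:", offline_hit)
print("served from REALTIME:", realtime_hit)
```

Under these assumptions the boundary lands at 1615852799415. The OFFLINE copy of the record is newer than the boundary and the REALTIME copy is older than it, so neither side would return it, which would be consistent with the record missing from the final table.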
m
Thank you very much @Mayank
👍 1