What is the reason for having time-boundary-based query routing for hybrid tables? If the table is configured for time-based pruning, can't we just use that to dispatch the query to the right segments/servers instead?
Mayank
06/27/2022, 11:28 PM
Because there may be overlap of data between offline and realtime.
Ashish
06/27/2022, 11:30 PM
Sure, but why is that an issue? How is that different from multiple realtime segments having overlapping data?
Mayank
06/27/2022, 11:30 PM
That is not overlap, that is replication.
Ashish
06/27/2022, 11:31 PM
Could you please elaborate on what exactly is overlapping data and what issue it can cause?
Mayank
06/27/2022, 11:33 PM
Overlap example -> RT has data from now going back 7 days. Offline has data from yesterday going back 1 year. Then the previous 6 days are present in both offline and realtime.
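To make the overlap concrete, here is a minimal sketch of the example above. It is not Pinot code: the day offsets, the one-row-per-day data, and the `count_rows` helper are all hypothetical, and placing the boundary one day behind the latest offline day is an assumption made for illustration. It only shows that adding results from both sides double counts the shared days, while splitting on a time boundary does not.

```python
# Hypothetical day offsets: 0 is today, -1 is yesterday, and so on.
REALTIME_DAYS = set(range(-6, 1))    # last 7 days, including today
OFFLINE_DAYS = set(range(-365, 0))   # yesterday going back 1 year

def count_rows(days, predicate=lambda d: True):
    """Count rows, pretending each day holds exactly one row."""
    return sum(1 for d in days if predicate(d))

# Days stored on both sides of the hybrid table.
overlap = REALTIME_DAYS & OFFLINE_DAYS

# Naive routing: run the same query on both sides and add the results.
naive_total = count_rows(OFFLINE_DAYS) + count_rows(REALTIME_DAYS)

# Time-boundary routing (assumed boundary: one day behind the latest offline day).
# Offline answers days <= boundary, realtime answers days > boundary.
time_boundary = max(OFFLINE_DAYS) - 1
split_total = (
    count_rows(OFFLINE_DAYS, lambda d: d <= time_boundary)
    + count_rows(REALTIME_DAYS, lambda d: d > time_boundary)
)

# Ground truth: each distinct day counted once.
distinct_total = len(OFFLINE_DAYS | REALTIME_DAYS)

print(len(overlap), naive_total, split_total, distinct_total)
```

In this sketch the naive total exceeds the distinct total by exactly the size of the overlap, while the split total matches it. Time-based pruning alone only decides which segments are worth scanning; it does not decide which side should answer for a day that both sides hold.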
Mayank
06/27/2022, 11:34 PM
Is your question mostly for understanding, or is there an issue?