Pinot Upsert Question: Upsert is supported only f...
# general
j
Pinot Upsert Question: Upsert is supported only for realtime tables. That’s fine. The time column is used to determine the order of the updates and choose the latest one. What time is used to determine when to evict a row (visible or not)? The documentation tends to point to segment age to determine when to evict messages. In practice it seems to evict based on when the row was actually imported. What’s the expected behavior for a realtime (upsert) table?
j
If a row is not updated by another row with a newer timestamp, then it will expire along with the segment containing it. The segment is expired based on the latest timestamp within the segment and the retention config.
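To make the retention rule above concrete, here is a minimal sketch of it in Python. It is not Pinot's actual implementation: the `Segment` class, the `is_expired` function, and the dates are hypothetical, and it only assumes what was said above, namely that a segment is dropped as a whole once its latest timestamp falls outside the retention window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Segment:
    name: str
    end_time: datetime  # latest time-column value among the rows in the segment


def is_expired(segment: Segment, retention: timedelta, now: datetime) -> bool:
    # The segment (and every row in it) expires once its newest
    # timestamp is older than the retention window.
    return segment.end_time < now - retention


# Hypothetical values, chosen only for illustration.
now = datetime(2021, 3, 10, tzinfo=timezone.utc)
retention = timedelta(days=7)
old_segment = Segment("seg_0", datetime(2021, 3, 1, tzinfo=timezone.utc))
recent_segment = Segment("seg_1", datetime(2021, 3, 8, tzinfo=timezone.utc))

expired_names = [s.name for s in (old_segment, recent_segment)
                 if is_expired(s, retention, now)]
```

In this sketch the import time of a row plays no role; only the time-column values inside the segment do.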
j
We currently have a convention where our rows are versioned with a number. We’re using this as our time column. Part of the reasoning for this is to ensure that if we reprocess our Flink job, we won’t overwrite rows in Pinot with old data. But the ordering and the retention are both controlled by the time column, correct? Is there a good mechanism to control ordering more directly, rather than relying on the same time column used for retention?
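A rough sketch of the ordering behavior we rely on, written as plain Python rather than anything Pinot-specific. The `upsert` helper, the keys, and the values are hypothetical, and treating an equal version as a replacement is an assumption of this sketch.

```python
from typing import Dict, Tuple

# primary key -> (version, value) of the row currently visible
Table = Dict[str, Tuple[int, str]]


def upsert(table: Table, key: str, version: int, value: str) -> None:
    current = table.get(key)
    # Keep the incoming row only if its version is at least as new as
    # the stored one (tie-breaking is an assumption of this sketch).
    if current is None or version >= current[0]:
        table[key] = (version, value)


table: Table = {}
upsert(table, "order-1", 2, "shipped")
upsert(table, "order-1", 1, "created")  # replayed older row, ignored
upsert(table, "order-2", 1, "created")
```

With a version number in the time column, a replayed row with a lower version never replaces the visible one, but the same values would then also be what retention compares against.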
j
IIUC, the requirement is the same as this issue: https://github.com/apache/incubator-pinot/issues/6523?
It is not supported yet.