Weixiang Sun
09/25/2021, 12:10 AM
1. The Kafka stream data has two time columns: processed_at and created_at.
2. The processed_at column is in order within the Kafka stream.
3. The created_at column is out of order within the Kafka stream.
The retention of the realtime Pinot table depends on created_at.
If we use created_at as timeColumnName, a lot of stale segments can be created, since created_at can be very old.
If we use processed_at as timeColumnName, a lot of old orders can live in the realtime table.
Do you guys have any suggestions about which one to choose as timeColumnName?

Subbu Subramaniam
09/25/2021, 5:05 PM
Records are retained while `createdAt > now - R` (where R is the retention). If R is high, then you need those old segments, so why are you worried about a lot of stale segments? Are the values of createdAt so random that all records ever ingested would be retained? As long as newer records in the Kafka topic have reasonable values for `createdAt` (i.e. higher than older ones), I would use createdAt as the time column. If necessary, you can add a filter at the time of creating the table to drop records with createdAt earlier than epoch `tableCreationTime - R`. On the other hand, if the createdAt values are all over the map all the time, then maybe what you need is a REFRESH table with no time column.

Weixiang Sun
09/28/2021, 4:28 PM
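[Editor's note] The table-creation-time filter Subbu suggests can be sketched with Pinot's ingestion `filterConfig`, where records for which `filterFunction` evaluates to true are dropped at ingestion. The table name, column name, and the literal cutoff epoch below are illustrative assumptions; the cutoff would be computed once, at table creation, as `tableCreationTime - R` in epoch millis:

```json
{
  "tableName": "orders",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "timeColumnName": "created_at",
    "retentionTimeUnit": "DAYS",
    "retentionTimeValue": "30"
  },
  "ingestionConfig": {
    "filterConfig": {
      "filterFunction": "Groovy({created_at < 1629849600000}, created_at)"
    }
  }
}
```

With this in place, records whose created_at falls before the fixed cutoff never produce segments, so the out-of-order created_at column can still serve as timeColumnName without generating a pile of immediately-stale segments.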