Huib
10/18/2023, 8:09 AM
least(eventTime, kafkaTime) as eventTime
-> Kafka -> Flink (with watermark on the new eventTime)
Having to go to and from Kafka adds quite a bit of complexity and overhead to the pipeline, and we’d really like to keep the data in Flink instead. Is there a way to do this?
Martijn Visser
10/18/2023, 8:12 AM
Huib
10/18/2023, 8:26 AM
eventTime (IoT)     | timestamp (Kafka)
=========================================
2023-01-01T01:01:00 | 2023-01-01T01:00:00 < record from the future
2023-01-01T01:00:00 | 2023-01-01T01:00:01
2023-01-01T00:59:00 | 2023-01-01T01:00:02 < late record
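To make the effect of least(eventTime, kafkaTime) concrete, here is a minimal sketch in plain Python (not Flink API code) that runs the three sample rows above through a simplified bounded-out-of-orderness watermark. The 5-second bound, the `classify` helper, and the per-record watermark update are assumptions for illustration only; Flink emits watermarks periodically, so real behaviour may differ in detail.

```python
from datetime import datetime, timedelta

# Sample rows from the table above: (eventTime from the IoT device, Kafka timestamp).
ROWS = [
    (datetime(2023, 1, 1, 1, 1, 0), datetime(2023, 1, 1, 1, 0, 0)),   # record from the future
    (datetime(2023, 1, 1, 1, 0, 0), datetime(2023, 1, 1, 1, 0, 1)),   # on time
    (datetime(2023, 1, 1, 0, 59, 0), datetime(2023, 1, 1, 1, 0, 2)),  # late record
]

# Assumed out-of-orderness bound; this value does not come from the thread.
MAX_OUT_OF_ORDERNESS = timedelta(seconds=5)


def classify(rows, clamp):
    """Return (timestamp used, is_late) per row under a simplified watermark model."""
    max_seen = None
    result = []
    for event_time, kafka_time in rows:
        # With clamping, a record can never claim a time later than its Kafka timestamp.
        ts = min(event_time, kafka_time) if clamp else event_time
        # Watermark before this record: highest timestamp seen so far minus the bound.
        watermark = max_seen - MAX_OUT_OF_ORDERNESS if max_seen is not None else None
        is_late = watermark is not None and ts < watermark
        result.append((ts, is_late))
        max_seen = ts if max_seen is None else max(max_seen, ts)
    return result


for label, clamp in (("raw eventTime", False), ("least(eventTime, kafkaTime)", True)):
    print(label)
    for ts, is_late in classify(ROWS, clamp):
        print(f"  {ts.isoformat()}  late={is_late}")
```

In this simplified model, the raw eventTime lets the record from the future push the watermark ahead, so the on-time second row is also treated as late. With the clamped timestamp, only the third row falls behind the watermark.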
Of course we could just filter the records ourselves (without relying on the watermarks) by comparing the Kafka time with the event time, but we’d then lose some of the nice properties of the watermarks + windows.
Martijn Visser
10/18/2023, 8:44 AM
Huib
10/18/2023, 8:46 AM
RootedLabs
11/04/2023, 5:40 PM