# troubleshooting
s
Hi All! We are planning on using `setUnbounded(OffsetsInitializer)` in the Flink Kafka connector to make sure the Flink job stops reading after a specific timestamp offset on all partitions. I have a few questions, since the documentation of Flink's behaviour here is sparse.

1. I want to process data from Kafka based on timestamp offsets and stop at a specific timestamp per partition. The job will run once per day, read the data that was ingested into Kafka for that specific day, and stop processing once it reaches the end-of-day timestamp offset on all partitions. Since we have late events coming in and we want to process them, it is imperative that we keep the state of the window aggregations for at least 3 days. Is there a way to take a savepoint at the end of the job and use it to restore processing the next day, with only the unbounded offsets initializer changed?
2. If so, is there any configuration that lets us take a savepoint once the final offsets are reached?
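For context, the daily read described above can be sketched with the `KafkaSource` builder from the DataStream Kafka connector. This is a minimal sketch, not the asker's actual job: the topic name, broker address, and deserialization are placeholders, and it uses `setBounded(...)` (job finishes at the stopping offsets) rather than `setUnbounded(...)` (job keeps running after the stopping offsets are reached):

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.ZoneOffset;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class DailyBoundedKafkaJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Read exactly one UTC day's worth of data by timestamp offsets.
        long startOfDayMillis = LocalDate.now(ZoneOffset.UTC)
                .atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
        long endOfDayMillis = startOfDayMillis + Duration.ofDays(1).toMillis();

        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("broker:9092")   // placeholder
                .setTopics("events")                  // placeholder
                .setStartingOffsets(OffsetsInitializer.timestamp(startOfDayMillis))
                // setBounded(...) lets the job terminate once every partition
                // reaches the end-of-day offsets; setUnbounded(...) with the
                // same initializer stops *reading* there but keeps the job
                // running in streaming mode, so it can still be stopped with
                // a savepoint.
                .setBounded(OffsetsInitializer.timestamp(endOfDayMillis))
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> stream = env.fromSource(
                source, WatermarkStrategy.forMonotonousTimestamps(), "kafka-daily");

        // ... windowed aggregations and sinks go here ...
        env.execute("daily-bounded-kafka-job");
    }
}
```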
m
At first I thought you should use the "Start Reading Position" and "Bounded End Position" to create a bounded Kafka job, see https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/kafka/#start-reading-position But the "window aggregation state of at least 3 days" confuses me. What's your use case for keeping window state that long while not running unbounded jobs?
s
We use the DataStream API to aggregate the data that comes from Kafka. Since some of our jobs don't need to run continuously, we want to run them once per day: each run ingests that day's data based on timestamps, processes it, and then shuts down. The data in Kafka can contain late events which we need to process when the job comes back up, and for that to happen we need to maintain state.
The window size is not 3 days; the allowed lateness is 3 days.
m
> The allowed lateness is 3 days
Have you read the late elements considerations you need to take into account? https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/windows/#late-elements-considerations
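The main caveat on that page is that windows with allowed lateness keep their state alive past the watermark, and anything later than the allowed lateness is dropped unless routed to a side output. A hedged sketch of what such a pipeline might look like — `Event`, `getKey`, and `MyAggregateFunction` are hypothetical placeholders, and the 1-hour window is an assumption (the thread only states the lateness, not the window size):

```java
// Events arriving more than 3 days late are captured here instead of
// being silently dropped.
final OutputTag<Event> tooLate = new OutputTag<Event>("too-late") {};

SingleOutputStreamOperator<Aggregate> result = events
        .keyBy(Event::getKey)                                 // placeholder key selector
        .window(TumblingEventTimeWindows.of(Time.hours(1)))   // window size (assumed), not 3 days
        .allowedLateness(Time.days(3))                        // window state kept 3 extra days
        .sideOutputLateData(tooLate)
        .aggregate(new MyAggregateFunction());                // placeholder aggregator

// Late-beyond-lateness events, e.g. for logging or reprocessing.
DataStream<Event> droppedLate = result.getSideOutput(tooLate);
```

Note that each late firing within the 3 days re-emits an updated result for the affected window, so downstream consumers need to tolerate updates.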
> then shutdown.
Unless you shut them down yourselves, you will need to create bounded Kafka jobs. To be honest, given this info, I wouldn't choose that solution based on your use case description. I would just let the jobs run forever without shutting them down.
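For reference, if the daily approach is kept anyway: I'm not aware of a built-in "savepoint once final offsets are reached" switch, so the usual pattern is to keep the job running (e.g. with `setUnbounded(...)`) and stop it with a savepoint from the outside. A sketch using the Flink CLI — the job ID, savepoint path, and jar name are placeholders:

```shell
# Stop the running job gracefully, writing a savepoint first.
flink stop --savepointPath s3://my-bucket/savepoints <jobId>

# Next day: resume from that savepoint with the offset bounds
# moved forward to the new day.
flink run -s s3://my-bucket/savepoints/savepoint-<id> daily-job.jar
```

A fully `setBounded(...)` job instead finishes on its own as FINISHED, at which point there is no running job left to take a savepoint from, which is likely why `setUnbounded(...)` was being considered.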
s
The problem is that I will have to allocate resources forever to jobs that I am okay with running once per day.
m
It’s a trade-off discussion indeed: you'd now need to invest additional engineering resources to implement a non-typical Flink use case.
s
Understood! Thanks for the suggestions!