# troubleshooting
s
Hi All! We are planning on using `setUnbounded(OffsetsInitializer)` in the Flink Kafka connector to make sure the Flink job stops reading after a specific timestamp offset on all partitions. I have a few questions, since the documentation of Flink's behaviour here is sparse.

1. I want to process data from Kafka based on timestamp offsets and stop at a specific timestamp per partition. The job will run once per day, read the data that was ingested into Kafka for that specific day, and stop processing once it reaches the end-of-day timestamp offset on all partitions. Since we have late events coming in and we want to process them, it is imperative that we keep the state of the window aggregations for at least 3 days. Is there a way to take a savepoint at the end of the job and use it to restore processing the next day, with only the unbounded offsets initializer changed?
2. If so, is there any configuration that lets us take a savepoint once the final offsets are reached?
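For context, the daily read described above can be sketched with the `KafkaSource` builder from the DataStream Kafka connector. This is a minimal sketch, not the asker's actual job: the topic name, broker address, and deserialization are placeholders, and it uses `setBounded(...)` (job finishes at the stopping offsets) rather than `setUnbounded(...)` (job keeps running after the stopping offsets are reached):

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.ZoneOffset;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class DailyBoundedKafkaJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Read exactly one UTC day's worth of data by timestamp offsets.
        long startOfDayMillis = LocalDate.now(ZoneOffset.UTC)
                .atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
        long endOfDayMillis = startOfDayMillis + Duration.ofDays(1).toMillis();

        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("broker:9092")   // placeholder
                .setTopics("events")                  // placeholder
                .setStartingOffsets(OffsetsInitializer.timestamp(startOfDayMillis))
                // setBounded(...) lets the job terminate once every partition
                // reaches the end-of-day offsets; setUnbounded(...) with the
                // same initializer stops *reading* there but keeps the job
                // running in streaming mode, so it can still be stopped with
                // a savepoint.
                .setBounded(OffsetsInitializer.timestamp(endOfDayMillis))
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> stream = env.fromSource(
                source, WatermarkStrategy.forMonotonousTimestamps(), "kafka-daily");

        // ... windowed aggregations and sinks go here ...
        env.execute("daily-bounded-kafka-job");
    }
}
```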
m
At first I thought you should use the "Start Reading Position" and "Bounded End Position" to create a bounded Kafka job, see https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/kafka/#start-reading-position But the "window aggregation state of at least 3 days" confuses me. What's your use case for keeping window state that long while not running unbounded jobs?
s
We use the DataStream API to aggregate the data that comes from Kafka. Since some of our jobs don't need to run continuously, we want to run them once per day: each run ingests that day's data based on timestamps, processes it, and then shuts down. The data in Kafka can contain late events which we need to process when the job comes back up, and for that to happen we need to maintain state.
The window size is not 3 days; the allowed lateness is 3 days.
m
> The allowed lateness is 3 days
Have you read the late elements considerations you need to take into account? https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/windows/#late-elements-considerations
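The main caveat on that page is that windows with allowed lateness keep their state alive past the watermark, and anything later than the allowed lateness is dropped unless routed to a side output. A hedged sketch of what such a pipeline might look like — `Event`, `getKey`, and `MyAggregateFunction` are hypothetical placeholders, and the 1-hour window is an assumption (the thread only states the lateness, not the window size):

```java
// Events arriving more than 3 days late are captured here instead of
// being silently dropped.
final OutputTag<Event> tooLate = new OutputTag<Event>("too-late") {};

SingleOutputStreamOperator<Aggregate> result = events
        .keyBy(Event::getKey)                                 // placeholder key selector
        .window(TumblingEventTimeWindows.of(Time.hours(1)))   // window size (assumed), not 3 days
        .allowedLateness(Time.days(3))                        // window state kept 3 extra days
        .sideOutputLateData(tooLate)
        .aggregate(new MyAggregateFunction());                // placeholder aggregator

// Late-beyond-lateness events, e.g. for logging or reprocessing.
DataStream<Event> droppedLate = result.getSideOutput(tooLate);
```

Note that each late firing within the 3 days re-emits an updated result for the affected window, so downstream consumers need to tolerate updates.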
> then shutdown.
Unless you shut them down yourselves, you will need to create bounded Kafka jobs. To be honest, given this info, I wouldn't choose that solution based on your use case description. I would just let the jobs run forever without shutting them down.
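For reference, if the daily approach is kept anyway: I'm not aware of a built-in "savepoint once final offsets are reached" switch, so the usual pattern is to keep the job running (e.g. with `setUnbounded(...)`) and stop it with a savepoint from the outside. A sketch using the Flink CLI — the job ID, savepoint path, and jar name are placeholders:

```shell
# Stop the running job gracefully, writing a savepoint first.
flink stop --savepointPath s3://my-bucket/savepoints <jobId>

# Next day: resume from that savepoint with the offset bounds
# moved forward to the new day.
flink run -s s3://my-bucket/savepoints/savepoint-<id> daily-job.jar
```

A fully `setBounded(...)` job instead finishes on its own as FINISHED, at which point there is no running job left to take a savepoint from, which is likely why `setUnbounded(...)` was being considered.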
s
The problem is that I will have to allocate resources forever to jobs that I am okay with running once per day.
m
It’s a trade-off discussion indeed: you'd now need to invest additional engineering resources to implement a non-typical Flink use case.
s
Understood! Thanks for the suggestions!