# troubleshooting
Hi Team, I am attempting to scale my Flink application using Horizontal Pod Autoscaling (HPA). When resource usage surpasses a predefined threshold, the task manager is restarted. My job consumes records from Hudi, processes them, and produces the results to a Kafka topic. However, when the job restarts, it generates duplicate records in the sink. My question is: if the Flink job restarts between two checkpoints, will it reprocess the records that were already processed after the last checkpoint? And since savepoints are also taken at intervals, does that mean records will similarly be reprocessed when restoring from a savepoint?
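(For context on the checkpoint part of the question: yes, on failure Flink rewinds to the last completed checkpoint, or to the savepoint it restores from, and replays everything processed after it. Whether those replays become visible duplicates depends on the sink: with an at-least-once sink they do, while a transactional sink only commits writes when the enclosing checkpoint completes. A minimal sketch of the usual approach with Flink's `KafkaSink` from `flink-connector-kafka`; the bootstrap servers, topic, transactional-id prefix, and timeout value below are placeholders, not anything from this thread:)

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExactlyOnceSinkSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoints define the rollback point: on restart, Flink replays
        // every record processed after the last completed checkpoint.
        env.enableCheckpointing(60_000L); // e.g. every 60s

        KafkaSink<String> sink = KafkaSink.<String>builder()
                .setBootstrapServers("kafka:9092") // placeholder address
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("output-topic") // placeholder topic
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                // EXACTLY_ONCE uses Kafka transactions: records written between
                // two checkpoints are committed only when the checkpoint
                // completes, so replayed records are aborted, not duplicated.
                .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                // Required for EXACTLY_ONCE; should stay stable across restarts.
                .setTransactionalIdPrefix("my-job-sink") // placeholder prefix
                // Must not exceed the broker's transaction.max.timeout.ms.
                .setProperty("transaction.timeout.ms", "900000")
                .build();

        // Stand-in for the real Hudi source + processing pipeline.
        env.fromElements("a", "b", "c").sinkTo(sink);
        env.execute("exactly-once-sink-sketch");
    }
}
```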
Hi @Martijn Visser, any thoughts? Events processed between two checkpoints (or between two savepoints) are processed twice if the pod is killed and restarted. Appreciate any inputs. Thanks
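(One caveat worth noting, as an assumption about the downstream setup rather than something stated in this thread: even with a transactional sink, Kafka consumers only skip the aborted replays if they read with `isolation.level=read_committed`; the default `read_uncommitted` still surfaces them as duplicates. A small sketch with a plain Kafka consumer; servers, group id, and topic are placeholders:)

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReadCommittedConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "downstream-group");    // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Without read_committed, the consumer also sees records from
        // transactions that Flink aborted after a restart (i.e. duplicates).
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("output-topic")); // placeholder topic
            consumer.poll(Duration.ofSeconds(1)).forEach(r -> System.out.println(r.value()));
        }
    }
}
```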
I don't know if Hudi supports exactly once, so I can't answer that question
Thanks @Martijn Visser, checking with Apache Hudi - #general-hudi-question