Lauri Suurväli
10/03/2023, 7:02 AM
Kafka Source -> Process Function -> Kafka Sink. The jobs are stateless and only keep track of the Kafka offsets in order to make progress in processing the stream. Everything seems to run fine when the job has something to process (from a few thousand to tens of millions of events per day). However, if there are no messages for a few days, the TaskManager heap usage seems to increase until the job crashes. There don’t seem to be any logs that might indicate what the problem is. Once the TaskManager heap usage is too high, a SIGTERM signal is received and the shutdown is triggered. Any idea what the problem might be here? Any sort of clues or resources on memory leak issues with Flink, or with Flink running on EMR, would be really appreciated!

Lauri Suurväli
10/09/2023, 10:13 AM
SynchronizedSortedMap offsetsToCommit might be the root of the problem. When does this offsetsToCommit get emptied when there is nothing to commit? As described earlier, this problem occurs when there is no incoming data. However, our jobs checkpoint quite often (every 5 seconds), so I’m wondering whether some sort of empty object is added to offsetsToCommit during every checkpoint but never removed, because we don’t receive any messages and so there’s no reason to commit and empty this collection.
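The hypothesis can be pictured with a toy model. This is only a sketch under assumptions, not the actual KafkaSourceReader code: the class name OffsetsToCommitModel, the String partition keys, and the rule that pruning runs only when there is something to commit are all invented for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Toy model of the suspected behaviour; NOT the real KafkaSourceReader. */
final class OffsetsToCommitModel {
    // checkpoint ID -> offsets captured at that checkpoint (partition -> offset)
    private final SortedMap<Long, Map<String, Long>> offsetsToCommit =
            Collections.synchronizedSortedMap(new TreeMap<>());

    /** Runs on every checkpoint and records an entry even if nothing was consumed. */
    void snapshotState(long checkpointId, Map<String, Long> currentOffsets) {
        offsetsToCommit.put(checkpointId, new HashMap<>(currentOffsets));
    }

    /** Assumed cleanup path: old entries are pruned only after a non-empty commit. */
    void notifyCheckpointComplete(long checkpointId) {
        Map<String, Long> offsets = offsetsToCommit.get(checkpointId);
        if (offsets == null || offsets.isEmpty()) {
            return; // nothing to commit, so the pruning below is skipped
        }
        // The commit to Kafka would happen here; afterwards drop this entry
        // and every older one (headMap's upper bound is exclusive, hence + 1).
        offsetsToCommit.headMap(checkpointId + 1).clear();
    }

    int pendingEntries() {
        return offsetsToCommit.size();
    }

    public static void main(String[] args) {
        OffsetsToCommitModel reader = new OffsetsToCommitModel();
        // Idle topic: one checkpoint every 5 s for one hour = 720 checkpoints.
        for (long id = 1; id <= 720; id++) {
            reader.snapshotState(id, Collections.emptyMap());
            reader.notifyCheckpointComplete(id);
        }
        System.out.println("pending after idle hour: " + reader.pendingEntries());

        // A single record arrives, so the next checkpoint has an offset to commit.
        reader.snapshotState(721, Collections.singletonMap("topic-0", 1L));
        reader.notifyCheckpointComplete(721);
        System.out.println("pending after one record: " + reader.pendingEntries());
    }
}
```

In this model the map grows by one entry per checkpoint while the topic is idle (720 entries after an hour at a 5-second interval) and drops back to zero as soon as one checkpoint carries a real offset. Whether the real reader behaves this way would need to be confirmed against the connector source for the version in use.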
In order to test this theory, I produced 1 message to the Kafka topic that the KafkaSourceReader is consuming, and as a result the heap memory was cleared.

Lauri Suurväli
10/09/2023, 11:17 AM

Lauri Suurväli
10/09/2023, 11:20 AM
notifyCheckpointComplete, not sure why it’s not happening

Martijn Visser
10/09/2023, 11:20 AM

Lauri Suurväli
10/09/2023, 11:20 AM

Martijn Visser
10/09/2023, 11:21 AM

Lauri Suurväli
10/09/2023, 11:24 AM
org.apache.flink:flink-connector-kafka_2.12:1.14.2 and org.apache.flink:flink-connector-kafka:1.17.0

Martijn Visser
10/09/2023, 11:26 AM

Martijn Visser
10/09/2023, 11:26 AM

Lauri Suurväli
10/09/2023, 12:58 PM

Martijn Visser
10/09/2023, 1:29 PM

Lauri Suurväli
10/10/2023, 1:31 PM

Tzu-Li (Gordon) Tai
10/10/2023, 4:00 PM

Tzu-Li (Gordon) Tai
10/10/2023, 7:10 PM

Tzu-Li (Gordon) Tai
10/10/2023, 7:23 PM

Lauri Suurväli
10/11/2023, 6:58 AM