# troubleshooting
Hi there, my TaskManager is crashing intermittently and I'm struggling to pinpoint the root cause. I've checked the logs and noticed some Kafka client disconnection events just before the TaskManager crash (logs provided below). However, I'm not sure whether these are related to the crash or whether another issue is causing the problem.
2023-09-14 15:46:23,413 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner      [] - Memory usage stats: [HEAP: 6761/8768/8768 MB, NON HEAP: 175/188/744 MB (used/committed/max)]
2023-09-14 15:46:23,413 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner      [] - Direct memory stats: Count: 2202, Total Capacity: 84977075, Used Memory: 84977076
2023-09-14 15:46:23,413 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner      [] - Off-heap pool stats: [CodeHeap 'non-nmethods': 1/2/5 MB (used/committed/max)], [Metaspace: 114/121/256 MB (used/committed/max)], [CodeHeap 'profiled nmethods': 28/31/117 MB (used/committed/max)], [Compressed Class Space: 14/16/248 MB (used/committed/max)], [CodeHeap 'non-profiled nmethods': 15/16/117 MB (used/committed/max)]
2023-09-14 15:46:23,413 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner      [] - Garbage collector stats: [G1 Young Generation, GC TIME (ms): 51132, GC COUNT: 477], [G1 Old Generation, GC TIME (ms): 0, GC COUNT: 0]
2023-09-14 15:46:23,414 INFO  org.apache.kafka.clients.NetworkClient                       [] - [Producer clientId=producer-20] Node -3 disconnected.
2023-09-14 15:46:23,416 INFO  org.apache.kafka.clients.NetworkClient                       [] - [Producer clientId=producer-20] Node -2 disconnected.
2023-09-14 15:46:23,442 INFO  org.apache.kafka.clients.NetworkClient                       [] - [Consumer clientId=1-process-aggregator-6, groupId=1-process-aggregator] Node -4 disconnected.
2023-09-14 15:46:23,443 INFO  org.apache.kafka.clients.NetworkClient                       [] - [Consumer clientId=1-process-aggregator-6, groupId=1-process-aggregator] Node 2147482643 disconnected.
2023-09-14 15:46:23,464 INFO  org.apache.kafka.clients.NetworkClient                       [] - [Producer clientId=producer-16] Node -3 disconnected.
2023-09-14 15:46:23,514 INFO  org.apache.kafka.clients.NetworkClient                       [] - [Producer clientId=producer-19] Node -3 disconnected.
2023-09-14 15:46:23,581 INFO  org.apache.kafka.clients.NetworkClient                       [] - [Producer clientId=producer-19] Node -2 disconnected.
2023-09-14 15:46:23,614 INFO  org.apache.kafka.clients.NetworkClient                       [] - [Producer clientId=producer-21] Node -3 disconnected.
2023-09-14 15:46:23,615 INFO  org.apache.kafka.clients.NetworkClient                       [] - [Producer clientId=producer-21] Node -2 disconnected.
2023-09-14 15:46:26,565 INFO  org.apache.kafka.clients.NetworkClient                       [] - [Producer clientId=producer-27] Node -2 disconnected.
2023-09-14 15:46:26,663 INFO  org.apache.kafka.clients.NetworkClient                       [] - [Producer clientId=producer-23] Node -2 disconnected.
2023-09-14 15:46:27,665 INFO  org.apache.kafka.clients.NetworkClient                       [] - [Producer clientId=producer-26] Node -4 disconnected.
2023-09-14 15:46:27,668 INFO  org.apache.kafka.clients.NetworkClient                       [] - [Producer clientId=producer-26] Node -2 disconnected.
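
To make it easier to see whether the disconnects cluster around the crash, here is a minimal sketch that tallies the "Node N disconnected" events per Kafka client and reads the heap usage from the "Memory usage stats" line. It is not part of the job: the function name `summarize` and both regular expressions are assumptions, written only against the log formats shown above, and other Flink or Kafka versions may log differently.

```python
import re
from collections import Counter

# Both patterns are assumptions based only on the log lines in this post.
HEAP_RE = re.compile(r"HEAP: (\d+)/(\d+)/(\d+) MB")
DISCONNECT_RE = re.compile(
    r"\[(?:Producer|Consumer) clientId=([^,\]]+).*?\] Node (-?\d+) disconnected\."
)


def summarize(lines):
    """Return (heap used/max fraction or None, disconnect count per clientId)."""
    heap_fraction = None
    disconnects = Counter()
    for line in lines:
        heap = HEAP_RE.search(line)
        if heap:
            # The first match on the line is the HEAP triple (used/committed/max).
            used, _committed, maximum = map(int, heap.groups())
            heap_fraction = used / maximum
        event = DISCONNECT_RE.search(line)
        if event:
            disconnects[event.group(1)] += 1
    return heap_fraction, disconnects


if __name__ == "__main__":
    import sys

    # Usage: python summarize_tm_log.py < taskmanager.log (hypothetical file names)
    fraction, counts = summarize(sys.stdin)
    if fraction is not None:
        print(f"heap used/max: {fraction:.1%}")
    for client, count in counts.most_common():
        print(f"{client}: {count} disconnect(s)")
```

For the memory line above, the heap figures 6761/8768 MB work out to roughly 77% of the maximum. The tally does not prove anything by itself, but it shows whether one client or all of them drop connections in the seconds before the crash.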