Alex Nitavsky
08/08/2024, 8:43 AM
org.apache.flink.runtime.checkpoint.FinishedTaskStateProvider$PartialFinishingNotSupportedByStateException: The vertex <_VERTEX_NAME_>) (id = 09d76bde52b3fbb1988a52ca0243c5b0) has used UnionListState, but part of its tasks has called operators' finish method.
Does anybody recall hitting a similar issue?
Thanks
D. Draco O'Brien
08/08/2024, 8:49 AM
D. Draco O'Brien
08/08/2024, 8:51 AM
D. Draco O'Brien
08/08/2024, 8:52 AM
D. Draco O'Brien
08/08/2024, 8:53 AM
D. Draco O'Brien
08/08/2024, 8:57 AM
D. Draco O'Brien
08/08/2024, 8:58 AM
D. Draco O'Brien
08/08/2024, 8:59 AM
D. Draco O'Brien
08/08/2024, 9:10 AM
D. Draco O'Brien
08/08/2024, 9:12 AM
D. Draco O'Brien
08/08/2024, 9:14 AM
D. Draco O'Brien
08/08/2024, 9:15 AM
D. Draco O'Brien
08/08/2024, 9:19 AM
flink savepoint to list information about savepoint data.
D. Draco O'Brien
08/08/2024, 9:23 AM
D. Draco O'Brien
08/08/2024, 9:28 AM
Alex Nitavsky
08/13/2024, 8:01 AM
stop-with-savepoint call from the k8s operator. So in the end we will use checkpoints to perform releases.
Meanwhile, it really feels like something is strange with the stop-with-savepoint operation, since it allows an operator to finish before the savepoint is performed.
Since our specific deployment has 600 TMs, a parallelism of 2000, and 100% CPU usage on the JM, I strongly suspect a thread/process race condition. We will increase the CPU allocated to the JM and see whether the issue shows up again.
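
For reference, the condition described by the exception message at the top of the thread can be modeled roughly as below. This is a simplified, self-contained Java sketch and not Flink's actual code: the class `PartialFinishCheck`, the `Verdict` enum, and the `check` method are hypothetical names. The rule that a fully finished or fully running vertex passes is an assumption read off the wording "part of its tasks has called operators' finish method".

```java
import java.util.List;

// Simplified model, NOT Flink's implementation: all names here are hypothetical.
final class PartialFinishCheck {

    /** Outcome for a single job vertex at snapshot time. */
    enum Verdict { OK, PARTIAL_FINISH_WITH_UNION_STATE }

    /**
     * @param usesUnionListState whether any operator in the vertex uses union list state
     * @param subtaskFinished    one flag per subtask: true if its operators' finish was called
     */
    static Verdict check(boolean usesUnionListState, List<Boolean> subtaskFinished) {
        long finished = subtaskFinished.stream().filter(f -> f).count();
        // "Part of its tasks": at least one subtask finished, but not all of them.
        boolean partlyFinished = finished > 0 && finished < subtaskFinished.size();
        return (usesUnionListState && partlyFinished)
                ? Verdict.PARTIAL_FINISH_WITH_UNION_STATE
                : Verdict.OK;
    }

    public static void main(String[] args) {
        // Some subtasks finished before the snapshot while union state is in use: rejected.
        System.out.println(check(true, List.of(true, true, false)));
        // All subtasks finished, or none finished: accepted in this model.
        System.out.println(check(true, List.of(true, true, true)));
        System.out.println(check(true, List.of(false, false, false)));
    }
}
```

Under this model, a stop-with-savepoint in which only some subtasks of a vertex reach finish before the snapshot would hit the rejected branch, which is consistent with the race suspected above.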