Edgar Ferney Ruiz Anzola
07/25/2024, 1:52 PM
2024-07-25 07:29:54,586 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Initializing job 'pot' (bfadda0ae94f23951224bb498106d4cf).
2024-07-25 07:29:54,604 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Using restart back off time strategy FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=2147483647, backoffTimeMS=1000) for pot (bfadda0ae94f23951224bb498106d4cf).
2024-07-25 07:29:54,785 INFO org.apache.flink.runtime.checkpoint.DefaultCompletedCheckpointStoreUtils [] - Recovering checkpoints from KubernetesStateHandleStore{configMapName='pot-flink-deployment-bfadda0ae94f23951224bb498106d4cf-config-map'}.
2024-07-25 07:29:54,793 INFO org.apache.flink.runtime.checkpoint.DefaultCompletedCheckpointStoreUtils [] - Found 0 checkpoints in KubernetesStateHandleStore{configMapName='pot-flink-deployment-bfadda0ae94f23951224bb498106d4cf-config-map'}.
2024-07-25 07:29:54,793 INFO org.apache.flink.runtime.checkpoint.DefaultCompletedCheckpointStoreUtils [] - Trying to fetch 0 checkpoints from storage.
2024-07-25 07:29:54,892 INFO org.apache.flink.kubernetes.kubeclient.resources.KubernetesConfigMapSharedInformer [] - Starting to watch for pot/
The odd part is that it sometimes works: a few times it finds the checkpoint and restores correctly.
If I check the ConfigMap pot-flink-deployment-bfadda0ae94f23951224bb498106d4cf-config-map,
the checkpoint information there is correct, and the checkpoint folder is stored correctly in the GCP bucket.
I hope someone can help me solve this problem.
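To compare the restarts that restore correctly with the ones that do not, it may help to pull the "Found N checkpoints" line out of each JobManager log. The following is only a minimal sketch: the helper name `found_checkpoints` and the regular expression are assumptions based on the log format pasted above, not part of Flink.

```python
import re

# Assumed pattern, derived from the "Found N checkpoints in
# KubernetesStateHandleStore{configMapName='...'}" log line shown above.
FOUND_RE = re.compile(
    r"Found (\d+) checkpoints in "
    r"KubernetesStateHandleStore\{configMapName='([^']+)'\}"
)


def found_checkpoints(log_text):
    """Return a list of (count, config_map_name) for each matching log line."""
    return [(int(m.group(1)), m.group(2)) for m in FOUND_RE.finditer(log_text)]


# Sample input copied from the log excerpt in this message.
sample = (
    "2024-07-25 07:29:54,793 INFO org.apache.flink.runtime.checkpoint."
    "DefaultCompletedCheckpointStoreUtils [] - Found 0 checkpoints in "
    "KubernetesStateHandleStore{configMapName="
    "'pot-flink-deployment-bfadda0ae94f23951224bb498106d4cf-config-map'}."
)

for count, name in found_checkpoints(sample):
    print(f"{name}: {count} checkpoint(s) found at startup")
```

Running this over the logs of a good restart and a bad one would show whether the JobManager reads the same ConfigMap name in both cases and only the count differs.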