Antal Kajzer
10/02/2023, 11:05 AMCaused by: java.lang.IllegalStateException: Failed to rollback to checkpoint/savepoint Checkpoint Metadata. Max parallelism mismatch between checkpoint/savepoint state and new program. Cannot map operator ab78513f3423aaef9b9d527c6a07d3a5 with max parallelism 128 to new program with max parallelism 1. This indicates that the program has been changed in a non-compatible way after the checkpoint/savepoint.
Thats why I need to use state-processor-api, and try to set maxParallelism to 1 this way.
StateBootstrapTransformation transformation = OperatorTransformation
.bootstrapWith(bootstrapWith)
.setMaxParallelism(1)
.transform(transform);
This is failing as: OperatorSubtaskStateReducer constructor has a precondition there.
Preconditions.checkState(maxParallelism > 1);
Can you please help me understand why do we have this precheck here, and advice how I could migrate our maxParallelism = 128 checkpoints to maxParallelism = 1?Martijn Visser
10/02/2023, 11:08 AMAntal Kajzer
10/02/2023, 11:10 AMMartijn Visser
10/02/2023, 11:12 AMAntal Kajzer
10/02/2023, 11:14 AMMartijn Visser
10/02/2023, 11:15 AMAntal Kajzer
10/02/2023, 11:16 AMAntal Kajzer
10/02/2023, 11:19 AMorg.apache.flink.runtime.checkpoint.Checkpoints.loadAndValidateCheckpoint
Antal Kajzer
10/02/2023, 11:20 AMCaused by: java.lang.IllegalStateException: Failed to rollback to checkpoint/savepoint Checkpoint Metadata. Max parallelism mismatch between checkpoint/savepoint state and new program. Cannot map operator ab78513f3423aaef9b9d527c6a07d3a5 with max parallelism 128 to new program with max parallelism 1. This indicates that the program has been changed in a non-compatible way after the checkpoint/savepoint.
at org.apache.flink.runtime.checkpoint.Checkpoints.loadAndValidateCheckpoint(Checkpoints.java:190)
at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1798)
at org.apache.flink.runtime.scheduler.DefaultExecutionGraphFactory.tryRestoreExecutionGraphFromSavepoint(DefaultExecutionGraphFactory.java:214)
at org.apache.flink.runtime.scheduler.DefaultExecutionGraphFactory.createAndRestoreExecutionGraph(DefaultExecutionGraphFactory.java:189)
at org.apache.flink.runtime.scheduler.SchedulerBase.createAndRestoreExecutionGraph(SchedulerBase.java:361)
at org.apache.flink.runtime.scheduler.SchedulerBase.<init>(SchedulerBase.java:206)
at org.apache.flink.runtime.scheduler.DefaultScheduler.<init>(DefaultScheduler.java:134)
at org.apache.flink.runtime.scheduler.DefaultSchedulerFactory.createInstance(DefaultSchedulerFactory.java:152)
at org.apache.flink.runtime.jobmaster.DefaultSlotPoolServiceSchedulerFactory.createScheduler(DefaultSlotPoolServiceSchedulerFactory.java:119)
at org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:369)
at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:346)
at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.internalCreateJobMasterService(DefaultJobMasterServiceFactory.java:123)
at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.lambda$createJobMasterService$0(DefaultJobMasterServiceFactory.java:95)
at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedSupplier$4(FunctionUtils.java:112)
... 4 more
Martijn Visser
10/02/2023, 11:59 AMMartijn Visser
10/02/2023, 11:59 AMMartijn Visser
10/02/2023, 12:00 PMAntal Kajzer
10/02/2023, 12:27 PMAntal Kajzer
10/03/2023, 11:08 AMMartijn Visser
10/03/2023, 12:16 PMAntal Kajzer
10/03/2023, 12:20 PMAntal Kajzer
10/03/2023, 12:28 PMAntal Kajzer
10/03/2023, 2:22 PM