# troubleshooting
Hello 👋 We are deploying Flink on Kubernetes using the flink-kubernetes-operator with a FlinkDeployment. We're trying to enable local recovery across process restarts (for our taskmanagers, with a `volumeMounts` on /tmp), which seems to require a deterministic resource ID according to the documentation. Unfortunately, when we try to define a fixed `taskmanager.resource-id` in `flinkConfiguration`, it seems to be overwritten:
Starting kubernetes-taskmanager as a console application on host my-topic-name-taskmanager-1-6.
[...]
Starting Kubernetes TaskExecutor runner
[...]
Program Arguments:
--configDir
 /opt/flink/conf
-Dtaskmanager.resource-id=my-topic-name-taskmanager-1-6
-Djobmanager.memory.jvm-overhead.max=805306368b
-Djobmanager.memory.jvm-overhead.min=805306368b
[...]
Loading configuration property: process.working-dir, /tmp/workdir
Loading configuration property: taskmanager.resource-id, my_custom_resource_id
[...]
Loading dynamic configuration property: taskmanager.resource-id, my-topic-name-taskmanager-1-6
In our `flinkConfiguration`:
flinkConfiguration:
    state.backend.type: "rocksdb"
    state.backend.incremental: "true"
    state.backend.local-recovery: "true"
    process.working-dir: "/tmp/workdir"
    taskmanager.resource-id: "my_custom_resource_id"
   [...]
`process.working-dir: "/tmp/workdir"` is interpreted correctly, for instance, since we do see the `workdir` folder on the volume linked to our taskmanager. But inside it, the folder name is not deterministic, e.g. `tm_my-topic-name-taskmanager-1-6`.

Is it expected that our `taskmanager.resource-id` property is overwritten like this? Is there a way to enable local recovery across process restarts with a volume on K8s using the flink-kubernetes-operator? Thanks!
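
To make our reading of the log above concrete: the value from `flinkConfiguration` appears to be loaded first, and the `-Dtaskmanager.resource-id=...` dynamic property is applied afterwards, so the later value wins. The Python sketch below only models that assumed "last value wins" order and the `tm_<resource-id>` folder name we observe. It is not Flink's actual loader, and `effective_config` and `working_subdir` are hypothetical helper names.

```python
# Illustrative model only: mirrors the order seen in the log, not Flink's real code.

def effective_config(file_props, dynamic_props):
    """Merge file-based properties with -D dynamic properties; dynamic ones win."""
    merged = dict(file_props)      # values loaded from the configuration file
    merged.update(dynamic_props)   # dynamic properties applied last, overriding
    return merged


def working_subdir(resource_id):
    """Folder name we observe under process.working-dir: 'tm_' + resource id."""
    return f"tm_{resource_id}"


# Values taken from our flinkConfiguration and from the program arguments in the log.
file_props = {
    "process.working-dir": "/tmp/workdir",
    "taskmanager.resource-id": "my_custom_resource_id",
}
dynamic_props = {"taskmanager.resource-id": "my-topic-name-taskmanager-1-6"}

config = effective_config(file_props, dynamic_props)
print(config["taskmanager.resource-id"])
print(working_subdir(config["taskmanager.resource-id"]))
```

If this model is right, any value we put in `flinkConfiguration` would always lose to the pod-name-based value passed as a dynamic property, which matches the folder name we see on the volume.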