# troubleshooting
Hello, I was wondering if anyone has tried to deploy Flink using the Flink k8s operator on a machine where OKD [1] is installed? We installed Flink k8s operator version 1.6, which seems to have succeeded; however, when we try to deploy a simple FlinkDeployment we get this error:
2023-09-19 10:11:36,440 i.j.o.p.e.ReconciliationDispatcher [ERROR][flink/test] Error during event processing ExecutionScope{ resource id: ResourceID{name='test', namespace='flink'}, version: 684949788} failed.
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PUT at: https://172.30.0.1:443/apis/flink.apache.org/v1beta1/namespaces/flink/flinkdeployments/test. Message: FlinkDeployment.flink.apache.org "test" is invalid: [spec.ingress: Invalid value: "null": spec.ingress in body must be of type object: "null", spec.mode: Invalid value: "null": spec.mode in body must be of type string: "null", spec.mode: Unsupported value: "null": supported values: "native", "standalone", spec.logConfiguration: Invalid value: "null": spec.logConfiguration in body must be of type object: "null", spec.imagePullPolicy: Invalid value: "null": spec.imagePullPolicy in body must be of type string: "null", spec.jobManager.podTemplate: Invalid value: "null": spec.jobManager.podTemplate in body must be of type object: "null", spec.jobManager.resource.ephemeralStorage: Invalid value: "null": spec.jobManager.resource.ephemeralStorage in body must be of type string: "null", spec.podTemplate: Invalid value: "null": spec.podTemplate in body must be of type object: "null", spec.restartNonce: Invalid value: "null": spec.restartNonce in body must be of type integer: "null", spec.taskManager.replicas: Invalid value: "null": spec.taskManager.replicas in body must be of type integer: "null", spec.taskManager.resource.ephemeralStorage: Invalid value: "null": spec.taskManager.resource.ephemeralStorage in body must be of type string: "null", spec.taskManager.podTemplate: Invalid value: "null": spec.taskManager.podTemplate in body must be of type object: "null", spec.job: Invalid value: "null": spec.job in body must be of type object: "null", .spec.taskManager.replicas: Invalid value: 0: .spec.taskManager.replicas accessor error: <nil> is of the type <nil>, expected int64]. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.ingress, message=Invalid value: "null": spec.ingress in body must be of type object: "null", reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.mode, message=Invalid value: "null": spec.mode in body must be of type string: "null", reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.mode, message=Unsupported value: "null": supported values: "native", "standalone", reason=FieldValueNotSupported, additionalProperties={}), StatusCause(field=spec.logConfiguration, message=Invalid value: "null": spec.logConfiguration in body must be of type object: "null", reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.imagePullPolicy, message=Invalid value: "null": spec.imagePullPolicy in body must be of type string: "null", reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.jobManager.podTemplate, message=Invalid value: "null": spec.jobManager.podTemplate in body must be of type object: "null", reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.jobManager.resource.ephemeralStorage, message=Invalid value: "null": spec.jobManager.resource.ephemeralStorage in body must be of type string: "null", reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.podTemplate, message=Invalid value: "null": spec.podTemplate in body must be of type object: "null", reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.restartNonce, message=Invalid value: "null": spec.restartNonce in body must be of type integer: "null", reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.taskManager.replicas, message=Invalid value: "null": spec.taskManager.replicas in body must be of type integer: "null", reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.taskManager.resource.ephemeralStorage, message=Invalid value: "null": spec.taskManager.resource.ephemeralStorage in body must be of type string: "null", reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.taskManager.podTemplate, message=Invalid value: "null": spec.taskManager.podTemplate in body must be of type object: "null", reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.job, message=Invalid value: "null": spec.job in body must be of type object: "null", reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=.spec.taskManager.replicas, message=Invalid value: 0: .spec.taskManager.replicas accessor error: <nil> is of the type <nil>, expected int64, reason=FieldValueInvalid, additionalProperties={})], group=flink.apache.org, kind=FlinkDeployment, name=test, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=FlinkDeployment.flink.apache.org "test" is invalid: [spec.ingress: Invalid value: "null": spec.ingress in body must be of type object: "null", spec.mode: Invalid value: "null": spec.mode in body must be of type string: "null", spec.mode: Unsupported value: "null": supported values: "native", "standalone", spec.logConfiguration: Invalid value: "null": spec.logConfiguration in body must be of type object: "null", spec.imagePullPolicy: Invalid value: "null": spec.imagePullPolicy in body must be of type string: "null", spec.jobManager.podTemplate: Invalid value: "null": spec.jobManager.podTemplate in body must be of type object: "null", spec.jobManager.resource.ephemeralStorage: Invalid value: "null": spec.jobManager.resource.ephemeralStorage in body must be of type string: "null", spec.podTemplate: Invalid value: "null": spec.podTemplate in body must be of type object: "null", spec.restartNonce: Invalid value: "null": spec.restartNonce in body must be of type integer: "null", spec.taskManager.replicas: Invalid value: "null": spec.taskManager.replicas in body must be of type integer: "null", spec.taskManager.resource.ephemeralStorage: Invalid value: "null": spec.taskManager.resource.ephemeralStorage in body must be of type string: "null", spec.taskManager.podTemplate: Invalid value: "null": spec.taskManager.podTemplate in body must be of type object: "null", spec.job: Invalid value: "null": spec.job in body must be of type object: "null", .spec.taskManager.replicas: Invalid value: 0: .spec.taskManager.replicas accessor error: <nil> is of the type <nil>, expected int64], metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
  at io.fabric8.kubernetes.client.KubernetesClientException.copyAsCause(KubernetesClientException.java:238)
  at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:518)
  at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:535)
  at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleUpdate(OperationSupport.java:358)
  at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleUpdate(BaseOperation.java:708)
  at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.lambda$handleReplace$0(HasMetadataOperation.java:185)
  at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.handleReplace(HasMetadataOperation.java:190)
  at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.replace(HasMetadataOperation.java:101)
  at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.replace(HasMetadataOperation.java:45)
  at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher$CustomResourceFacade.updateResource(ReconciliationDispatcher.java:387)
  at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.conflictRetryingUpdate(ReconciliationDispatcher.java:343)
  at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.updateCustomResourceWithFinalizer(ReconciliationDispatcher.java:316)
  at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:115)
  at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:89)
  at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:62)
  at io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:414)
  at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
  at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
  at java.base/java.lang.Thread.run(Unknown Source)
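Since the 422 response repeats the same causes several times, here is a rough sketch of how the distinct field paths could be pulled out of the message to make it easier to read. The helper name `invalid_fields` and the shortened `sample` string are only illustrative (the sample is a hand-trimmed excerpt of the message above, not the full text), and the regex assumes every cause has the `field: Invalid value` or `field: Unsupported value` shape seen in this log.

```python
import re


def invalid_fields(message: str) -> list[str]:
    """Return the distinct field paths reported as invalid, in first-seen order."""
    # Each cause looks like:  spec.mode: Invalid value: "null": ...
    # One entry in the log carries a leading dot (.spec.taskManager.replicas),
    # so the dot is matched optionally and left out of the captured path.
    pattern = re.compile(r'\.?(spec(?:\.\w+)+): (?:Invalid|Unsupported) value')
    seen: list[str] = []
    for field in pattern.findall(message):
        if field not in seen:
            seen.append(field)
    return seen


# Hand-trimmed excerpt of the message from the log, for illustration only.
sample = (
    'spec.ingress: Invalid value: "null": spec.ingress in body must be of type object: "null", '
    'spec.mode: Invalid value: "null": spec.mode in body must be of type string: "null", '
    'spec.mode: Unsupported value: "null": supported values: "native", "standalone", '
    '.spec.taskManager.replicas: Invalid value: 0: .spec.taskManager.replicas accessor error'
)

print(invalid_fields(sample))
```

Running the same function over the full message should list every field that the API server rejected as null.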
Our definition:
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  namespace: flink
  name: test
spec:
  mode: native
  image: flink:1.17
  flinkVersion: v1_17
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
  serviceAccount: flink
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
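As a cross-check, a minimal sketch like the one below compares some of the field paths from the error against what our definition actually sets. The `manifest` dict is the spec above transcribed by hand, `reported` is only a subset of the paths from the 422 response, and `is_set` is a hypothetical helper; nothing here talks to the cluster.

```python
# Spec from our definition, transcribed by hand into a Python dict.
manifest = {
    "spec": {
        "mode": "native",
        "image": "flink:1.17",
        "flinkVersion": "v1_17",
        "flinkConfiguration": {"taskmanager.numberOfTaskSlots": "2"},
        "serviceAccount": "flink",
        "jobManager": {"resource": {"memory": "2048m", "cpu": 1}},
        "taskManager": {"resource": {"memory": "2048m", "cpu": 1}},
    }
}

# A few of the field paths named in the 422 response (not the complete list).
reported = ["spec.ingress", "spec.mode", "spec.job", "spec.taskManager.replicas"]


def is_set(doc: dict, path: str) -> bool:
    """True if every key along the dotted path exists and the final value is not None.

    Splitting on "." is fine for the paths above, but it would not work for
    keys that themselves contain dots, such as the flinkConfiguration entries.
    """
    node = doc
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return node is not None


for path in reported:
    state = "set" if is_set(manifest, path) else "not set"
    print(f"{path}: {state} in our definition")
```

Of the paths checked here, only `spec.mode` is set in our definition; the others are fields we never specified.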
Service accounts, role, and role binding:
(2.9.12) [xxxxxx ~]$ oc get sa
NAME             SECRETS   AGE
builder          2         6d22h
default          2         6d22h
deployer         2         6d22h
flink            2         6d19h
flink-operator   2         17h  

(2.9.12) [xxxxxx~]$ oc get role
NAME    CREATED AT
flink   2023-09-13T11:53:42Z

(2.9.12) [xxxxxx~]$ oc describe role flink
Name:         flink
Labels:       app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=flink-kubernetes-operator
              app.kubernetes.io/version=1.0.1
              helm.sh/chart=flink-kubernetes-operator-1.6.0
Annotations:  helm.sh/resource-policy: keep
              meta.helm.sh/release-name: flink-kubernetes-operator
              meta.helm.sh/release-namespace: flink
PolicyRule:
  Resources         Non-Resource URLs  Resource Names  Verbs
  ---------         -----------------  --------------  -----
  configmaps        []                 []              [*]
  pods              []                 []              [*]
  deployments.apps  []                 []              [*]

(-2.9.12) [xxxxx ~]$ oc get rolebinding
NAME                              ROLE                                          AGE
flink-role-binding                Role/flink                                    6d19h
We are using OKD 4.6.0.
CRDs:
flinkdeployments.flink.apache.org                           2023-09-13T11:53:05Z
flinksessionjobs.flink.apache.org                           2023-09-13T11:53:05Z