# troubleshooting
I have a Flink cluster (1 master, 3 workers) running on my laptop, and job submission works. I tried the same cluster on a remote VM (internal dev cloud), and it works there too.

Next, I tried to deploy the cluster across separate VMs:

- 1 master VM running the JobManager
- 3 worker VMs running the TaskManagers

The cluster comes up fine, and I can see the Java processes running on each VM. However, job submission fails with `CompletionException` and `ConnectionException: Connection refused`.

Checking with `netstat` and `lsof`, the JobManager ports on the master VM are listening ONLY on tcp6, NOT on tcp (IPv4). Even a simple `ncat` port check against the master's IPv4 address from one of the workers does not get through, and neither does an `ncat` check from the master to itself on that port. If I run `ncat` as the listener on those same ports (no Flink), the port checks from the workers do get through. So for some reason Flink is not reachable when it listens with the IPv6 protocol only.

1. Is there a configuration option to fix this, so that the JobManager and TaskManagers listen on both IPv6 and IPv4 for their ports?
2. Is there documentation on how the TaskManager host, bind-host, and port settings should be configured in a multi-VM cluster?
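For anyone who wants to reproduce the reachability check without `ncat`, here is a minimal Python sketch that tries a TCP connect to a host and port over IPv4 and IPv6 separately. The function names `can_connect` and `check_ports` are made up for illustration, and the sketch only tests connectivity; it does not change or diagnose Flink's bind settings.

```python
import socket


def can_connect(host, port, family, timeout=2.0):
    """Return True if a TCP connect to (host, port) succeeds over `family`.

    `family` is socket.AF_INET (IPv4) or socket.AF_INET6 (IPv6).
    """
    try:
        # Resolve the host for the requested address family only.
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        # The host has no address in this family (e.g. an IPv4 literal
        # checked with AF_INET6).
        return False

    for fam, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(fam, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            # Connection refused, timed out, unreachable, ...: try the
            # next resolved address, if any.
            continue
    return False


def check_ports(host, ports, timeout=2.0):
    """Map each port to its IPv4 and IPv6 connect results."""
    return {
        port: {
            "ipv4": can_connect(host, port, socket.AF_INET, timeout),
            "ipv6": can_connect(host, port, socket.AF_INET6, timeout),
        }
        for port in ports
    }
```

Run from a worker VM, a call such as `check_ports("192.0.2.10", [6123, 8081])` would show which address family reaches the master. The address here is a placeholder, and the ports are an assumption: 6123 and 8081 are Flink's default RPC and REST ports, but the post does not say which ports are actually configured.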