# troubleshooting
c
Hi All, I am running a Flink job in standalone session mode on a Kubernetes cluster and trying to submit the job with the command below:
`./bin/flink run -m flink-jobmanager:6123 -c JobClassName /opt/flink/usrlib/myjob-0.0.1.jar`
However, the job submission fails, and the JobManager pod logs show this error:
`Remote connection to [/xxx.xx.x.x:34120] failed with org.jboss.netty.handler.codec.frame.TooLongFrameException: Adjusted frame length exceeds 1073741824: 1347375960 - discarded`
Any idea how we can fix this issue, or is there an alternate way to submit a job to a session cluster?
d
Increase the `akka.framesize` setting in your Flink config. If you have direct access to `flink-conf.yaml`, modify it there (keeping memory constraints in mind):
```yaml
# flink-conf.yaml uses flat keys, not nested blocks
akka.framesize: 209715200b # sets the frame size to 200 MiB; adjust to your needs
```
Or pass it as a command-line argument when starting the JobManager:
```shell
./bin/jobmanager.sh start -Dakka.framesize=209715200b
```
Or set it as an environment variable in your Kubernetes deployment:
```yaml
env:
  - name: FLINK_ENV_JAVA_OPTS
    value: "-Dakka.framesize=209715200b"
```
Just be careful: increasing the framesize also consumes more memory and can affect network efficiency.
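As a quick sanity check on the numbers used above (a minimal sketch; the 200 MiB figure comes from the config example, the helper name is mine):

```python
def mib_to_bytes(mib: int) -> int:
    """Convert MiB to the raw byte count used in akka.framesize values."""
    return mib * 1024 * 1024

# 209715200b from the config example is exactly 200 MiB
print(f"akka.framesize: {mib_to_bytes(200)}b")  # -> akka.framesize: 209715200b
```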
If the framesize you would need is simply too large to accommodate reasonably, then look instead at reducing the size that has to be sent:
Optimize jar packaging: split out large dependencies and load them dynamically instead of shipping them with the job submission.
Remove jar files that are unused or not needed.
You might use Flink's user-code classloading to load resources at runtime instead of bundling everything into the submitted jar.
It could also be that your savepoint strategy is leading to bulkier deployments; consider adjusting it, for example by externalizing savepoints to high-throughput, low-latency storage.
These are all steps to reduce the need for a larger framesize, but if you just want to get rid of the error with your current configuration, make the adjustments above to the framesize setting using whichever mechanism is suitable for your setup.
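To decide whether a bump is needed at all, a rough pre-flight check on the jar can help. A sketch under assumptions: the ~10% headroom is a heuristic of mine, not a documented Flink rule, and the default shown (10 MiB) is the stock `akka.framesize` value.

```python
import os

def fits_in_framesize(jar_path: str, framesize_bytes: int = 10485760) -> bool:
    """Rough check: does the job jar fit under the configured akka.framesize?

    Leaves ~10% headroom for RPC message overhead -- a heuristic,
    not a documented Flink rule. The default framesize is 10 MiB.
    """
    return os.path.getsize(jar_path) <= framesize_bytes * 0.9
```

If the jar is far over the limit, the packaging steps above will pay off more than repeatedly raising the framesize.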
c
Thank you very much @D. Draco O'Brien for your reply.
As of now, as a workaround, I am using the REST service endpoint to submit the job, and it worked: `./bin/flink run -m flink-jobmanager-rest:8081 -c JobClassName /opt/flink/usrlib/myjob-0.0.1.jar`. Will there be any downside with this approach?
Another question: is there a way, using the REST API, to stop a running job and restart the same instance again? I could not find such an API in the documentation: https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-stop
Also, is savepointing mandatory to stop/cancel a Flink job running in a session cluster?
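For reference, there is no single "restart" endpoint in the REST API; the usual sequence is the documented `POST /jobs/:jobid/stop` (stop with savepoint, asynchronous) followed by `POST /jars/:jarid/run` with a `savepointPath`, which starts a new job with a new job id. A savepoint is not mandatory just to terminate a job: `PATCH /jobs/:jobid` cancels without one. A minimal sketch that only builds the two requests (the base URL, job id, and jar id are placeholders; no network call is made here):

```python
import json

BASE = "http://flink-jobmanager-rest:8081"  # placeholder service address

def stop_with_savepoint(job_id: str, target_dir: str):
    """Build the POST /jobs/:jobid/stop request, which triggers an
    asynchronous stop-with-savepoint and returns a trigger id to poll."""
    url = f"{BASE}/jobs/{job_id}/stop"
    body = json.dumps({"targetDirectory": target_dir, "drain": False})
    return url, body

def run_from_savepoint(jar_id: str, entry_class: str, savepoint_path: str):
    """Build the POST /jars/:jarid/run request, which starts a new job
    (with a new job id) from an uploaded jar, resuming from a savepoint."""
    url = f"{BASE}/jars/{jar_id}/run"
    body = json.dumps({"entryClass": entry_class, "savepointPath": savepoint_path})
    return url, body
```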