# troubleshooting
c
Hi All, I am running a Flink job in standalone session mode on a Kubernetes cluster and trying to submit the job with the command below:
`./bin/flink run -m flink-jobmanager:6123 -c JobClassName /opt/flink/usrlib/myjob-0.0.1.jar`
However, the job submission fails, and the JobManager pod logs show this error:
`Remote connection to [/xxx.xx.x.x:34120] failed with org.jboss.netty.handler.codec.frame.TooLongFrameException: Adjusted frame length exceeds 1073741824: 1347375960 - discarded`
Any idea how we can fix this issue, or is there an alternate way to submit a job to a session cluster?
d
Increase the `akka.framesize` setting in your Flink config. If you have direct access to `flink-conf.yaml`, modify it there (keeping memory constraints in mind):
```yaml
# flink-conf.yaml uses flat keys, not nested blocks
akka.framesize: 209715200b # sets the frame size to 200 MiB; adjust to your needs
```
Or pass it as a command-line argument when starting the JobManager:
```shell
./bin/jobmanager.sh start -Dakka.framesize=209715200b
```
Or set it as an environment variable in your Kubernetes deployment:
```yaml
env:
  - name: FLINK_ENV_JAVA_OPTS
    value: "-Dakka.framesize=209715200b"
```
Just be careful: increasing the framesize also consumes more memory and can affect network efficiency.
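As a quick sanity check on the numbers used above (a minimal sketch; the 200 MiB figure comes from the config example, the helper name is mine):

```python
def mib_to_bytes(mib: int) -> int:
    """Convert MiB to the raw byte count used in akka.framesize values."""
    return mib * 1024 * 1024

# 209715200b from the config example is exactly 200 MiB
print(f"akka.framesize: {mib_to_bytes(200)}b")  # -> akka.framesize: 209715200b
```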
If the framesize you would need is simply too large to accommodate reasonably, then look instead at reducing the size that has to be sent:
Optimize jar packaging: split out large dependencies and load them dynamically instead of shipping them with the job submission.
Remove jar files that are unused or not needed.
You might use Flink's user-code classloading to load resources at runtime instead of bundling everything into the submitted jar.
It could also be that your savepoint strategy is leading to bulkier deployments; consider adjusting it, for example by externalizing savepoints to high-throughput, low-latency storage.
These are all steps to reduce the need for a larger framesize, but if you just want to get rid of the error with your current configuration, make the adjustments above to the framesize setting using whichever mechanism is suitable for your setup.
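To decide whether a bump is needed at all, a rough pre-flight check on the jar can help. A sketch under assumptions: the ~10% headroom is a heuristic of mine, not a documented Flink rule, and the default shown (10 MiB) is the stock `akka.framesize` value.

```python
import os

def fits_in_framesize(jar_path: str, framesize_bytes: int = 10485760) -> bool:
    """Rough check: does the job jar fit under the configured akka.framesize?

    Leaves ~10% headroom for RPC message overhead -- a heuristic,
    not a documented Flink rule. The default framesize is 10 MiB.
    """
    return os.path.getsize(jar_path) <= framesize_bytes * 0.9
```

If the jar is far over the limit, the packaging steps above will pay off more than repeatedly raising the framesize.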
c
Thank you very much @D. Draco O'Brien for your reply.
As of now, as a workaround, I am using the REST service endpoint to submit the job, and it worked: `./bin/flink run -m flink-jobmanager-rest:8081 -c JobClassName /opt/flink/usrlib/myjob-0.0.1.jar`. Will there be any downside with this approach?
Another question: is there a way, using the REST API, to stop a running job and restart the same instance again? I could not find such an API in the documentation: https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-stop
Also, is savepointing mandatory to stop/cancel a Flink job running in a session cluster?
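For reference, there is no single "restart" endpoint in the REST API; the usual sequence is the documented `POST /jobs/:jobid/stop` (stop with savepoint, asynchronous) followed by `POST /jars/:jarid/run` with a `savepointPath`, which starts a new job with a new job id. A savepoint is not mandatory just to terminate a job: `PATCH /jobs/:jobid` cancels without one. A minimal sketch that only builds the two requests (the base URL, job id, and jar id are placeholders; no network call is made here):

```python
import json

BASE = "http://flink-jobmanager-rest:8081"  # placeholder service address

def stop_with_savepoint(job_id: str, target_dir: str):
    """Build the POST /jobs/:jobid/stop request, which triggers an
    asynchronous stop-with-savepoint and returns a trigger id to poll."""
    url = f"{BASE}/jobs/{job_id}/stop"
    body = json.dumps({"targetDirectory": target_dir, "drain": False})
    return url, body

def run_from_savepoint(jar_id: str, entry_class: str, savepoint_path: str):
    """Build the POST /jars/:jarid/run request, which starts a new job
    (with a new job id) from an uploaded jar, resuming from a savepoint."""
    url = f"{BASE}/jars/{jar_id}/run"
    body = json.dumps({"entryClass": entry_class, "savepointPath": savepoint_path})
    return url, body
```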