# troubleshooting
Mahesh Gupta:
Hi, is it possible to scale a source like Kafka using the configs below? I am able to scale up the process functions like map and bucket_writer, but I am not able to scale the source. This is Flink 1.18; below is the autoscaler config.
```yaml
kubernetes.operator.job.autoscaler.enabled: "true"
kubernetes.operator.job.autoscaler.stabilization.interval: 1m
kubernetes.operator.job.autoscaler.metrics.window: 3m
kubernetes.operator.job.autoscaler.target.utilization: "0.3"
kubernetes.operator.job.autoscaler.target.utilization.boundary: "0.1"
kubernetes.operator.job.autoscaler.restart.time: 2m
kubernetes.operator.job.autoscaler.catch-up.duration: 5m
pipeline.max-parallelism: "20"
jobmanager.scheduler: adaptive
kubernetes.operator.resource.metrics.enabled: "true"
kubernetes.operator.resource.lifecycle.metrics.enabled: "true"
kubernetes.operator.job.autoscaler.scaling.enabled: "true"
```
The process functions do scale up to the `pipeline.max-parallelism` value, but the source does not. https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.6/docs/custom-resource/autoscaler/
Sachin Sharma:
A source like Kafka can only be scaled up to the number of partitions in the topic. If the source is already at that maximum, it cannot be scaled further.
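The partition cap described above can be illustrated with a small sketch. The helper and the numbers below are purely hypothetical for illustration, not part of any Flink API:

```java
// Sketch: why a Kafka source stops scaling at the partition count.
// Each Kafka partition is assigned to at most one source subtask, so any
// parallelism above the partition count just leaves subtasks idle.
public class SourceParallelismCap {
    // hypothetical helper for illustration; not a Flink API
    static int effectiveParallelism(int desiredParallelism, int partitionCount) {
        return Math.min(desiredParallelism, partitionCount);
    }

    public static void main(String[] args) {
        // pipeline.max-parallelism is 20, but the topic has only 8 partitions:
        // only 8 subtasks can actually read
        System.out.println(effectiveParallelism(20, 8)); // prints 8
        // current parallelism 4 with 8 partitions: scaling up still has headroom
        System.out.println(effectiveParallelism(4, 8));  // prints 4
    }
}
```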
Mahesh Gupta:
Hi @Sachin Sharma, understood. However, if I start the Kafka source with a parallelism lower than the number of partitions, then it should scale up to the number of partitions, right?
As per the documentation - "For most sources like Kafka there is an upper bound on the parallelism based on the number of partitions. If backlog information is not available, users can set desired parallelism or target rate. We also aim at choosing a parallelism which yields an equal distribution of the splits among the subtasks."
Sachin Sharma:
@Mahesh Gupta If you are using the old Kafka source (the legacy `FlinkKafkaConsumer`), it won't be scaled; the autoscaler relies on source metrics that the newer `KafkaSource` connector reports.
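For reference, a minimal sketch of a job built on the newer `KafkaSource` API, which exposes the backlog (`pendingRecords`) metric the autoscaler uses to size the source vertex. The broker address, topic, and group id below are placeholders:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KafkaSourceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // The new KafkaSource reports the backlog metrics the autoscaler needs;
        // the legacy FlinkKafkaConsumer does not.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")   // placeholder broker address
                .setTopics("input-topic")            // placeholder topic
                .setGroupId("my-consumer-group")     // placeholder group id
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> stream =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source");

        stream.print();
        env.execute("kafka-source-sketch");
    }
}
```

This requires the `flink-connector-kafka` dependency on the classpath; the source name ("kafka-source") is what shows up as the vertex the autoscaler can then resize.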