Hello community I m making an internal modification to the D DataHub #all-things-deployment

important-pager-98358

07/28/2023, 1:14 PM

Hello community! I'm making an internal modification to the DataHub chart to enable HPA for the standalone consumers, GMS, and Frontend. I saw that GMS can use two caching technologies, but one of them is only used when there is more than one replica. Can I use Hazelcast even though I only have a single GMS replica? Are there any other problems that HPA could cause by autoscaling these components? If everything goes well, I can contribute this to the community later.

brainy-tent-14503

07/28/2023, 6:22 PM

Nice! Hazelcast can be used with one replica. We just didn't want to cause confusion or have to deal with the docker-compose settings for quickstart.

brainy-tent-14503

07/28/2023, 6:23 PM

Note that the parallelization limit for the consumers is equal to the number of partitions in the Kafka topics. So scaling up consumer replicas only helps while the number of replicas is <= the number of topic partitions; any extra replicas in the same consumer group sit idle.
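
To illustrate the partition limit, here is a minimal sketch (not part of DataHub or its chart; the function names are hypothetical) that assumes all replicas share one consumer group on a single topic:

```python
def active_consumers(replicas: int, partitions: int) -> int:
    """Replicas that can be assigned at least one partition.

    Assumes every replica joins the same consumer group, so each
    partition is consumed by at most one replica at a time.
    """
    if replicas < 0 or partitions < 0:
        raise ValueError("replicas and partitions must be non-negative")
    return min(replicas, partitions)


def idle_consumers(replicas: int, partitions: int) -> int:
    """Replicas left without any partition to read from."""
    return replicas - active_consumers(replicas, partitions)
```

For example, with 3 partitions and 5 replicas, `active_consumers(5, 3)` is 3 and `idle_consumers(5, 3)` is 2, so an HPA `maxReplicas` above the partition count would only add idle pods.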

brainy-tent-14503

07/28/2023, 6:24 PM

I think that, particularly for the consumers, the challenge for HPA will be that the operations are typically not memory- or CPU-bound but I/O-bound, in either reading or writing.

brainy-tent-14503

07/28/2023, 6:26 PM

Please share what you find out. I think GMS can benefit the most from HPA.

important-pager-98358

07/28/2023, 7:13 PM

Thanks for the feedback @brainy-tent-14503. For the frontend, we are thinking of scaling based on ingress traffic, and for GMS, based on resource usage. As for the consumers, we are evaluating exporting Kafka metrics through Prometheus and creating a custom metric based on topic lag. With this, we should be able to scale the consumers at times such as highly parallel ingestion runs or execution of the index restoration job.
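
A rough sketch of how a lag-based replica count could be derived. It follows the proportional rule the Kubernetes HPA applies to average-value targets (desired = ceil(total metric / target per pod)) and then caps the result at the partition count mentioned earlier in the thread. The function name, parameters, and default bounds are assumptions for illustration, not anything from the DataHub chart:

```python
import math


def desired_replicas(
    total_lag: int,
    target_lag_per_replica: int,
    partitions: int,
    min_replicas: int = 1,   # hypothetical lower bound
    max_replicas: int = 10,  # hypothetical upper bound
) -> int:
    """Estimate consumer replicas needed for the current topic lag."""
    if target_lag_per_replica <= 0:
        raise ValueError("target_lag_per_replica must be positive")
    # Proportional sizing: one replica per `target_lag_per_replica` messages.
    raw = math.ceil(total_lag / target_lag_per_replica)
    # Replicas beyond the partition count would not receive any work.
    upper = min(max_replicas, partitions)
    # Clamp to the allowed range; min_replicas wins if the bounds conflict.
    return max(min_replicas, min(raw, upper))
```

With a target of 1000 messages of lag per replica and 6 partitions, a lag of 2500 gives 3 replicas, while a lag of 50000 (for instance during an index restoration job) is capped at 6.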

important-pager-98358

07/28/2023, 7:15 PM

I'm going to develop this internally, and if everything works as expected (passing the stress tests), I intend to contribute it to the community chart.
