# feedback-and-requests
m
Hello, is there a way to specify normalization pod resources in Helm? If not, where should I look to contribute?
I’m not sure, @Maxime edfeed, but it looks like you only have that control over the main containers (server, scheduler, worker). Can I recommend you talk to @Jonathan Stacks? 😃
I've been away from the project for a little while, but I assume normalization happens either in the worker pods or in the job pods that are dynamically launched. If it is the former, you can control that with the worker resources documented here: https://github.com/airbytehq/airbyte/tree/master/charts/airbyte#worker-parameters, e.g. worker.resources.limits.cpu=3 (see the values.yaml sketch after the example below). If it is the latter and normalization happens in the job pods that are dynamically spun up, you should be able to use extraEnv on whichever pod launches them (IIRC, it's the worker pods), supplying values for the following. Example:
worker:
  extraEnv:
  - name: JOB_MAIN_CONTAINER_CPU_REQUEST
    value: "3"
  - name: JOB_MAIN_CONTAINER_CPU_LIMIT
    value: "4"
Hello @Jonathan Stacks, I set these env vars in the scheduler, worker and server deployments. It doesn't work as expected, maybe because I'm on v0.33.12-alpha (I use that version's env vars, JOB_POD_MAIN_...). My workaround was to set default container resources with a LimitRange Kubernetes object in my airbyte namespace. Some job pods start with those defaults, but other jobs from different connections come up with different resources. I get 700m CPU requested, which is a lot for my use case 😅. It's as if Airbyte had registered specific resources for some jobs or connections. Does Airbyte store job resources in the DB? Thank you 🙏
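For reference, the LimitRange workaround mentioned above would look roughly like this, with illustrative names and default values rather than the exact object used:
apiVersion: v1
kind: LimitRange
metadata:
  name: airbyte-container-defaults  # illustrative name
  namespace: airbyte                # assumed namespace
spec:
  limits:
  - type: Container
    # Applied to containers that don't set their own requests
    defaultRequest:
      cpu: 20m
      memory: 500Mi
    # Applied to containers that don't set their own limits
    default:
      cpu: "1"
      memory: 2Gi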
j
When you say it doesn’t work as expected, is there any more information you can provide? If it’s a case where the environment variable is correct but the code launching the job isn’t respecting it, that would be a code issue. It could have changed between versions, and maybe the environment variable name needs to be updated. It’s tough to say without more information. I’m not too familiar with whether it stores job resources in the DB, as I haven’t touched most of the code, only the Helm chart.
m
Ok, thank you! I think the env vars are correct; the issue is that the main container in the job pods does not always use the namespace default resources. It also does not use the env vars I set. For example, I have these env vars set in the scheduler, server and worker:
JOB_POD_MAIN_CONTAINER_CPU_REQUEST: 20m
JOB_POD_MAIN_CONTAINER_CPU_LIMIT: 1
JOB_POD_MAIN_CONTAINER_MEMORY_REQUEST: 500Mi
JOB_POD_MAIN_CONTAINER_MEMORY_LIMIT: 2Gi
But my destination pod's main container has these resources:
Limits:
  cpu: 1
  memory: 2Gi
Requests:
  cpu: 700m
  memory: 1Gi
I will continue to investigate this. Thank you for your help!
j
What version of the helm chart are you using?