Michael Mucciarone
10/22/2024, 5:54 PM
user
10/22/2024, 5:55 PM
You can configure this in your `values.yml` file. This file allows you to set the CPU and memory limits and requests for your connector pods. Here's an example of how you can modify it:
```yaml
global:
  edition: "enterprise"
  ...
  jobs:
    resources:
      limits:
        cpu: ## e.g. 250m
        memory: ## e.g. 500Mi
      requests:
        cpu: ## e.g. 75m
        memory: ## e.g. 150Mi
```
In this example, you can replace `## e.g. 250m` and `## e.g. 500Mi` with the CPU and memory limits you want to set, and `## e.g. 75m` and `## e.g. 150Mi` with the CPU and memory requests you want to set.
If your Airbyte deployment is underprovisioned, you may notice occasional 'stuck jobs' that remain in progress for long periods, with eventual failures related to unavailable pods. Increasing job CPU and memory limits may also allow for increased sync speeds.
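As a concrete illustration, a filled-in version of the block above might look like the following; the numbers are placeholder assumptions, not sizing recommendations, so adjust them to your own workloads:
```yaml
global:
  edition: "enterprise"
  jobs:
    resources:
      limits:
        cpu: 2          ## hard ceiling per connector pod container
        memory: 2Gi
      requests:
        cpu: 500m       ## what the scheduler reserves for each container up front
        memory: 1Gi
```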
Also, to reduce the impact of an underprovisioned Airbyte deployment, you can place 'static' workloads (`webapp`, `server`, etc.) on one Kubernetes node group, while placing job-related workloads (connector pods) on a different node group. This ensures that UI or API availability is unlikely to be impacted by the number of concurrent syncs.
Here's an example of how you can configure Airbyte Self-Managed Enterprise to run in two node groups:
```yaml
airbyte-bootloader:
  nodeSelector:
    type: static

server:
  nodeSelector:
    type: static

keycloak:
  nodeSelector:
    type: static

keycloak-setup:
  nodeSelector:
    type: static

temporal:
  nodeSelector:
    type: static

webapp:
  nodeSelector:
    type: static

worker:
  nodeSelector:
    type: jobs

workload-launcher:
  nodeSelector:
    type: static
  ## Pods spun up by the workload launcher will run in the 'jobs' node group.
  extraEnvs:
    - name: JOB_KUBE_NODE_SELECTORS
      value: type=jobs
    - name: SPEC_JOB_KUBE_NODE_SELECTORS
      value: type=jobs
    - name: CHECK_JOB_KUBE_NODE_SELECTORS
      value: type=jobs
    - name: DISCOVER_JOB_KUBE_NODE_SELECTORS
      value: type=jobs

orchestrator:
  nodeSelector:
    type: jobs

workload-api-server:
  nodeSelector:
    type: jobs
```
In this example, you can replace `type: static` and `type: jobs` with the labels of your own Kubernetes node groups.
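For context on where those labels come from, here is a minimal sketch of two labelled node groups, assuming an EKS cluster defined with eksctl; the names, instance types, and counts are placeholders, and other platforms attach node labels through their own node pool settings:
```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: airbyte-cluster      ## placeholder cluster name
  region: us-east-1          ## placeholder region
nodeGroups:
  - name: airbyte-static
    instanceType: m5.large
    desiredCapacity: 2
    labels:
      type: static           ## matched by the 'type: static' nodeSelector values above
  - name: airbyte-jobs
    instanceType: m5.xlarge
    desiredCapacity: 3
    labels:
      type: jobs             ## matched by 'type: jobs' and the JOB_KUBE_NODE_SELECTORS env vars
```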
For more information, you can refer to the [Airbyte documentation on scaling](https://docs.airbyte.com/enterprise-setup/scaling-airbyte#concurrent-syncs) and the [Airbyte documentation on setting resource limits](https://docs.airbyte.com/deploying-airbyte/on-kubernetes/#setting-resource-limits).
Please note that the number of concurrent syncs running at any given time is the primary driver of increased resource usage in Airbyte. Each concurrent sync requires at least 3 additional connector pods to be running at once (`orchestrator`, `read`, `write`). For example, 10 concurrent syncs require 30 additional pods in your namespace. Connector pods last only for the duration of a sync, and their names are suffixed with the ID of the ongoing job. If your deployment of Airbyte is intended to run many concurrent syncs at once (e.g. an overnight backfill), you are likely to require additional instances to run all syncs.
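If you would rather bound concurrency than only scale up, one option (a sketch, assuming your Airbyte release still honors the `MAX_SYNC_WORKERS` environment variable; check the configuration reference for your version) is to cap sync workers on the worker deployment via `extraEnvs`, mirroring the pattern used for the workload launcher above:
```yaml
worker:
  extraEnvs:
    ## Illustrative cap on how many sync jobs each worker handles at once;
    ## tune this to the capacity of your 'jobs' node group.
    - name: MAX_SYNC_WORKERS
      value: "3"
```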
Michael Mucciarone
10/22/2024, 5:57 PM
user
10/22/2024, 5:57 PM
You can set these limits in the `values.yaml` file in your Airbyte deployment. Here's an example of how you can set the CPU and memory limits:
```yaml
global:
  edition: "enterprise"
  ...
  jobs:
    resources:
      limits:
        cpu: ## e.g. 250m
        memory: ## e.g. 500Mi
      requests:
        cpu: ## e.g. 75m
        memory: ## e.g. 150Mi
```
In this example, replace `## e.g. 250m` and `## e.g. 500Mi` with the CPU and memory limits you want to set, and `## e.g. 75m` and `## e.g. 150Mi` with the CPU and memory requests you want to set. If your Airbyte deployment is underprovisioned, you may notice occasional 'stuck jobs' that remain in progress for long periods, with eventual failures related to unavailable pods. Increasing job CPU and memory limits may also allow for increased sync speeds. For more information, you can refer to the [Airbyte documentation on scaling](https://docs.airbyte.com/enterprise-setup/scaling-airbyte#concurrent-syncs).
Please note that the number of concurrent syncs running at any given time is the primary driver of increased resource usage in Airbyte. Each concurrent sync requires at least 3 additional connector pods to be running at once (`orchestrator`, `read`, `write`). For example, 10 concurrent syncs require 30 additional pods in your namespace. Connector pods last only for the duration of a sync, and their names are suffixed with the ID of the ongoing job. If your deployment of Airbyte is intended to run many concurrent syncs at once (e.g. an overnight backfill), you are likely to require additional instances to run all syncs.
Michael Mucciarone
10/22/2024, 6:00 PM
user
10/22/2024, 6:01 PM
The `global.jobs.resources` object in the `values.yaml` file should indeed allow you to set the resource requests and limits for the job pods in Airbyte. If these settings are not being reflected in the actual resources allocated to the pods, it could potentially be a bug; there have been reported issues with these settings not taking effect. For instance, in [this issue](https://github.com/airbytehq/airbyte/issues/35749), users reported that the `global.jobs.resources` settings for requests and limits seemed to have no effect on the job pods. In [another issue](https://github.com/airbytehq/airbyte/issues/42921), a user reported that the job resource requests/limits did not appear to be respected according to the job logs, despite the correct environment variables being set in the configuration.
As a workaround, some users have suggested manually deleting each of the pods and allowing Kubernetes to re-create them, or uninstalling and reinstalling Airbyte each time new values are deployed.
As for your question about the "enterprise" vs "oss" versions, the `global.jobs.resources` settings should apply to both. The "enterprise" and "oss" labels in the `values.yaml` file refer to different editions of Airbyte, but both should support setting resource limits and requests for job pods. If you continue to experience this issue, I would recommend raising it on the [Airbyte GitHub issues page](https://github.com/airbytehq/airbyte/issues) for further investigation by the Airbyte team.
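One way to check whether the settings are actually landing on the pods (a sketch, assuming you can run `kubectl` against the namespace where connector pods are created) is to inspect a connector pod while a sync is running, e.g. with `kubectl get pod <connector-pod-name> -o yaml`, and confirm that its container resources match what you configured:
```yaml
## Excerpt of what you would expect in the connector pod spec if
## global.jobs.resources is being applied (container name may vary by Airbyte version).
spec:
  containers:
    - name: main
      resources:
        limits:
          cpu: 250m
          memory: 500Mi
        requests:
          cpu: 75m
          memory: 150Mi
```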