Davit Gachechiladze
02/10/2025, 3:09 PM
[…] `JOB_MAIN_CONTAINER_MEMORY_REQUEST` to avoid memory exceptions?
kapa.ai
02/10/2025, 3:09 PM
[…] `JOB_MAIN_CONTAINER_MEMORY_REQUEST` depends on your specific workload and the size of the data you're processing. However, I can provide some general guidance based on the information from the knowledge sources:
1. The default behavior for Java connectors is to use up to 1/4 of the host's allocated memory. For example, on a system with 8GB of RAM, a Java connector would limit itself to 2GB of RAM [Scaling Airbyte | Airbyte Documentation](https://docs.airbyte.com/operator-guides/scaling-airbyte#what-to-scale).
2. For database sources with large row sizes, you might need more memory. For instance, if you have a table with an average row size of 0.5MB, you would need about 5GB of RAM (0.5MB * 10,000 records / 1000 = 5GB) [Scaling Airbyte | Airbyte Documentation](https://docs.airbyte.com/operator-guides/scaling-airbyte#what-to-scale).
3. One user reported success by setting `JOB_MAIN_CONTAINER_MEMORY_REQUEST` to "4Gi" and `JOB_MAIN_CONTAINER_MEMORY_LIMIT` to "6Gi" [Github Issue](https://github.com/airbytehq/airbyte/issues/52075).
4. It's important to note that you may need to set memory requests and limits for different job types separately. For example, you might need to set `CHECK_JOB_MAIN_CONTAINER_MEMORY_REQUEST`, `DISCOVER_JOB_MAIN_CONTAINER_MEMORY_REQUEST`, and others [Github Issue](https://github.com/airbytehq/airbyte/issues/48816).

Given this information, a good starting point might be to set `JOB_MAIN_CONTAINER_MEMORY_REQUEST` to "4Gi" or "6Gi", depending on your data size and complexity. However, you may need to adjust this based on your specific use case and monitor your system's performance to find the optimal value. Remember to also set the corresponding `JOB_MAIN_CONTAINER_MEMORY_LIMIT` to a slightly higher value to allow for some flexibility in memory usage.
Davit Gachechiladze
02/10/2025, 3:11 PM
[…] `JOB_MAIN_CONTAINER_MEMORY_REQUEST` and limits: memory
kapa.ai
02/10/2025, 3:12 PM
[…] `JOB_MAIN_CONTAINER_MEMORY_REQUEST` and `JOB_MAIN_CONTAINER_MEMORY_LIMIT`. Here's an example:
```
JOB_MAIN_CONTAINER_MEMORY_REQUEST=4Gi
JOB_MAIN_CONTAINER_MEMORY_LIMIT=6Gi
```
In this example:
- `JOB_MAIN_CONTAINER_MEMORY_REQUEST` is set to 4 gibibytes (4Gi)
- `JOB_MAIN_CONTAINER_MEMORY_LIMIT` is set to 6 gibibytes (6Gi)

This configuration was used successfully by a user as reported in the [Github Issue](https://github.com/airbytehq/airbyte/issues/33693). It's important to note that these values should be adjusted based on your specific workload and data size. The memory request (4Gi in this case) is the amount of memory reserved for the job container when it is scheduled, while the memory limit (6Gi) is the maximum amount of memory the container can use.

Remember that for Java connectors, which include all source database connectors, you might need to specify more memory than you expect due to Java's default behavior of using only up to 1/4 of the host's allocated memory [Scaling Airbyte | Airbyte Documentation](https://docs.airbyte.com/operator-guides/scaling-airbyte#what-to-scale). If you're dealing with large datasets or experiencing out-of-memory errors, you might need to increase these values further. Always monitor your system's performance and adjust these settings as necessary.
Davit Gachechiladze
02/10/2025, 3:15 PM
[…] `JOB_MAIN_CONTAINER_MEMORY_REQUEST`
kapa.ai
02/10/2025, 3:15 PM
[…] `values.yaml` file that includes the `JOB_MAIN_CONTAINER_MEMORY_REQUEST` setting. Here's an example:
```yaml
global:
  env_vars:
    JOB_MAIN_CONTAINER_MEMORY_REQUEST: 4Gi
    JOB_MAIN_CONTAINER_MEMORY_LIMIT: 6Gi
```
This example is derived from the [Github Issue](https://github.com/airbytehq/airbyte/issues/48816) where a user shared their configuration for solving memory-related issues. It's important to note that you might want to include other memory-related settings for different job types as well. For a more comprehensive configuration, you could expand the `values.yaml` file like this:
```yaml
global:
  env_vars:
    JOB_MAIN_CONTAINER_MEMORY_REQUEST: 4Gi
    JOB_MAIN_CONTAINER_MEMORY_LIMIT: 6Gi
    CHECK_JOB_MAIN_CONTAINER_MEMORY_REQUEST: 4Gi
    CHECK_JOB_MAIN_CONTAINER_MEMORY_LIMIT: 6Gi
    DISCOVER_JOB_MAIN_CONTAINER_MEMORY_REQUEST: 4Gi
    DISCOVER_JOB_MAIN_CONTAINER_MEMORY_LIMIT: 6Gi
    SPEC_JOB_MAIN_CONTAINER_MEMORY_REQUEST: 4Gi
    SPEC_JOB_MAIN_CONTAINER_MEMORY_LIMIT: 6Gi
```
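As a rough way to pick starting values for settings like these, the sizing rule from the Scaling Airbyte docs quoted earlier in this thread (about 10,000 buffered records times the average row size, plus a default Java heap of up to 1/4 of the available memory) can be sketched in Python. This is only a back-of-the-envelope sketch: the function names are hypothetical and are not part of Airbyte.

```python
# Back-of-the-envelope sizing sketch. The 10,000-record buffer and the 1/4
# heap default come from the Scaling Airbyte docs cited in this thread.
# The function names below are hypothetical, not part of Airbyte.

RECORDS_BUFFERED = 10_000  # records the source worker may hold in memory


def estimate_source_memory_gb(avg_row_size_mb: float) -> float:
    """Approximate RAM (in GB, using 1 GB = 1000 MB) to buffer the records."""
    return avg_row_size_mb * RECORDS_BUFFERED / 1000


def default_java_heap_gb(available_memory_gb: float) -> float:
    """Default Java behavior: use up to 1/4 of the available memory."""
    return available_memory_gb / 4


if __name__ == "__main__":
    # A table with 0.5MB average rows needs roughly 5GB for the buffer.
    print(estimate_source_memory_gb(0.5))  # 5.0
    # With 8GB available, a Java connector limits itself to about 2GB.
    print(default_java_heap_gb(8))  # 2.0
```

If the estimate comes out higher than the current request and limit, that is a hint to raise them; the numbers are a starting point for monitoring, not a guarantee.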
Remember that these values should be adjusted based on your specific needs and the size of the data you're working with. Monitor your system's performance and adjust these settings as necessary to avoid memory-related issues.
Davit Gachechiladze
02/10/2025, 4:08 PM
[…]
kapa.ai
02/10/2025, 4:08 PM
[…]
1. The `JOB_MAIN_CONTAINER_MEMORY_REQUEST` and `JOB_MAIN_CONTAINER_MEMORY_LIMIT` settings apply to the main containers of jobs, which include source containers [Github Issue](https://github.com/airbytehq/airbyte/issues/17575).
2. These settings are particularly important for source containers, especially for database sources. As mentioned in the [Scaling Airbyte | Airbyte Documentation](https://docs.airbyte.com/operator-guides/scaling-airbyte#what-to-scale), "Within a Sync job, the main memory culprit is the Source worker. This is because the Source worker reads up to 10,000 records in memory."
3. For database sources with large row sizes, memory requirements can be significant. For example, a table with an average row size of 0.5MB could require about 5GB of RAM for the source container (0.5MB * 10,000 records).
4. It's worth noting that all source database connectors are Java connectors, which means they follow Java's default behavior of using only up to 1/4 of the host's allocated memory. This is why users often need to specify more memory than expected for these containers.

Therefore, setting appropriate values for `JOB_MAIN_CONTAINER_MEMORY_REQUEST` and `JOB_MAIN_CONTAINER_MEMORY_LIMIT` in your `values.yaml` file is crucial for ensuring that source containers have sufficient memory to perform their tasks without encountering out-of-memory errors.
Davit Gachechiladze
02/10/2025, 4:09 PM
[…]
kapa.ai
02/10/2025, 4:09 PM
[…]
1. The `JOB_MAIN_CONTAINER_MEMORY_REQUEST` and `JOB_MAIN_CONTAINER_MEMORY_LIMIT` settings apply to all main containers in Airbyte jobs, including source, destination, and orchestrator containers [Github Issue](https://github.com/airbytehq/airbyte/issues/3775).
2. In fact, one user reported that when setting `JOB_MAIN_CONTAINER_MEMORY_LIMIT`, it also applied a memory limit to the Orchestrator container/pod for the sync jobs [Github Issue](https://github.com/airbytehq/airbyte/issues/3775).
3. However, it's important to note that these settings might not affect all types of job containers equally. For instance, the same issue report mentions that 'check' pods did not seem to be affected by the `JOB_MAIN_CONTAINER_MEMORY_LIMIT` setting [Github Issue](https://github.com/airbytehq/airbyte/issues/3775).
4. For more granular control over different types of job containers, Airbyte provides specific environment variables for different job types. For example, there are separate variables for check jobs (`CHECK_JOB_MAIN_CONTAINER_MEMORY_REQUEST`, `CHECK_JOB_MAIN_CONTAINER_MEMORY_LIMIT`), discover jobs, and spec jobs [Github Issue](https://github.com/airbytehq/airbyte/issues/48816).

So, while the `JOB_MAIN_CONTAINER_MEMORY_*` settings do affect other containers beyond just the source containers, the exact impact may vary depending on the job type. For more precise control, you might need to use job-specific environment variables.
Davit Gachechiladze
02/10/2025, 4:15 PM
[…] `JOB_MAIN_CONTAINER_MEMORY_REQUEST` override job: request memory limits?
kapa.ai
02/10/2025, 4:15 PM
[…] `JOB_MAIN_CONTAINER_MEMORY_REQUEST` can override job request memory limits. Yes, `JOB_MAIN_CONTAINER_MEMORY_REQUEST` can override job request memory limits, but it's important to understand the precedence of different configuration methods in Airbyte. According to the [Configuring Connector Resources](https://docs.airbyte.com/operator-guides/configuring-connector-resources) documentation, there are three ways to configure connector resource requirements, including memory, in decreasing order of precedence:
1. Connection-specific (highest precedence)
2. Connector-specific
3. Instance-wide (lowest precedence)

The `JOB_MAIN_CONTAINER_MEMORY_REQUEST` environment variable falls under the instance-wide configuration, which has the lowest precedence. This means that while it can set a baseline for memory requests across all jobs, it can be overridden by more specific configurations. If you set `JOB_MAIN_CONTAINER_MEMORY_REQUEST` in your `values.yaml` file or as an environment variable, it will override the default memory request for all job containers. However, if there are any connector-specific or connection-specific memory configurations, those will take precedence over the instance-wide setting.

To ensure that your `JOB_MAIN_CONTAINER_MEMORY_REQUEST` setting is applied, you should:
1. Set it in your `values.yaml` file or as an environment variable.
2. Make sure there are no more specific (connector-specific or connection-specific) configurations that could override it.
3. After making changes, you may need to restart your Airbyte instance or redeploy for the changes to take effect.

Remember, as mentioned in a [Github Issue](https://github.com/airbytehq/airbyte/issues/35761), if you run Airbyte with Docker Compose you might need to run "docker compose down" and then "docker compose up" for the changes in the `.env` file to be applied. If you need more fine-grained control over specific connections or connectors, you might want to consider using the connection-specific or connector-specific configuration methods described in the documentation.
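The precedence order described above can be pictured as a simple "first value that is set wins" lookup. The Python sketch below is only an illustrative model of that order, not Airbyte's actual implementation; `resolve_memory_request` and its parameter names are hypothetical.

```python
from typing import Optional


def resolve_memory_request(
    connection_specific: Optional[str],
    connector_specific: Optional[str],
    instance_wide: Optional[str],
) -> Optional[str]:
    """Return the first configured value, from highest to lowest precedence.

    Hypothetical model of the documented order:
    connection-specific > connector-specific > instance-wide.
    """
    for value in (connection_specific, connector_specific, instance_wide):
        if value is not None:
            return value
    return None  # nothing configured at any level


if __name__ == "__main__":
    # Only the instance-wide value (e.g. JOB_MAIN_CONTAINER_MEMORY_REQUEST) is set.
    print(resolve_memory_request(None, None, "4Gi"))  # 4Gi
    # A connector-specific value overrides the instance-wide one.
    print(resolve_memory_request(None, "2Gi", "4Gi"))  # 2Gi
    # A connection-specific value overrides both.
    print(resolve_memory_request("8Gi", "2Gi", "4Gi"))  # 8Gi
```

Under this model, an instance-wide value that seems to be ignored usually means a more specific value is set somewhere for that connector or connection.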