You're not alone in encountering OOMKilled (Out Of Memory) errors when syncing large tables to a Postgres destination with Airbyte, even when resource limits and batch sizes look reasonable. This is a known challenge, and several users have reported similar issues across different connectors and deployment setups.

### Key Points from the Knowledge Sources

1. **Postgres as a Destination Has Limitations for Large Data Volumes**
   - The Airbyte documentation explicitly states that Postgres is not designed to be a data warehouse and is likely to perform poorly with large data volumes (e.g., >10 GB). Even with increased memory and compute, large syncs can cause the destination to "lock up" or incur high memory usage, especially when using typing and deduplication features. The documentation recommends using a data warehouse like BigQuery, Snowflake, or Redshift for larger datasets. Also, scaling IOPS (disk throughput) is as important as memory and CPU for Postgres performance during large writes or updates ([Postgres Troubleshooting](https://docs.airbyte.com/integrations/destinations/postgres/postgres-troubleshooting#connector-limitations)).
2. **Batching and Memory Usage**
   - Airbyte's default behavior is to batch records in sets of 10,000, but the batch size is based on record count, not byte size. If your rows are large, the memory required for a batch can be much higher than expected: a table with an average row size of 0.5 MB would need roughly 5 GB of RAM per batch. This can easily lead to OOM errors if the actual row size is underestimated or if the connector does not release memory efficiently between batches ([Scaling Airbyte](https://discuss.airbyte.io/t/3401)). A rough back-of-the-envelope estimate is sketched in the first example below this list.
   - There is an open issue in Airbyte's GitHub repository suggesting that batching should be based on bytes read, not just record count, to avoid OOM errors for tables with large rows. However, as of the latest information, this has not been fully implemented ([Batching on Bytes Issue](https://github.com/airbytehq/airbyte/issues/3439)).
3. **Observed Behavior: Memory Usage Exceeds Data Size**
   - Multiple users have reported that Airbyte can use 2x or more of the data size in RAM during syncs, especially for large tables. This has been observed with various connectors (e.g., CockroachDB, MySQL, Postgres) and is sometimes attributed to memory leaks or inefficient memory management in the connectors ([Forum Example 1](https://discuss.airbyte.io/t/1261), [Forum Example 2](https://github.com/airbytehq/airbyte/issues/13421)).
4. **Workarounds and Recommendations**
   - Some users have found that breaking up large tables into smaller chunks (e.g., syncing subsets via views or splitting tables) can help avoid OOM errors; a view-based split is sketched in the second example below this list.
   - Upgrading to the latest connector versions may help, as some improvements have been made to dynamic batching and memory management, but fundamental limitations remain.
   - For very large tables, consider switching your destination to a data warehouse solution, as recommended by the Airbyte documentation.
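To make the batching point concrete, here is a minimal back-of-the-envelope estimate in Python. It is a sketch of the arithmetic, not Airbyte's actual memory accounting; the function name and the 0.5 MB average row size are illustrative, and real usage can be 2x or more because of serialization and buffering overhead.

```python
# Rough per-batch memory estimate for a record-count-based batch.
# This is a simplified sketch, not Airbyte's internal accounting.

def estimate_batch_memory_gb(avg_row_bytes: float, batch_records: int = 10_000) -> float:
    """Return an approximate per-batch memory footprint in GB."""
    return avg_row_bytes * batch_records / 1024**3

# Example: 0.5 MB average rows with the default 10,000-record batch.
avg_row_bytes = 0.5 * 1024**2  # hypothetical average row size of 0.5 MB
print(f"~{estimate_batch_memory_gb(avg_row_bytes):.1f} GB per batch")  # -> ~4.9 GB
```

If the number this prints is close to (or above) the memory limit of your destination container, OOMKilled errors are expected even though the batch size itself looks modest.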
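If you want to try the view-based split from point 4, the sketch below creates range-based views on the source database so each view can be synced as a separate, smaller stream. It assumes a psycopg2 connection, a hypothetical table `big_table` with a numeric `id` column, and chunk boundaries you would pick for your own data; adjust names, credentials, and ranges to your schema.

```python
# Sketch: split a large source table into range-based views so each one can be
# discovered and synced by Airbyte as a smaller stream.
# Table name, column name, ranges, and connection details are all hypothetical.
import psycopg2

CHUNKS = [(0, 1_000_000), (1_000_000, 2_000_000), (2_000_000, 3_000_000)]

conn = psycopg2.connect("host=... dbname=source_db user=airbyte password=...")
with conn, conn.cursor() as cur:
    for i, (lo, hi) in enumerate(CHUNKS):
        # Each view covers one id range; psycopg2 interpolates the bounds
        # client-side, so this works for DDL statements as well.
        cur.execute(
            f"""
            CREATE OR REPLACE VIEW big_table_chunk_{i} AS
            SELECT * FROM big_table
            WHERE id >= %s AND id < %s
            """,
            (lo, hi),
        )
conn.close()
```

After creating the views, refresh the source schema in Airbyte and enable each `big_table_chunk_*` view as its own stream, which keeps any single batch well below the full table's size.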
### Summary

- OOM errors when syncing large tables to Postgres are a known limitation.
- Batching is by record count, not byte size, which can cause unexpectedly high memory usage.
- Allocating more memory sometimes helps, but you often need 2x or more of the data size in RAM.
- Postgres is not recommended for large syncs; use a data warehouse if possible.
- No definitive fix exists as of now, but breaking up tables or using a different destination may help.

If you need to stick with Postgres, you may need to keep allocating significant memory or split your syncs into smaller chunks. If you can, consider moving to a data warehouse destination for large datasets. If you have specific logs or error messages, sharing them may help the community or the Airbyte team provide more targeted advice.