# ask-ai
I'm trying to load a historical table with 400 million records and I receive the error below regarding a lack of resources. How can I resolve this? And how do I configure the batch size for the Oracle source?

```
Airbyte could not start the sync process. This may be due to insufficient system resources. ResourceRequirements(claims=[], limits={memory=30Gi, cpu=8}, requests={memory=30Gi, cpu=8})
```
Based on the information provided in the knowledge sources, I can offer some insights to help resolve your issue with loading a large historical table from an Oracle source in Airbyte. The error you're encountering indicates a resource limitation, which is common when dealing with high-volume data loads. To address this, you can try the following approaches:

1. Increase memory allocation. The error message shows that you're already requesting 30Gi of memory, which is substantial; however, for very large datasets you might need to allocate even more. You can adjust the memory allocation by setting the following environment variables, [as mentioned in the Airbyte documentation](https://docs.airbyte.com/operator-guides/scaling-airbyte#memory):
```
JOB_MAIN_CONTAINER_MEMORY_REQUEST
JOB_MAIN_CONTAINER_MEMORY_LIMIT
```
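As a minimal sketch of where these go, assuming a Docker Compose deployment where job options are read from the `.env` file (the values below are placeholders; size them to what your host or nodes can actually provide):

```bash
# Placeholder values only -- raise them beyond the current 30Gi only if
# your host/node actually has that much memory available to allocate.
JOB_MAIN_CONTAINER_MEMORY_REQUEST=40Gi
JOB_MAIN_CONTAINER_MEMORY_LIMIT=40Gi
```

On Kubernetes, the same variables are set as environment variables for the worker, typically through your Helm values or deployment manifest.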
Increasing these values gives the sync job more memory to work with.

2. Adjust CPU allocation. Similarly, you can increase the CPU allocation if needed using these environment variables:
```
JOB_MAIN_CONTAINER_CPU_REQUEST
JOB_MAIN_CONTAINER_CPU_LIMIT
```
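A sketch in the same assumed `.env` style (again, placeholder numbers; note that a CPU request must fit on a single node to be schedulable):

```bash
# Placeholder values only -- the request must fit on one node.
JOB_MAIN_CONTAINER_CPU_REQUEST=8
JOB_MAIN_CONTAINER_CPU_LIMIT=12
```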
3. Consider scaling your infrastructure. If you're running Airbyte on a VM or Kubernetes cluster, you might need to scale up your resources. As [mentioned in a forum post](https://discuss.airbyte.io/t/861), increasing the overall memory of your Airbyte instance can help with large syncs. On Kubernetes, keep in mind that a pod requesting 30Gi of memory and 8 CPUs will only be scheduled if some node has at least that much allocatable capacity; a quick way to check this is shown at the end of this answer.

Regarding configuring batch size for the Oracle source: unfortunately, the knowledge sources don't provide specific information about a batch size setting for the Oracle connector. However, it's worth noting that Airbyte has implemented some improvements in this area. As [mentioned in a forum post](https://discuss.airbyte.io/t/1168), recent versions of Airbyte connectors use a dynamic batch fetch system, which should help with performance for large datasets.

If these suggestions don't resolve your issue, you might want to consider:

1. Breaking down your sync into smaller chunks, perhaps by using incremental sync if possible.
2. Checking for any potential bottlenecks in your Oracle database configuration.
3. Reaching out to Airbyte support or creating a GitHub issue for further investigation, as large-scale performance issues might require more in-depth troubleshooting.

Remember, syncing very large datasets can be challenging and may require fine-tuning of both Airbyte and your infrastructure configuration.
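For the Kubernetes scheduling check mentioned in point 3 above, here is a quick sanity check (assuming you have `kubectl` access to the cluster; `kubectl top` additionally requires metrics-server to be installed):

```bash
# Show each node's allocatable CPU and memory; the sync pod's
# 30Gi / 8-CPU request must fit on a single node to be scheduled.
kubectl describe nodes | grep -A 6 "Allocatable"

# Show current per-node usage (requires metrics-server).
kubectl top nodes
```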