# ask-community-for-troubleshooting
p
I have a question, I'm using Airbyte to migrate data from BigQuery to Postgres, both hosted in Google Cloud
The problem I have is that the memory used by the cloud machine keeps increasing until it hits the limit, at which point the speed slows to a crawl
How exactly does Airbyte move data? Would it be possible to customize it to allow for lower memory usage on my Google Cloud machine?
j
Hi Paul (I'm not an Airbyte team member, but another user experiencing a similar issue with Airbyte). Regarding the memory consumption, have you read these?
• https://docs.airbyte.com/operator-guides/scaling-airbyte#memory
• https://github.com/airbytehq/airbyte/issues/3439
It seems Airbyte will read up to 10k records at a time, which can lead to a lot of RAM usage depending on your record size. I couldn't find how to tune this, however 😕 Looking at the code (https://github.com/airbytehq/airbyte/blob/master/airbyte-commons-worker/src/main/java/io/airbyte/workers/general/DefaultReplicationWorker.java), it seems moving data is mostly an in-memory process inside a "while loop" (but I'm not yet familiar with the code base, and the process itself may depend on each source/destination implementation).
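To illustrate the pattern (a rough sketch only, not the actual DefaultReplicationWorker code; Source, Destination and the buffering behavior are simplified stand-ins):
```java
import java.util.Optional;

// Simplified stand-ins for Airbyte's source/destination interfaces
interface Source {
    boolean isFinished();
    Optional<String> attemptRead(); // one serialized record, if available
}

interface Destination {
    void accept(String record); // destinations buffer records before flushing
    void close();               // flush whatever is still buffered
}

public class ReplicationLoopSketch {
    // Records flow source -> worker -> destination entirely in memory.
    // If the destination batches records (e.g. ~10k at a time) and flushes
    // more slowly than the source reads, RAM usage keeps growing.
    public static void replicate(Source source, Destination destination) {
        while (!source.isFinished()) {
            Optional<String> msg = source.attemptRead();
            msg.ifPresent(destination::accept);
        }
        destination.close();
    }
}
```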
p
Thanks, @Julien Calderan, I figured it was either this issue or a logging issue on my destination DB
s
Hey Paul, Julien provided some great help with your question. Just to collect more data: how big is your sync, and how much memory did you allocate for Airbyte?
p
The sync is pretty big, with tables ranging from 60 MB to 70 GB; the first table it chose is 10 GB
The first time I used Airbyte I kept the company's default memory settings (around 8 GB on Google Cloud), then ramped it up to 110 GB to see how much it would take. At some point the memory usage was around 50 GB, before I reverted to the default settings
The problem is that if the memory usage goes above 50 GB for a 10 GB table (and that sync hadn't even finished), then I'm truly afraid the 110 GB memory limit will not be enough for the 70 GB table
Is there any way to limit the memory usage on my destination machine @Saj Dider (Airbyte)?
s
p
@Saj Dider (Airbyte) I don't think this is the same issue I'm having. I can't complete a single sync on a large DB without running out of memory
It's not as if multiple syncs are running and stacking memory usage; it's just one sync running with continuously increasing RAM usage
u
Paul, today you can limit the resources for pod containers, but this can lead to OOM issues
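For example, per the scaling doc linked above, the worker reads per-job resource limits from environment variables; you set them in your .env for a docker-compose deploy or in your Helm values on Kubernetes. The values below are just illustrative, not recommendations:
```
# Per-job resource knobs read by the Airbyte worker.
# If the limit is lower than what the sync actually needs,
# the job container gets OOM-killed instead of slowing down.
JOB_MAIN_CONTAINER_MEMORY_REQUEST=2Gi
JOB_MAIN_CONTAINER_MEMORY_LIMIT=8Gi
JOB_MAIN_CONTAINER_CPU_REQUEST=1
JOB_MAIN_CONTAINER_CPU_LIMIT=2
```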