Phoebe Yang
02/04/2022, 12:22 AM
Liren Tu (Airbyte)
02/04/2022, 4:59 AM
• Can Airbyte handle a big data migration load of ~1TB (from Heroku Postgres to AWS Postgres), and what’s the best way to optimize the migration?
Currently I’m afraid the Postgres to Postgres connection won’t handle this load well. If the 1TB of data is all synced in one connection, I think it will probably take a few days, and it is also likely to get stuck along the way. If the 1TB of data is distributed across multiple schemas and tables, and multiple connections are set up to each sync a subset of them, it may work. However, I don’t think the performance will match your requirement. Our initial benchmark results showed that it can take at least ~2 hours to sync 10GB of data. So unfortunately it is pretty much infeasible to transfer 1TB of data reliably at this moment.
• How does the sync handle schema changes/updates after the initial full refresh?
This is a very important topic we have not had bandwidth to address yet. Currently a schema change requires a reset on the destination side, i.e. purging the data. So again, it does not work for your use case.
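As a rough sanity check on the figures quoted above, the benchmark can be extrapolated to 1TB. This is only a sketch: it assumes throughput scales linearly from the ~2 hours per 10GB benchmark and treats 1TB as 1000GB, neither of which was measured in this thread.

```python
# Linear extrapolation from the benchmark quoted above (~2 hours per 10 GB).
# ASSUMPTION: throughput stays constant as the data volume grows.
BENCHMARK_GB = 10
BENCHMARK_HOURS = 2


def estimated_sync_hours(total_gb: float) -> float:
    """Scale the benchmark linearly to a larger data volume."""
    return total_gb / BENCHMARK_GB * BENCHMARK_HOURS


hours = estimated_sync_hours(1000)  # 1 TB taken as 1000 GB
days = hours / 24
print(f"{hours:.0f} hours, about {days:.1f} days")  # prints "200 hours, about 8.3 days"
```

Under these assumptions a single connection would need on the order of a week of uninterrupted syncing, which is in line with the concern about the job getting stuck along the way.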
Gerardo Santacruz
02/04/2022, 1:44 PM
Augustin Lafanechere (Airbyte)
02/04/2022, 2:51 PM
Phoebe Yang
02/04/2022, 3:41 PM
Do you think it’s possible to use pg_dump for the data migration/full refresh, and configure Airbyte to do incremental refresh only on new data? From the doc it seems like the first sync is always a full refresh; if that’s configurable it’d work too.
Liren Tu (Airbyte)
02/04/2022, 4:52 PM
• Do you think it’s possible to use pg_dump for the data migration/full refresh, and configure Airbyte to do incremental refresh only on new data?
Yes, this is something we have thought about, but have not had the time to try yet.
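The initial bulk copy with pg_dump discussed above could look roughly like the sketch below. It only builds the command lines and prints them. The connection URLs, the dump path, and the helper names are placeholders invented for illustration, and the chosen flags (custom-format archive, no ownership or privilege statements, parallel restore) are one reasonable option rather than a setup anyone in this thread has tested.

```python
import shlex
from typing import List


def build_dump_command(source_url: str, dump_path: str) -> List[str]:
    """pg_dump into a custom-format archive, skipping owner and ACL statements."""
    return [
        "pg_dump",
        "--format=custom",
        "--no-owner",
        "--no-acl",
        f"--file={dump_path}",
        f"--dbname={source_url}",
    ]


def build_restore_command(target_url: str, dump_path: str, jobs: int = 4) -> List[str]:
    """pg_restore from the archive; --jobs restores several tables in parallel."""
    return [
        "pg_restore",
        "--no-owner",
        "--no-acl",
        f"--jobs={jobs}",
        f"--dbname={target_url}",
        dump_path,
    ]


# Placeholder values, not real endpoints.
SOURCE = "postgres://user:secret@source-host:5432/app"
TARGET = "postgres://user:secret@target-host:5432/app"

dump_cmd = build_dump_command(SOURCE, "app.dump")
restore_cmd = build_restore_command(TARGET, "app.dump")
print(shlex.join(dump_cmd))
print(shlex.join(restore_cmd))
```

The lists can be passed to subprocess.run(cmd, check=True) once the placeholders are replaced. Note the highest value of the cursor column at dump time, since that is the point an incremental sync would need to resume from.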
Augustin Lafanechere (Airbyte)
02/04/2022, 4:54 PM
pg_dump will require you to manually set the state of your connection to the latest cursor value you loaded with pg_dump. This is done by running some SQL queries against Airbyte database's state table. Am I right @Liren Tu (Airbyte)?
Liren Tu (Airbyte)
02/04/2022, 5:07 PM
Phoebe Yang
02/04/2022, 6:05 PM
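The manual state update mentioned in this thread (setting the connection state to the latest cursor value loaded with pg_dump) might be prepared along these lines. Everything schema-related here is an assumption made for illustration: the column names `connection_id` and `state`, the JSON layout of the state blob, and the helper `build_state_blob` are all hypothetical. Inspect an existing row written by a normal incremental sync and mirror its structure before changing anything. The sketch only builds the statement and its parameters; it does not connect to a database.

```python
import json

# ASSUMPTION: table and column names are illustrative, not confirmed in the thread.
UPDATE_STATE_SQL = "UPDATE state SET state = %s WHERE connection_id = %s"


def build_state_blob(stream_name, stream_namespace, cursor_field, cursor_value):
    """Build a per-stream cursor state. The layout is hypothetical; copy the
    structure of a real row from your own Airbyte database instead."""
    return {
        "state": {
            "cdc": False,
            "streams": [
                {
                    "stream_name": stream_name,
                    "stream_namespace": stream_namespace,
                    "cursor_field": [cursor_field],
                    "cursor": str(cursor_value),
                }
            ],
        }
    }


# Placeholder stream, cursor value, and connection id.
blob = build_state_blob("orders", "public", "updated_at", "2022-02-04T00:00:00Z")
params = (json.dumps(blob), "00000000-0000-0000-0000-000000000000")

print(UPDATE_STATE_SQL)
print(params[0])
```

The `%s` placeholders follow the parameter style of common Python Postgres drivers, so the statement and `params` can be handed to a cursor's execute call. Back up the state row first, because a wrong cursor value would make the next incremental sync skip or re-read data.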