Madhu Prabhakara

02/15/2022, 10:53 PM
Hey everyone, we have deployed the open-source version of Airbyte on an EC2 instance. We are now integrating our HubSpot account with a Postgres DB, and I see that our HubSpot account has close to 4.5 GB of data, which takes about 3 hours to sync in total. I see two sync modes: "Full Refresh + Overwrite" and "Full Refresh + Append". Which one do you think is better for this use case, considering the time it takes for the full data to sync? I am considering the option below: 1. Full Refresh + Append, which I think creates duplicates that will need to be handled in Postgres for further analysis.

Marcos Marx (Airbyte)

02/16/2022, 2:29 AM
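You’re correct. Full Refresh + Overwrite deletes the existing data in the destination and brings all the data in again on every sync, while Full Refresh + Append keeps what is already there and adds the new copy on top of it. Append therefore stores the “history” of your data, duplicates included.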
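
To make the difference concrete, here is a small sketch in plain Python. It is only a simulation of the two modes as described above, not Airbyte code: the function names, the `id` key, and the `synced_at` field are all hypothetical stand-ins for a primary key and whatever sync timestamp your destination tables carry.

```python
from typing import Dict, List

Row = Dict[str, object]


def full_refresh_overwrite(destination: List[Row], snapshot: List[Row]) -> List[Row]:
    """Simulate overwrite: drop what is in the destination, keep only the new snapshot."""
    return list(snapshot)


def full_refresh_append(destination: List[Row], snapshot: List[Row]) -> List[Row]:
    """Simulate append: keep the existing rows and add the new snapshot after them."""
    return destination + list(snapshot)


def latest_per_key(rows: List[Row], key: str = "id", synced_at: str = "synced_at") -> List[Row]:
    """Deduplicate appended rows by keeping the most recently synced row per key."""
    latest: Dict[object, Row] = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[synced_at] >= current[synced_at]:
            latest[row[key]] = row
    return list(latest.values())


# Two hypothetical full syncs of the same two contacts; one email changes in between.
sync_1 = [
    {"id": 1, "email": "a@example.com", "synced_at": 1},
    {"id": 2, "email": "b@example.com", "synced_at": 1},
]
sync_2 = [
    {"id": 1, "email": "a.new@example.com", "synced_at": 2},
    {"id": 2, "email": "b@example.com", "synced_at": 2},
]

overwritten = full_refresh_overwrite(full_refresh_overwrite([], sync_1), sync_2)
appended = full_refresh_append(full_refresh_append([], sync_1), sync_2)
deduped = latest_per_key(appended)

print(len(overwritten), len(appended), len(deduped))  # prints: 2 4 2
```

With append, the row count grows by a full snapshot on every sync, so the "latest row per key" step (or an equivalent query on the Postgres side) is what you would need before analysis.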