Airbyte is an open-source data integration engine that helps you consolidate your data in your data warehouses, lakes and databases.

Hi there,

I have been looking at using Airbyte to replicate our production Postgres database into BigQuery using CDC. However, I’ve encountered three blocking problems:

1. There is no way to anonymize or exclude sensitive columns of certain tables. The recommendation to use views does not help, as we need to capture deletes.
2. There is no way to add/remove streams from an existing connector without resyncing all streams (even if most do not need changing). This means that we are unable to sync our database as the initial sync of all tables takes too long if done in one go (WAL accumulates). Our Postgres database is very large. Ideally, we’d add the tables bit by bit to stop the WAL accumulating to unreasonable levels.
3. We have too many tables to realistically manage in the UI. We really need to be able to declare this ‘in code’ and manage it in version control.

Are these planned to be improved, added, or fixed in the near future?

<@U035S5A76R0> I've been hacking around 2 and 3 by using the API and managing the timestamps directly in the Airbyte DB. But it's a pain.
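
For point 3, here is a rough sketch of the 'in code' part. It assumes the connection's catalog is a dict with a `streams` list, where each entry has a `stream` (with `name` and `namespace`) and a `config` with a `selected` flag; that layout and the `DECLARED_STREAMS` list are assumptions for illustration, so check them against what your Airbyte version actually returns. It only edits the payload locally and does not avoid the full resync from point 2.

```python
import copy

# Hypothetical declaration kept in version control:
# (namespace, stream name) pairs that should be synced.
DECLARED_STREAMS = {("public", "orders"), ("public", "customers")}


def apply_declared_streams(sync_catalog, declared):
    """Return a copy of the catalog with `selected` true only for declared streams."""
    updated = copy.deepcopy(sync_catalog)
    for entry in updated["streams"]:
        key = (entry["stream"].get("namespace"), entry["stream"]["name"])
        entry["config"]["selected"] = key in declared
    return updated


# Example catalog in the assumed shape (not real data).
example_catalog = {
    "streams": [
        {"stream": {"name": "orders", "namespace": "public"},
         "config": {"selected": False}},
        {"stream": {"name": "audit_log", "namespace": "public"},
         "config": {"selected": True}},
    ]
}

new_catalog = apply_declared_streams(example_catalog, DECLARED_STREAMS)
selected_names = [
    e["stream"]["name"] for e in new_catalog["streams"] if e["config"]["selected"]
]
```

The resulting catalog would then be sent back through whatever connection update call you are already using, and the declaration file can be reviewed like any other code change.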