# feedback-and-requests
v
Hello, we are evaluating Airbyte and I have a few questions before I start running a full-fledged POC. Thank you in advance for taking the time to share your thoughts and knowledge:
• The big use case for us is CDC from MySQL and Postgres. I see that Airbyte is using Debezium 1.4.2, and it seems like the upgrade to the latest version is not on the radar and is potentially a big lift. For context, we have multiple databases, with more on the way, and process data in the range of 500 million rows every day.
  ◦ I would love some feedback on what the community's experience has been with the CDC sources in general.
  ◦ The majority of issues I have seen reported with Debezium that affect versions >= 1.4.2 and are likely to impact us are about not being able to parse the logs when they contain DDL statements. Have people run into similar issues?
• I see that the PR to use a configurable backend for secrets has been merged. Is the functionality available in the open-source version?
• I have not been able to confirm this, but it seems that with CDC the first connection to the source will do a full snapshot of the data. This is a big no-no for us on the production DB, which is huge. Can the source connector be configured to read only from a given position in the transaction log (binlog/LSN)?
u
Hi @Vikram Bhamidipati, I'll let the community share its feedback on CDC sources. Could you point me to the PR you're mentioning? If it's been merged, it should be available in our open-source version. I confirm that the first connection will perform a full refresh. To avoid load problems on your production DB, I'd suggest configuring your Airbyte connection to target a DB replica.
u
GM @[DEPRECATED] Augustin Lafanechere, I do not have the PR handy, but the GitHub issue for the secrets backend is here: https://github.com/airbytehq/airbyte/issues/5921, and the issue has been closed as done.
n
As for the first connection doing a full refresh: that is a challenge for us because on a managed service like AWS RDS, only the writer produces binlogs. Also, I assume the switch back may not work since we do not have GTID enabled.
And what I should have started by saying: thank you for taking the time to answer the questions.
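For anyone checking the same binlog and GTID prerequisites on their own MySQL source, here is a minimal sketch. It assumes you have already collected the server variables (for example with `SHOW VARIABLES WHERE Variable_name IN ('log_bin', 'binlog_format', 'binlog_row_image', 'gtid_mode')`) into a plain dict. The function name `cdc_readiness` and the result layout are hypothetical and are not part of Airbyte or Debezium. The required values reflect the binlog settings Debezium's MySQL connector expects. The GTID check is only a warning, since GTIDs matter mainly when switching between servers.

```python
from typing import Dict, List

# Binlog settings that Debezium's MySQL connector expects for CDC.
REQUIRED = {
    "log_bin": "ON",
    "binlog_format": "ROW",
    "binlog_row_image": "FULL",
}


def cdc_readiness(variables: Dict[str, str]) -> Dict[str, List[str]]:
    """Hypothetical helper: compare MySQL server variables with CDC prerequisites.

    `variables` maps variable names to values, as returned by SHOW VARIABLES.
    Returns {"errors": [...], "warnings": [...]}; empty lists mean no findings.
    """
    errors: List[str] = []
    warnings: List[str] = []

    for name, expected in REQUIRED.items():
        actual = variables.get(name, "<unset>").upper()
        if actual != expected:
            errors.append(f"{name} is {actual}, expected {expected}")

    # Without GTIDs, a binlog position recorded on one server does not
    # carry over to another, so moving the connection between servers is risky.
    if variables.get("gtid_mode", "OFF").upper() != "ON":
        warnings.append("gtid_mode is not ON; switching servers may lose the position")

    return {"errors": errors, "warnings": warnings}
```

Run against a server with row-based binlogs but no GTIDs, this returns no errors and a single GTID warning, which matches the situation described above.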
u
OK, so I confirm that secrets storage is not in our open-source version yet.
u
aah - Thanks @[DEPRECATED] Augustin Lafanechere. That was my impression as well. I would like to double-click on the "yet" part of your response. Is there a plan to bring it to the open-source version and, if yes, when?