Hello all, TL;DR - With regard to the MySQL conne...
# ask-community-for-troubleshooting
d
Hello all, TL;DR - With regard to the MySQL connector, if I use CDC to capture incremental changes, will I have an issue with tables that don’t have changes within a certain period of time (e.g. table A doesn’t have a change within 7 days but the binary logs get removed after 7 days). Put another way, does CDC update the binary log reference when I sync even if the table does not have any updates? I’m new to Airbyte and looking to sync MySQL data to Snowflake. I’m actually coming from the Meltano framework so familiar with some concepts of data integration. One thing that I discovered in the underlying code for that framework is that binary log replication will only update the state when the table has a change. That is to say, if I have a table that updates very infrequently, the binary log of the last update to that table may get removed before a binary log has a new transaction. I thought I was keeping things fresh by updating everyday but have found out the hard way that many of my tables can no longer sync because the referenced binary log is now missing. I’m not sure if this is a requirement of binary log replication OR this is based on the underlying singer tap that I was using. So before I truly get started in this Airbyte, I’d like to understand if CDC incremental updates will refresh the referenced log even if the table doesn’t have any changes. Thanks all.
1
u
Hi @Danny Duncan I'm not sure of being able to answer to your question in an accurate manner and I'll ping @Subodh (Airbyte) for help 😄 . I just wanted to share that, if you decide to use full refresh for the tables with infrequent updates they will be synced in a standard way, not in CDC, so it could solve your initial worry in a way.
s
hey guys sorry for the late response, I kind of know the answer but I want to verify something before putting it here. Give me an hour or so and will update the thread.
👍 2
d
Hey @Subodh (Airbyte) thanks for looking into this. Do you have an update?
s
Alright so I carried out an experiment where I created 2 tables and was using Airbyte to sync just 1 table. I carried out the first sync and stored the state generated by Airbyte, next I made some new inserts to the table that was excluded and then again triggered the sync. There were no new records because there was no change in the included table but the binlog position saved by Airbyte after 2nd sync didnt progress as well which means the state generated after 2nd sync was the same as the one generated after first sync. So if you dont update your table which is being synced by airbyte within the binlog expiration duration, then the last saved state would no more be relevant but it will not result in a failure. We use debezium for our CDC implementation and if it realises that the last saved bin log position no longer exists in mysql server, it will trigger a full sync of the tables.
d
Thank you very much for investigating this! It’s helpful to know that if the binlog files go missing, it just triggers a full sync.