<@U04SE163WC8> in OSS version, is there a way to p...
# ask-ai
k
@kapa.ai in OSS version, is there a way to prevent initial sync if data destination host is updated?
u
Based on the provided knowledge sources, there isn't a direct way to prevent an initial sync when the data destination host is updated in the Open Source Software (OSS) version of Airbyte. The first sync is typically a 'full refresh' sync that gets all the data [source](https://discuss.airbyte.io/t/3381).

However, you can manage how Airbyte handles changes in the source schema. You can specify for each connection how Airbyte should handle any change of schema in the source. This process helps ensure accurate and efficient data syncs, minimizing errors and saving you time and effort in managing your data pipelines. To set or edit this setting, navigate to a connection's Settings tab [source](https://docs.airbyte.com/using-airbyte/schema-change-management).

Also, when updating connections, Airbyte recommends using an updated version of Airbyte. All updates (Major, Minor, Patch) are opt-in via the settings page. Minor and Patch Updates are applied immediately and globally to all connectors of that type in your instance. Major Updates require a two-step opt-in process [source](https://docs.airbyte.com/managing-airbyte/connector-updates#airbyte-open-source-oss-and-self-managed-enterprise-sme).

Please note that in an Airbyte Open Source or Self-Managed Enterprise instance, syncs are not automatically paused. This differs from what you would see in Airbyte Cloud. Although syncs will not be paused if you miss the deadline for a major update, it is recommended to update promptly to avoid potential compatibility issues [source](https://docs.airbyte.com/managing-airbyte/connector-updates#airbyte-open-source-oss-and-self-managed-enterprise-sme).
k
@kapa.ai when does a full sync happen in Airbyte? Does it happen repeatedly?
u
In Airbyte, a full sync, also known as a full refresh, happens when you initially set up a connection or when you specifically choose the full refresh mode for your data synchronization. During a full refresh sync, Airbyte retrieves all available data requested from the source and writes it in the destination, regardless of whether the data has previously been synced or not. This contrasts with incremental sync modes, which only replicate data that has been inserted or updated since the previous sync [source](https://airbyte.com/tutorials/full-data-synchronization#step-2).

There are two variants of full refresh sync:

1. Full Refresh - Overwrite: In this mode, new syncs will destroy all data in the existing destination table and then pull the new data in. Therefore, data that has been removed from the source after an old sync will be deleted in the destination table [source](https://docs.airbyte.com/using-airbyte/core-concepts/sync-modes/full-refresh-overwrite).
2. Full Refresh - Append: In this mode, new syncs will take all data from the sync and append it to the destination table. Therefore, if syncing similar information multiple times, every sync will create duplicates of already existing data [source](https://docs.airbyte.com/using-airbyte/core-concepts/sync-modes/full-refresh-append).

The full refresh sync can happen repeatedly based on the sync schedule you set. However, because every full refresh sync operation copies the entire source data set, this approach may be infeasible for large amounts of data, and other replication modes (such as incremental sync) are likely to be more appropriate [source](https://airbyte.com/tutorials/full-data-synchronization#step-2).
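The difference between the two full refresh variants can be illustrated with a small in-memory sketch. This is a toy model only, not Airbyte code: the function names and the list-of-dicts "tables" are hypothetical and exist purely to show what each mode does to the destination across two syncs.

```python
# Toy model of the two full refresh variants; hypothetical names, not Airbyte's API.

def full_refresh_overwrite(source_rows, destination_rows):
    """Discard whatever is in the destination and write the current source snapshot."""
    return list(source_rows)


def full_refresh_append(source_rows, destination_rows):
    """Keep the existing destination rows and add the full source snapshot again."""
    return list(destination_rows) + list(source_rows)


source = [{"id": 1}, {"id": 2}]
overwrite_dest = []
append_dest = []

# First sync: both modes end up with the same two rows.
overwrite_dest = full_refresh_overwrite(source, overwrite_dest)
append_dest = full_refresh_append(source, append_dest)

# Row 2 is deleted at the source and row 3 is added, then a second sync runs.
source = [{"id": 1}, {"id": 3}]
overwrite_dest = full_refresh_overwrite(source, overwrite_dest)
append_dest = full_refresh_append(source, append_dest)

overwrite_ids = [row["id"] for row in overwrite_dest]
append_ids = [row["id"] for row in append_dest]

print(overwrite_ids)  # [1, 3]: the deleted row 2 is gone
print(append_ids)     # [1, 2, 1, 3]: row 1 is duplicated, row 2 lingers
```

In this sketch, Overwrite tracks deletions at the source, while Append grows by a full snapshot on every scheduled run.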
k
@kapa.ai what happens after initial sync?
u
After the initial sync, which is a full refresh sync, Airbyte typically performs incremental syncs for subsequent runs. This means that only new or updated data is read and synced, rather than the entire data set. This is more efficient and less resource-intensive than performing a full refresh sync every time. For MySQL and Postgres sources in CDC mode, the incremental syncs read from the binlogs or write-ahead logs (WAL) respectively.

However, occasionally, you may see full refresh syncs after the initial run. This can happen if the binlogs or WALs needed for the incremental sync have been removed by MySQL or Postgres. For example, when a large volume of database updates produces more WAL files than are allowed in the `pg_wal` directory, Postgres will purge or archive the WAL files. To prevent this, you can sync the data source more frequently or set a higher `binlog_expire_logs_seconds` (MySQL) or `wal_keep_size` (Postgres) [source](https://docs.airbyte.com/integrations/sources/mysql/mysql-troubleshooting#under-cdc-incremental-mode-there-are-still-full-refresh-syncs) [source](https://docs.airbyte.com/integrations/sources/postgres/postgres-troubleshooting#under-cdc-incremental-mode-there-are-still-full-refresh-syncs).

In the case of incremental deduped + history synchronization, the SQL used in each sync after the initial synchronization implements the same logic as the SQL used in all previous syncs that followed the initial synchronization [source](https://airbyte.com/tutorials/incremental-data-synchronization#step-7). You can check the status of your syncs and view logs for each sync in the Job History tab of your connection overview in Airbyte [source](https://docs.airbyte.com/using-airbyte/getting-started/set-up-a-connection#connection-overview).
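The reason a purged binlog or WAL leads to another full refresh can be sketched as a simple decision rule. This is a toy model under stated assumptions, not Airbyte's implementation: log positions are plain integers, `resume_position` is a hypothetical name for the first log position the connector has not yet synced, and `choose_sync_mode` is an illustrative helper.

```python
# Toy model of the fallback described above; hypothetical names, not Airbyte's API.

def choose_sync_mode(resume_position, oldest_retained_position):
    """Decide how the next sync has to run.

    resume_position: first log position not yet synced, or None before the initial sync.
    oldest_retained_position: oldest position the database still keeps in its log.
    """
    if resume_position is None:
        return "full_refresh"  # nothing synced yet, so this is the initial sync
    if resume_position < oldest_retained_position:
        return "full_refresh"  # the needed log entries were purged
    return "incremental"       # the log still covers everything not yet synced


# The database currently retains log positions 5 through 8.
retained_log = [{"position": p, "change": f"update-{p}"} for p in range(5, 9)]
oldest = retained_log[0]["position"]

modes = {
    "initial": choose_sync_mode(None, oldest),
    "caught_up": choose_sync_mode(6, oldest),
    "fell_behind": choose_sync_mode(3, oldest),
}

# An incremental sync resuming at position 6 only reads the entries from 6 onward.
pending = [entry["change"] for entry in retained_log if entry["position"] >= 6]

print(modes)
print(pending)
```

In this model, syncing more often keeps `resume_position` close to the head of the log, and a longer retention setting keeps `oldest_retained_position` low; either way the "fell behind" branch is hit less often.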