when using CDC, do i still need to select `increme...
# ask-community-for-troubleshooting
g
When using CDC, do I still need to select `incremental dedupe`? If it's reading from the logs, won't it just modify the table in the destination accordingly, and thus remove the need for the `scd` table as well as the final table?
@[DEPRECATED] Marcos Marx @John (Airbyte) @Subodh (Airbyte) @s apologies for the cc'ing, but any suggestions on this? If I have an SCD source table, e.g. a user table where an email is changed, wouldn't CDC handle this change, so that `incremental` and `incremental dedupe` are not applicable? I'm not sure I understand how those types of loading apply to CDC. I would like to avoid the Airbyte `scd` table and the dedupe process, and I believe CDC handles this: it reads the logs to see what has changed and performs the same operation.
I.e. would `incremental` or `full refresh - append` be sufficient to prevent the same record from appearing more than once when it has been modified? If we're using CDC, the logs will state how to modify the existing row.
Subsequent syncs will use the logs to determine which changes took place since the last sync and update those. Airbyte keeps track of the current log position between syncs.
A single sync might have some tables configured for Full Refresh replication and others for Incremental. If CDC is configured at the source level, all tables with Incremental selected will use CDC. All Full Refresh tables will replicate using the same process as non-CDC sources. However, these tables will still include CDC metadata columns by default.
"all tables with Incremental selected will use CDC" --> does that mean I do not need to select `incremental + dedupe` if the underlying source table does not have duplicate rows?
u
@gunu Airbyte uses CDC to retrieve soft-deletes and updates to records from the source, but Airbyte still sends all data to the raw tables. The SCD tables are necessary to reproduce the actual state of your table (see the normalization process for how the active row is determined). Maybe I'm wrong... Daniel, have you already run some tests using incremental/incremental dedup with CDC?
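To illustrate the point about the SCD table and the active row, here is a minimal Python sketch of how a history table could be derived from raw change records. It is not Airbyte's normalization code: the function `build_scd` and the column names `updated_at`, `start_at`, `end_at` and `active_row` are hypothetical names chosen for the example.

```python
from collections import defaultdict


def build_scd(raw_rows, primary_key="id", cursor="updated_at"):
    """Turn raw change records into SCD-style rows (hypothetical column names)."""
    # Group every version of a record under its primary key.
    by_key = defaultdict(list)
    for row in raw_rows:
        by_key[row[primary_key]].append(row)

    scd = []
    for versions in by_key.values():
        # Order the versions of one record from oldest to newest.
        versions.sort(key=lambda r: r[cursor])
        for i, row in enumerate(versions):
            is_last = i == len(versions) - 1
            scd.append({
                **row,
                "start_at": row[cursor],
                # A version ends when the next version starts.
                "end_at": None if is_last else versions[i + 1][cursor],
                # Only the newest version is the active row.
                "active_row": 1 if is_last else 0,
            })
    return scd


# Raw records as they might arrive: user 1 changed email once.
raw_rows = [
    {"id": 1, "email": "old@example.com", "updated_at": "2022-01-01T00:00:00Z"},
    {"id": 1, "email": "new@example.com", "updated_at": "2022-02-01T00:00:00Z"},
    {"id": 2, "email": "other@example.com", "updated_at": "2022-01-15T00:00:00Z"},
]
scd_rows = build_scd(raw_rows)
active_rows = [r for r in scd_rows if r["active_row"] == 1]
```

Filtering on `active_row == 1` gives one row per user, which is roughly what a final deduplicated table would contain.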
g
@[DEPRECATED] Marcos Marx thanks so much for that context. I am familiar with the normalization process and the additional context, but I'm missing this one point. "CDC retrieves soft-deletes and updates in records": so does that mean that on `incremental append` (if there is an update on the next sync) it will append a new row to the destination table and not update the record in the destination?
u
If you use `incremental append`, you will probably get both the create and the update states, and you will need to clean the data in the destination yourself to display only the latest version of each record. (BTW, I'll try to simulate this and come back to you.)
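A rough sketch of that manual cleanup, in Python: keep only the newest version of each record and drop records whose newest version is a delete. The function `latest_rows` is hypothetical, and the column names `_ab_cdc_updated_at` and `_ab_cdc_deleted_at` are assumed to be the CDC metadata columns present in the destination; adjust them to whatever your table actually contains.

```python
def latest_rows(appended_rows, primary_key="id",
                updated_col="_ab_cdc_updated_at",
                deleted_col="_ab_cdc_deleted_at"):
    """Return the latest non-deleted version of each record (assumed column names)."""
    latest = {}
    for row in appended_rows:
        key = row[primary_key]
        current = latest.get(key)
        # Keep the row with the newest update timestamp for each key.
        if current is None or row[updated_col] >= current[updated_col]:
            latest[key] = row
    # A record whose newest version carries a delete timestamp is gone at the source.
    return [row for row in latest.values() if row.get(deleted_col) is None]


# Rows as incremental append might leave them: one row per change event.
appended_rows = [
    {"id": 1, "email": "old@example.com",
     "_ab_cdc_updated_at": "2022-01-01T00:00:00Z", "_ab_cdc_deleted_at": None},
    {"id": 1, "email": "new@example.com",
     "_ab_cdc_updated_at": "2022-02-01T00:00:00Z", "_ab_cdc_deleted_at": None},
    {"id": 2, "email": "gone@example.com",
     "_ab_cdc_updated_at": "2022-01-10T00:00:00Z", "_ab_cdc_deleted_at": None},
    {"id": 2, "email": "gone@example.com",
     "_ab_cdc_updated_at": "2022-03-01T00:00:00Z",
     "_ab_cdc_deleted_at": "2022-03-01T00:00:00Z"},
]
current_state = latest_rows(appended_rows)
```

The timestamps here are ISO 8601 strings in the same timezone, so plain string comparison orders them correctly; with other formats, parse them into datetimes first.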