Is the normalisation routine performed on the entire raw dat Airbyte #feedback-and-requests

Eugene Krall

01/25/2022, 9:55 AM

Is the normalisation routine performed on the entire "raw data" table when a new sync happens? I am using my own dbt transformations with MongoDB as a source, and I'm wondering what happens if I remove all the tables except the "raw data" table while changing the schema directly in dbt. Will the final table be fully rebuilt from the raw data table on a subsequent sync?

Eugene Krall

01/25/2022, 11:11 AM

Hey Eugene, yes, when the normalisation runs again, the final table gets replicated from the raw data.

Eugene Krall

01/25/2022, 1:14 PM

Can I ask where the current cursor value is stored in the case of an incremental sync?

Eugene Krall

01/25/2022, 2:07 PM

@Eugene Krall you can check the following table in our Airbyte DB:

docker exec -it airbyte-db bash
psql -U docker
\c airbyte
SELECT * FROM state;

Eugene Krall

01/25/2022, 2:13 PM

thanks!

Eugene Krall

01/25/2022, 2:30 PM

At which point does this cursor refresh? I had a couple of failed syncs, but they failed at the normalization stage, so the raw data was already there. After a couple of attempts I noticed that there were duplicate records in my raw data. I suspect the cause may be that the cursor updates only after the entire sync has succeeded.

Eugene Krall

01/25/2022, 3:29 PM

It depends on the source connector you're using; checkpointing can be done in different ways. You can read more about this here. Feel free to share which connector you're using and I'll have a look at its code.
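
To illustrate the hypothesis raised in the question, here is a minimal Python sketch of how the moment the cursor is saved could affect duplicates in an append-only raw table. It is not taken from any Airbyte connector: `run_sync`, `simulate`, the `checkpoint_early` flag, and the in-memory `state` dict are all hypothetical names used only for illustration, and real connectors may checkpoint differently.

```python
# Hypothetical simulation: not Airbyte code, just a model of the idea.
SOURCE = [
    {"id": 1, "updated_at": 1},
    {"id": 2, "updated_at": 2},
    {"id": 3, "updated_at": 3},
]


def run_sync(raw_table, state, fail_in_normalization, checkpoint_early):
    """Run one simulated incremental sync attempt; return True on success."""
    cursor = state.get("cursor", 0)
    # Read only records newer than the saved cursor.
    new_records = [r for r in SOURCE if r["updated_at"] > cursor]
    # Raw data is appended before normalization runs.
    raw_table.extend(new_records)
    new_cursor = max((r["updated_at"] for r in new_records), default=cursor)

    if checkpoint_early:
        # Assumption: state is saved as soon as the raw data is written.
        state["cursor"] = new_cursor

    if fail_in_normalization:
        # The attempt fails after the raw data has already landed.
        return False

    # Assumption: otherwise state is saved only after the whole sync succeeds.
    state["cursor"] = new_cursor
    return True


def simulate(checkpoint_early):
    """One failed attempt followed by a successful retry; return raw row count."""
    raw_table, state = [], {}
    run_sync(raw_table, state, fail_in_normalization=True,
             checkpoint_early=checkpoint_early)
    run_sync(raw_table, state, fail_in_normalization=False,
             checkpoint_early=checkpoint_early)
    return len(raw_table)


if __name__ == "__main__":
    print("cursor saved only on full success:", simulate(False), "raw rows")
    print("cursor saved after raw write:", simulate(True), "raw rows")
```

In this model, saving the cursor only after the whole sync succeeds makes the retry re-read the same three records, so the raw table ends up with six rows; saving it right after the raw write leaves three. Whether a given connector behaves like either case depends on its own checkpointing logic.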
