Shibin Surendranath
04/22/2021, 2:15 PM
_airbyte_raw_<object> table as a JSON object and then it is duplicated in _scd and actual object tables. Overall, the data is duplicated thrice in our datastore. How can we reduce data duplication?
• How can we monitor if the data is getting synced properly? Is there any way from the logs to determine if the sync failed for a specific workspace (like having some workspace identifier in logs)?
• What is the difference between the free and paid plans? What additional features do we get apart from support?
Chris (deprecated profile)
"How can we archive the unwanted data from those tables?"
What is the concern here? Is it that the extra space drives up cost? Could you share more details? For archiving, you could, for example, choose not to keep the _scd tables, which reduces the duplication to two tables (raw and final). Would that be OK?
Chris (deprecated profile)
"Is there any way from the logs to determine if the sync failed for a specific workspace (like having some workspace identifier in logs)?"
Workspaces do have IDs. Is that what you are referring to, as described in the following docs? https://docs.airbyte.io/tutorials/exploring-workspace-folder#identifying-workspace-ids
Deep
04/22/2021, 3:34 PM
Chris (deprecated profile)
Deep
04/22/2021, 3:36 PM
Deep
04/22/2021, 3:37 PM
Chris (deprecated profile)
append_dedup because it appends to the raw table but then deduplicates the final table.
We would need to iterate and implement the dedup sync mode (or upsert_dedup), which would keep only minimal data in the destination (even drop raw tables?) to minimize storage.
Deep
04/22/2021, 3:52 PM
Chris (deprecated profile)
Deep
04/22/2021, 3:54 PM
Chris (deprecated profile)
Chris (deprecated profile)
"So in the current circumstances there is no way to either not duplicate raw and _scd table or do some archiving of these tables after some time"
You could set up your own transformations, run once in a while, that clean up rows from the raw table that do not exist in the deduplicated final table. It would also be safe to drop the SCD table entirely if you don't need it.
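A rough sketch of the kind of periodic cleanup described above, written in Python against SQLite only so that it is self-contained. Everything specific here is an assumption and not confirmed in this thread: the helper name `cleanup_stream`, the table naming (`_airbyte_raw_<stream>`, `<stream>_scd`, `<stream>`), and in particular the use of `_airbyte_ab_id` as a column shared by the raw and final tables. Check the actual columns in your destination and try it on a copy first.

```python
import sqlite3


def cleanup_stream(conn: sqlite3.Connection, stream: str,
                   key: str = "_airbyte_ab_id") -> int:
    """Delete raw rows that have no match in the final table, then drop
    the SCD table. Returns the number of raw rows deleted.

    Assumption: `key` exists in both the raw and the final table.
    """
    # Table and column names cannot be bound as SQL parameters, so
    # restrict them to plain identifiers before interpolating.
    if not (stream.isidentifier() and key.isidentifier()):
        raise ValueError("stream and key must be plain identifiers")

    raw_table = f"_airbyte_raw_{stream}"
    scd_table = f"{stream}_scd"

    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            f'DELETE FROM "{raw_table}" '
            f'WHERE "{key}" NOT IN ('
            f'  SELECT "{key}" FROM "{stream}" WHERE "{key}" IS NOT NULL)'
        )
        deleted = cur.rowcount
        # Only drop the SCD table if the history is really not needed.
        conn.execute(f'DROP TABLE IF EXISTS "{scd_table}"')
    return deleted


def build_demo_db() -> sqlite3.Connection:
    """Tiny in-memory stand-in for a destination (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE _airbyte_raw_users (_airbyte_ab_id TEXT, _airbyte_data TEXT);
        CREATE TABLE users_scd (_airbyte_ab_id TEXT, id INTEGER, name TEXT);
        CREATE TABLE users (_airbyte_ab_id TEXT, id INTEGER, name TEXT);
        -- user 1 was synced twice (a1 is the stale version), user 2 once
        INSERT INTO _airbyte_raw_users VALUES
            ('a1', '{"id": 1, "name": "old"}'),
            ('a2', '{"id": 1, "name": "new"}'),
            ('a3', '{"id": 2, "name": "bob"}');
        INSERT INTO users_scd VALUES
            ('a1', 1, 'old'), ('a2', 1, 'new'), ('a3', 2, 'bob');
        INSERT INTO users VALUES ('a2', 1, 'new'), ('a3', 2, 'bob');
        """
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print("raw rows deleted:", cleanup_stream(demo, "users"))
```

On a real warehouse the same two statements could run from whatever scheduler you already use. The `NOT IN` subquery filters out NULL keys on purpose, since a single NULL in the final table would otherwise make the condition match nothing.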
Deep
04/22/2021, 5:05 PM
Deep
04/22/2021, 5:06 PM
Chris (deprecated profile)
Chris (deprecated profile)
Deep
04/22/2021, 5:32 PM
Michel
John (Airbyte)