Tiri Georgiou
09/23/2022, 9:28 AM
For the Case salesforce object there is a significant number of duplicates (i.e. per case_id we could have up to 13 duplicates) in the source table (i.e. _airbyte_raw_case). It looks like the normalization stage takes care of the deduplication in the transform stage; however, the duplication of the raw data is causing significant overhead when loading into a DWH. I was thinking of raising this as an issue on GH but want to first make sure:
1. Is this not the expected behaviour?
2. Could this be a quick fix, and if so, does anybody know where in the codebase this issue might be originating from?
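For anyone wanting to check the same thing, here is a rough sketch of how the duplicates per case could be counted. It assumes each raw row stores the record as a JSON string in an _airbyte_data column and that the case identifier is the Id field inside it; duplicate_counts and sample_rows are hypothetical names and made-up data, not anything from the Airbyte codebase.

```python
import json
from collections import Counter


def duplicate_counts(raw_rows, key="Id"):
    """Map each key value that occurs more than once to its number of raw rows.

    Assumption: every row is a dict whose "_airbyte_data" entry is a JSON
    string containing the field named by `key`.
    """
    counts = Counter(json.loads(row["_airbyte_data"])[key] for row in raw_rows)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical rows shaped like a raw table: one JSON blob per emitted record.
sample_rows = [
    {"_airbyte_ab_id": "a1", "_airbyte_data": json.dumps({"Id": "500A", "Status": "New"})},
    {"_airbyte_ab_id": "a2", "_airbyte_data": json.dumps({"Id": "500A", "Status": "New"})},
    {"_airbyte_ab_id": "a3", "_airbyte_data": json.dumps({"Id": "500B", "Status": "Closed"})},
    {"_airbyte_ab_id": "a4", "_airbyte_data": json.dumps({"Id": "500A", "Status": "Working"})},
]

print(duplicate_counts(sample_rows))  # {'500A': 3}
```

Running this against an export of the raw table before and after a sync should show whether the duplicates arrive with each sync or accumulate across syncs.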
Thanks