Tiri Georgiou
09/26/2022, 7:33 AM
salesforce connector. We notice that every time we sync the Case Salesforce object, there is a significant number of duplicates (i.e. per case_id we could have up to 13 duplicates) in the source table (i.e. _airbyte_raw_case). It looks like the normalization stage takes care of the deduplication in the transform stage; however, the duplication of the raw data is causing significant overhead when loading into a DWH. I was thinking of raising this as an issue on GH but want to first make sure:
1. Is this not expected behaviour?
2. Could this be a quick fix, and if so, does anybody know where in the codebase this issue might be originating from?
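For anyone wanting to quantify the duplication described above before filing an issue, here is a minimal sketch. It assumes the raw table stores each record as a JSON string in an `_airbyte_data` column and that the case identifier lives under the key `Id` (the thread calls it case_id; adjust `key` to whatever your data uses). The helper name `duplicate_counts`, the in-memory SQLite stand-in for the warehouse, and the sample rows are all made up for illustration.

```python
import json
import sqlite3
from collections import Counter


def duplicate_counts(conn, table="_airbyte_raw_case", key="Id"):
    """Return {key value: row count} for values seen more than once.

    Hypothetical helper: assumes `table` has a JSON text column
    named `_airbyte_data` and that `key` exists in every record.
    """
    counts = Counter()
    for (payload,) in conn.execute(f'SELECT _airbyte_data FROM "{table}"'):
        counts[json.loads(payload)[key]] += 1
    return {k: n for k, n in counts.items() if n > 1}


# Tiny in-memory stand-in for the raw table (made-up rows, not real data).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "_airbyte_raw_case" '
    "(_airbyte_ab_id TEXT, _airbyte_data TEXT, _airbyte_emitted_at TEXT)"
)
rows = [
    ("a1", {"Id": "500A", "Status": "New"}, "2022-09-26T07:00:00Z"),
    ("a2", {"Id": "500A", "Status": "New"}, "2022-09-26T07:00:01Z"),
    ("a3", {"Id": "500B", "Status": "Closed"}, "2022-09-26T07:00:02Z"),
]
conn.executemany(
    'INSERT INTO "_airbyte_raw_case" VALUES (?, ?, ?)',
    [(ab_id, json.dumps(data), ts) for ab_id, data, ts in rows],
)

dupes = duplicate_counts(conn)
print(dupes)  # {'500A': 2}
```

Against a real warehouse the same grouping would more likely be done in SQL with the warehouse's own JSON functions; parsing in Python here just keeps the sketch portable.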
Thanks
user
09/26/2022, 2:45 PM
user
09/26/2022, 9:11 PM
user
09/26/2022, 9:19 PM
Benoit Fayolle
09/27/2022, 10:40 AM
user
09/27/2022, 2:24 PM
Benoit Fayolle
09/28/2022, 7:43 AM
Full refresh run on just the Case object. It's still running, currently at Records read: 3885000 (75 GB), and I just checked: our Case table in Salesforce has 1058326 records.
user
09/28/2022, 9:46 PM
Benoit Fayolle
09/29/2022, 8:11 AM
Benoit Fayolle
09/29/2022, 8:13 AM
Benoit Fayolle
09/29/2022, 8:13 AM
Benoit Fayolle
09/29/2022, 3:05 PM
Benoit Fayolle
09/29/2022, 3:06 PM
Sunny Hashmi (Airbyte)
09/29/2022, 7:09 PM
Sunny Hashmi (Airbyte)
09/29/2022, 7:13 PM
Case object?
Sunny Hashmi (Airbyte)
09/29/2022, 7:22 PM
Benoit Fayolle
09/30/2022, 10:21 AM
Case object. Opportunity, Account and OpportunityLineItem run fine
Benoit Fayolle
09/30/2022, 10:22 AM
Sophie Lohezic
10/03/2022, 10:10 AM
user
10/03/2022, 6:13 PM
Matt Hunt
09/05/2023, 10:33 PM
Thiago
11/10/2024, 3:54 AM
Thiago
11/10/2024, 4:12 AM