Nitin Jain
03/15/2022, 12:22 PM
With the INSERT replica strategy, data is being synced but the pipeline is very slow. Following the docs, we changed the replica strategy to COPY by providing the S3 credentials in the Redshift destination. With the COPY replica strategy, CSV files are being written to S3, but only part of the data is inserted into our Redshift DB. In the example below, you can see the pipeline read 39,100 records. I verified that 4 different CSVs were written to S3: one with 16252 records, another with around 22k records, and another with 2k records. But the number of records written to the Redshift DB is around 16301. I have seen that when multiple files are written to S3, only one of the files (randomly chosen) is synced with the DB. I'm using full refresh | append mode for the pipeline. Attaching an image for better understanding.
Nitin Jain
03/15/2022, 12:24 PM
0.35.45-alpha version on k8s
Nitin Jain
03/15/2022, 12:33 PM
Augustin Lafanechere (Airbyte)
03/15/2022, 2:13 PM
Nitin Jain
03/15/2022, 3:58 PM
Augustin Lafanechere (Airbyte)
03/15/2022, 4:00 PM
Aditya Rane
03/15/2022, 9:00 PM
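
For anyone trying to reproduce the count mismatch reported at the top of this thread, the manual check (count the records in each staged CSV, then compare the total with the row count in Redshift) can be sketched roughly as follows. This is only an illustration, not part of Airbyte: it assumes the staged files have already been downloaded from S3 to local disk, that they have no header row unless `has_header=True` is passed, and that the Redshift row count was obtained separately (for example with a `SELECT COUNT(*)`). The names `count_csv_records` and `reconcile` are hypothetical helpers made up for this sketch.

```python
import csv
from pathlib import Path
from typing import Dict, Iterable


def count_csv_records(path: Path, has_header: bool = False) -> int:
    """Count records in one CSV file.

    csv.reader is used instead of counting lines so that quoted fields
    containing newlines are counted as a single record.
    """
    with path.open(newline="", encoding="utf-8") as handle:
        rows = sum(1 for row in csv.reader(handle) if row)  # skip blank lines
    return max(rows - 1, 0) if has_header else rows


def reconcile(
    paths: Iterable[Path], loaded_rows: int, has_header: bool = False
) -> Dict[str, object]:
    """Compare records staged in local CSV copies with rows loaded downstream.

    loaded_rows is an assumption of this sketch: a count obtained
    separately from the destination table.
    """
    per_file = {p.name: count_csv_records(p, has_header) for p in paths}
    staged = sum(per_file.values())
    return {
        "per_file": per_file,
        "staged": staged,
        "loaded": loaded_rows,
        "missing": staged - loaded_rows,
    }
```

If `missing` comes out close to the combined size of all staged files except one, that would be consistent with the behaviour described above, where only one of the files appears to reach the DB.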