Jiyuan Zheng
09/06/2021, 6:27 PM
[DEPRECATED] Marcos Marx
Davin Chia (Airbyte)
Jiyuan Zheng
09/07/2021, 3:03 AM
Jiyuan Zheng
09/07/2021, 5:16 AM
Subodh (Airbyte)
replication slots so I assume that you are using CDC. In that case the batch_size cannot be set from your end.
2. Which phase of the sync is taking most of the time? Is it reading the data from the source, loading it into the destination, or the normalization step?
3. How big do your WAL logs get? Am curious whether we are spending too much time trying to identify where to start from in the WAL logs in case they get too big.
4. The failure that you observed, which part of the sync did you observe it in: the source, the destination, or normalization? Also, were these interim failures, i.e. when you restarted the sync, did they go away on their own? Am trying to understand if it's related to the protocol or the connectors.
Jiyuan Zheng
09/08/2021, 1:51 AM
batch_size will still be used during the initial sync?
2. The main issue is during the initial sync. It is taking a long time and the connection is unstable. I haven't encountered many issues during subsequent syncs yet.
3. The WAL log size will be about 5GB per hour in production. Currently, I am testing on staging, which has about half of that, 2 ~ 2.5GB per hour.
4. Normalization tends to fail quite often, so we decided to turn it off and purely evaluate the CDC feature for now. Airbyte is reprocessing all the data during normalization right now, so we can't use that as is, and we are potentially looking at building our own incremental . If you can
Jiyuan Zheng
09/08/2021, 1:52 AM
Subodh (Airbyte)
Jiyuan Zheng
09/10/2021, 2:19 AM