Davin Chia (Airbyte)

05/17/2021, 3:54 AM
@charles you recently got rid of the on-disk queue, right? so in theory we should no longer have disk space issues?

user

05/17/2021, 3:54 AM
Yup.

user

05/17/2021, 3:55 AM
however, we should see more memory usage since the buffer is now in memory?

user

05/17/2021, 3:56 AM
No. Because of back pressure from the destination, the destination's memory usage should be the same.
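
To illustrate the back pressure point, here is a minimal sketch, not Airbyte's actual code. It assumes a bounded in-memory queue between a source thread and a destination thread; `run_pipeline` and `max_buffered` are hypothetical names used only for this example. When the queue is full, the source blocks until the destination catches up, so the number of buffered records never exceeds the bound.

```python
import queue
import threading
from typing import Tuple


def run_pipeline(num_records: int, max_buffered: int) -> Tuple[int, int]:
    """Push records through a bounded queue; return (written, peak_buffered)."""
    buf: queue.Queue = queue.Queue(maxsize=max_buffered)
    sentinel = object()  # marks the end of the stream
    stats = {"written": 0, "peak": 0}

    def source() -> None:
        for i in range(num_records):
            # put() blocks while the queue is full: this is the back pressure.
            buf.put({"id": i})
            stats["peak"] = max(stats["peak"], buf.qsize())
        buf.put(sentinel)

    def destination() -> None:
        while True:
            record = buf.get()
            if record is sentinel:
                break
            stats["written"] += 1  # stand-in for writing to the destination

    threads = [threading.Thread(target=source), threading.Thread(target=destination)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats["written"], stats["peak"]


written, peak = run_pipeline(num_records=1000, max_buffered=10)
```

With this shape, a slow destination slows the source down instead of growing the buffer, which is why removing an on-disk queue would not by itself raise memory usage.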

user

05/17/2021, 3:57 AM
thanks!

user

05/17/2021, 4:22 AM
reading the code now - is it more accurate to say the destination memory requirement increased by batch size * average record size with this change? (since we are keeping up to a batch of record messages in the in-memory buffer, and all of this was previously in the on-disk queue)

user

05/17/2021, 5:31 AM
Nope. It was always the case that a full batch got pulled into memory in the previous implementation.

user

05/17/2021, 5:32 AM
cool!

user

05/17/2021, 6:21 AM
ah yes because we eventually pull everything into memory before writing the messages to the destination
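
The batch-size reasoning in this exchange can be sketched as follows. This is a hypothetical illustration, not the destination code itself: `BatchBuffer`, `accept`, and `drain` are made-up names, and records are modeled as raw bytes. It shows that peak buffered memory is roughly batch size * average record size, regardless of where records were staged before the batch was assembled.

```python
from typing import Callable, List


class BatchBuffer:
    """Collects records in memory and flushes them one full batch at a time."""

    def __init__(self, batch_size: int, flush: Callable[[List[bytes]], None]):
        self.batch_size = batch_size
        self.flush = flush
        self.records: List[bytes] = []
        self.buffered_bytes = 0
        self.peak_bytes = 0  # largest amount of record data held at once

    def accept(self, record: bytes) -> None:
        self.records.append(record)
        self.buffered_bytes += len(record)
        self.peak_bytes = max(self.peak_bytes, self.buffered_bytes)
        if len(self.records) >= self.batch_size:
            self.drain()

    def drain(self) -> None:
        # Write whatever is buffered, then start a fresh batch.
        if self.records:
            self.flush(self.records)
        self.records = []
        self.buffered_bytes = 0


flushed_sizes: List[int] = []
buf = BatchBuffer(batch_size=4, flush=lambda batch: flushed_sizes.append(len(batch)))
for _ in range(10):
    buf.accept(b"x" * 100)  # ten records of 100 bytes each
buf.drain()  # flush the final partial batch
```

Here the peak is 4 records * 100 bytes, since a full batch is always held in memory right before it is written.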

user

05/17/2021, 7:42 AM
is it in 0.22.3?

user

05/17/2021, 7:48 AM
this would be on the destination side, and would affect destinations not using COPY strategies to write data

user

05/17/2021, 7:49 AM
this PR is where the destinations received this update (as well as a checkpointing feature)

user

05/18/2021, 1:58 PM
I reset 2 PGSQL->BQ connections (all connectors are on the latest version) and forced syncs. Disk usage still drops only when a sync is about to finish. In the pic, it dropped when 1 of the 2 finished.