# contributing-to-airbyte
u
Is there a reason why the 1k batch size was chosen for JDBC databases? Could we increase this value to 10k or 50k, and what are the pros/cons? Or could we make this value configurable? @Hudson changed the default value to 50k and was able to transfer 18M rows using MySQL in 10 min. Using the default 1k batch size, the sync took 50 min. The GitHub issue to discuss it: https://github.com/airbytehq/airbyte/issues/4314
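For context, a minimal sketch of what this batch size controls in a plain JDBC read, assuming a generic MySQL connection rather than Airbyte's actual connector code; the URL, credentials, and table name are placeholders. Driver behavior also varies: MySQL's Connector/J, for example, only honors the fetch size hint when `useCursorFetch=true` is set.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class FetchSizeSketch {
    // Hypothetical constant standing in for the batch size under discussion.
    private static final int FETCH_SIZE = 1_000;

    public static void main(String[] args) throws SQLException {
        // useCursorFetch=true so MySQL actually fetches FETCH_SIZE rows per round trip.
        String url = "jdbc:mysql://localhost:3306/demo?useCursorFetch=true";
        try (Connection conn = DriverManager.getConnection(url, "user", "pass");
             Statement stmt = conn.createStatement()) {
            // Hint to the driver: buffer this many rows per round trip.
            // Larger values mean fewer round trips (faster syncs) but more memory held at once.
            stmt.setFetchSize(FETCH_SIZE);
            try (ResultSet rs = stmt.executeQuery("SELECT * FROM big_table")) {
                while (rs.next()) {
                    // process one row; the driver refills its buffer every FETCH_SIZE rows
                }
            }
        }
    }
}
```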
u
There are memory implications of using larger batches. It probably needs to be configurable.
u
Some people have really large rows (multiple MB at least)
u
We should definitely make it possible to use much larger batches when we can.
u
I think we also have an issue for making the batches based on size instead of rows, which might help with this
u
> I think we also have an issue for making the batches based on size instead of rows, which might help with this

If I'm not wrong, @Davin Chia (Airbyte) already made something related to this (batch 1k rows or 2 GB of memory...). Is that correct?
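For reference, a minimal sketch of what batching by size rather than row count could look like, flushing on whichever limit is hit first. The class name, row cap, and byte budget below are hypothetical illustrations, not Airbyte's actual implementation:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Size-based batching sketch: flush when either a row-count cap or an
// approximate byte budget is reached, whichever comes first.
public class SizeBasedBatcher {
    private static final int MAX_ROWS = 10_000;               // hypothetical row cap
    private static final long MAX_BYTES = 25L * 1024 * 1024;  // hypothetical 25 MB budget

    private final List<String> buffer = new ArrayList<>();
    private long bufferedBytes = 0;

    public void add(String serializedRow) {
        buffer.add(serializedRow);
        bufferedBytes += serializedRow.getBytes(StandardCharsets.UTF_8).length;
        if (buffer.size() >= MAX_ROWS || bufferedBytes >= MAX_BYTES) {
            flush();
        }
    }

    private void flush() {
        // hand the buffered rows off here, then reset
        buffer.clear();
        bufferedBytes = 0;
    }
}
```

Capping on bytes as well as rows keeps memory bounded even when individual rows are multiple MB, which is the concern raised above.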
u
this should be the related issue: https://github.com/airbytehq/airbyte/issues/3439
u
I didn't end up making any changes to how we read things since it's tricky
u
I think going up to 2k or maybe even 5k is probably fine since we are being conservative now, but anything beyond that we should do some testing
u
maybe a good short-term solution is to make this user-configurable?
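As a sketch of that short-term option, a connector could read an optional batch-size field from its config and fall back to the current default. The `fetch_size` property name below is hypothetical, not an existing Airbyte option:

```java
import com.fasterxml.jackson.databind.JsonNode;

// Sketch: read an optional "fetch_size" field from the connector config,
// falling back to the current 1k default when it is absent.
public final class FetchSizeConfig {
    private static final int DEFAULT_FETCH_SIZE = 1_000;

    public static int fetchSizeFrom(JsonNode config) {
        return config.hasNonNull("fetch_size")
                ? config.get("fetch_size").asInt(DEFAULT_FETCH_SIZE)
                : DEFAULT_FETCH_SIZE;
    }
}
```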