Clovis Masson
11/25/2021, 10:58 AM
node A : 1 worker, node B : 1 worker and node C : 2 workers). When syncing, if my source-worker and destination-worker are correctly distributed, I’m able to process about 25M rows in one hour (with a well-distributed CPU load). However, if by any chance source-worker and destination-worker are both started on node C, then that node’s CPU goes up to 190% (against 10% and 10% for the two others) and processing is much slower: I’m only able to process about 15M rows within an hour.
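To put the two figures side by side, here is a minimal sketch that only does the arithmetic on the numbers quoted above. The function and constant names are hypothetical, chosen for illustration, and the calculation assumes throughput stays constant over the whole sync.

```python
# Figures quoted in the message above (rows processed per hour).
ROWS_PER_HOUR_DISTRIBUTED = 25_000_000  # source-worker and destination-worker on different nodes
ROWS_PER_HOUR_COLOCATED = 15_000_000    # both workers started on node C


def relative_slowdown(baseline: float, degraded: float) -> float:
    """Fraction of throughput lost relative to the baseline."""
    return (baseline - degraded) / baseline


def minutes_for(rows: int, rows_per_hour: float) -> float:
    """Minutes needed to process `rows` at a constant hourly rate (an assumption)."""
    return rows / rows_per_hour * 60


slowdown = relative_slowdown(ROWS_PER_HOUR_DISTRIBUTED, ROWS_PER_HOUR_COLOCATED)
colocated_minutes = minutes_for(25_000_000, ROWS_PER_HOUR_COLOCATED)

print(f"Throughput lost when co-located: {slowdown:.0%}")
print(f"Minutes to sync 25M rows when co-located: {colocated_minutes:.0f}")
```

In other words, the co-located case loses roughly 40% of the throughput, so the same 25M rows would take about 100 minutes instead of 60.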
I'm not sure this counts as an actual request, since I don't know whether a strategy already exists to avoid this situation, but is there a way to force the workers to be scheduled on different nodes to maximize performance?
Andik Achmad
11/29/2021, 8:20 AM
user
11/29/2021, 8:21 AM
user
11/29/2021, 8:21 AM
user
11/29/2021, 9:32 AM