do you have the minion config, so we can check what the push segment URI is?
Nick Bowles
02/26/2021, 9:05 PM
Sure let me post after this meeting. It’s the same minion config that was working earlier, but I did change some transformConfig for the table
I’m also using 100 minions 😛
Xiang Fu
02/26/2021, 9:19 PM
it could also be that too many parallel pushes occupied the controller threads, which caused the timeout
We observed this issue while bootstrapping a very large data set
one thing you can optimize is to set the configs to store segments in a deep store like S3 or GCS, then do a URI push
default is tar push, which is reliable but costly 🙂
Nick Bowles
02/26/2021, 9:21 PM
I’ve got it set to URI now. I believe I tried METADATA and TAR before, but it complained. Will try again, thanks for the help.
Xiang Fu
02/26/2021, 9:22 PM
got it, then the only thing I can think of is to increase the default retry
also, you are seeing some segments being added, right?
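To illustrate the retry idea, here is a minimal Python sketch of retrying a push with exponential backoff and jitter. It is only an illustration under assumptions: `push_with_retry`, `attempts`, `base_delay_s`, and `max_delay_s` are hypothetical names, not Pinot settings, and `push` stands in for whatever call actually sends the segment to the controller.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def push_with_retry(
    push: Callable[[], T],
    attempts: int = 5,
    base_delay_s: float = 1.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call push() until it succeeds or the attempts run out.

    Only TimeoutError is retried; any other exception propagates at once.
    All names here are hypothetical and do not mirror Pinot's own config.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return push()
        except TimeoutError:
            if attempt == attempts:
                raise  # out of attempts, surface the timeout to the caller
            # Exponential backoff capped at max_delay_s, with full jitter so
            # many workers do not all retry at the same moment.
            delay = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```

The jitter matters most when many workers time out together, since retrying in lockstep would hit the controller with the same burst again.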
Nick Bowles
02/26/2021, 9:23 PM
Correct, it was working before, and a few segments would be “BAD”, but after reloading they were fine, so it sounds like it could be the threads issue you mentioned.
But then I scaled the minions way up, from about 20 to 100, to see how quickly I can ingest the data
Xiang Fu
02/26/2021, 9:24 PM
😛
got it
at the table level, the IdealState update is a single point of processing
so all threads will be working on the same znode
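A rough way to picture why extra workers do not speed this up is the toy model below. It assumes the shared node is updated with a versioned compare-and-set, where every pending worker reads the same version and only the first write per round succeeds. This is a simplified assumption for illustration, not Pinot's or Helix's actual code, and `simulate_contended_updates` is a hypothetical name.

```python
from typing import Tuple


def simulate_contended_updates(num_workers: int) -> Tuple[int, int]:
    """Model num_workers each adding one entry to a single shared node.

    Returns (rounds, failed_attempts). In each round every pending worker
    reads the current version, then tries to write; only a write whose
    version still matches is accepted, the rest must retry next round.
    """
    version = 0
    pending = list(range(num_workers))
    rounds = 0
    failed_attempts = 0
    while pending:
        rounds += 1
        # All pending workers read the same version at the start of the round.
        seen = {worker: version for worker in pending}
        still_pending = []
        for worker in pending:
            if seen[worker] == version:
                version += 1  # write accepted, version moves on
            else:
                failed_attempts += 1  # stale version, retry next round
                still_pending.append(worker)
        pending = still_pending
    return rounds, failed_attempts


for workers in (20, 100):
    rounds, failed = simulate_contended_updates(workers)
    print(f"{workers} workers: {rounds} rounds, {failed} failed attempts")
```

In this model the number of rounds equals the number of workers, so the updates are still applied one at a time, while the wasted attempts grow roughly with the square of the worker count.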
Nick Bowles
02/26/2021, 9:27 PM
Should I try scaling ZooKeeper to speed up how fast the entries are processed? I gave it more heap as a precaution.
It doesn’t look like it’s using many resources
Xiang Fu
02/26/2021, 9:28 PM
you can give it more CPU and memory and see if that helps the speed