It is possible that those queries wait on the server side long enough that they hit the timeout by the time they start executing.
Buchi Reddy
07/22/2020, 5:18 AM
Exactly my theory. That’s why I was digging into the code in CombinedPlan, where I saw a max no. of threads that we can use per query, and that depends on the no. of processors 🙂
Because we’re running in k8s and giving it less CPU, I suspect the number of threads in the executor could be an issue
is there a metric or something to confirm that?
Mayank
07/22/2020, 5:18 AM
Search for something like schedulerWait
It likely implies you are sending more qps than the provisioned nodes can handle
Number of serving threads is limited to 2x the number of cores iirc
what's the qps you are sending?
Buchi Reddy
07/22/2020, 5:23 AM
it’s less than 2. per min it’s < 60
Our CPU allocation was too low. Though the server doesn’t consume a lot of CPU, some thread pool sizes are initialized/capped based on the no. of processors available.
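For anyone reading this later, here is a minimal sketch of why a low CPU limit shrinks thread pools even when CPU usage is low. It assumes a pool sized at 2x the processor count, which is only the figure recalled from memory above, not something checked against the Pinot source; `PoolSizing` and `workerThreads` are hypothetical names. `Runtime.getRuntime().availableProcessors()` is the standard JDK call, and on container-aware JVMs it typically reflects the container's CPU limit rather than the host's core count.

```java
class PoolSizing {
    // Hypothetical sizing rule: 2x the processor count, never below 1.
    // The "2x" factor is an assumption taken from the chat, not from Pinot code.
    static int workerThreads(int availableProcessors) {
        return Math.max(1, 2 * availableProcessors);
    }

    public static void main(String[] args) {
        // What the JVM believes it has; in k8s this usually follows the CPU limit.
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("availableProcessors = " + cpus);
        System.out.println("worker threads under the assumed rule = " + workerThreads(cpus));
    }
}
```

Running this inside the pod and comparing it with the node's core count should show whether the pools are being sized from the CPU limit.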
Mayank
07/22/2020, 12:26 PM
Yeah, in that case, with 10k segments, I can see how the queue might grow.
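A rough back-of-the-envelope sketch of how the queue can grow with 10k segments and few threads. Everything here is an assumption for illustration: the per-segment cost of 1 ms, the 1 qps arrival rate, the thread counts, and the simplification that the server works on one query at a time; `QueueEstimate` and its methods are hypothetical names, not Pinot APIs.

```java
class QueueEstimate {
    // Rough time for one query: segments are processed in waves of `threads`.
    static long queryMillis(int segments, int threads, long perSegmentMillis) {
        long waves = (segments + threads - 1) / threads; // ceiling division
        return waves * perSegmentMillis;
    }

    // True when queries arrive faster than one can finish, so waiting queries pile up.
    static boolean queueGrows(double queriesPerSecond, long queryMillis) {
        return queriesPerSecond * queryMillis > 1000.0;
    }

    public static void main(String[] args) {
        int segments = 10_000;          // from the thread above
        long perSegmentMillis = 1;      // assumed cost per segment
        double qps = 1.0;               // assumed arrival rate
        for (int threads : new int[] {2, 16}) {
            long ms = queryMillis(segments, threads, perSegmentMillis);
            System.out.println(threads + " threads: ~" + ms + " ms per query, queue grows = "
                    + queueGrows(qps, ms));
        }
    }
}
```

Under these assumed numbers, 2 threads need about 5 seconds per query, so even 1 qps backs up and later queries can time out before they start, while 16 threads keep up.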