# ask-for-help
Also, I have another question: when we use `async_run` instead of `run` in the previous BentoML version, it seems to process only one request at a time. Is that right?
Hi Sangeon, what does "processes the single request at a single time" mean here? Do you mean there's no batching?
@sauyon could you help with the `max_latency` setting? Thanks!
`max_latency_ms` being under `batching` is misleading, as it now affects all workloads (including unbatched ones); we're looking to fix that in 1.1.
Yes, we are not using adaptive batching at the moment, and if there are multiple requests at the same time, the BentoML container returns the error message above.
I believe you can just raise `runners.batching.max_latency_ms`, and that should solve the problem!
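For reference, a minimal sketch of where that option lives in a BentoML `configuration.yaml` (the filename and the `10000` value here are illustrative examples, not recommendations from this thread):

```yaml
# configuration.yaml -- point BentoML at it with:
#   BENTOML_CONFIG=./configuration.yaml bentoml serve ...
runners:
  batching:
    enabled: true
    # Maximum time (in ms) a request may wait in the queue before the
    # server gives up on it; raising this avoids rejecting requests
    # that pile up when several arrive at the same time.
    max_latency_ms: 10000
```

This applies the setting to all runners; BentoML also allows overriding it per runner under a `runners.<runner_name>` block if only one model needs the larger budget.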