# ask-for-help
• processing time is the amount of time the runner takes to actually run the runner code on the inputs it contains
• wait time is the amount of time the runner scheduler will wait before dispatching the next batch
• latency is simply the end-to-end latency of the service

I think that statement is wrong; in adaptive batching mode we're mostly optimizing for throughput rather than latency, though arguably the two are linked.

Re: the optimizer's debug logs:
• `o_w` is the average wait time observed for requests that are served (with some decay); this is used to calculate the wait time
• `o_a` and `o_b` are simply the slope and intercept of the regression that we run for processing time
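To make the mechanics concrete, here is a minimal sketch of how an optimizer like this could track those quantities: an exponentially decayed average for `o_w`, and an ordinary least-squares fit of processing time against batch size for `o_a`/`o_b`. This is an illustrative sketch under those assumptions, not the actual optimizer code; the class name, `decay` value, and the assumption that the regression is over batch size are all mine.

```python
# Illustrative sketch (not the real optimizer): tracks a decayed average of
# observed wait times (o_w) and fits processing time vs. batch size with
# ordinary least squares (slope o_a, intercept o_b).

class BatchingOptimizer:
    def __init__(self, decay: float = 0.95):
        self.decay = decay
        self.o_w = 0.0          # decayed average wait time
        self.samples = []       # (batch_size, processing_time) observations

    def observe_wait(self, wait_time: float) -> None:
        # exponential decay: older observations count for less over time
        self.o_w = self.decay * self.o_w + (1 - self.decay) * wait_time

    def observe_batch(self, batch_size: int, processing_time: float) -> None:
        self.samples.append((batch_size, processing_time))

    def fit(self) -> tuple[float, float]:
        # ordinary least squares: processing_time ≈ o_a * batch_size + o_b
        n = len(self.samples)
        sx = sum(b for b, _ in self.samples)
        sy = sum(t for _, t in self.samples)
        sxx = sum(b * b for b, _ in self.samples)
        sxy = sum(b * t for b, t in self.samples)
        o_a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
        o_b = (sy - o_a * sx) / n
        return o_a, o_b
```

With `o_a` and `o_b` in hand, the scheduler can predict the processing time of a candidate batch size and trade that off against the expected wait (`o_w`) when deciding when to dispatch.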