# ask-for-help
s
Hm, I'm not sure, do you have batching enabled?
j
I am using the default BentoML configuration, so batching is enabled by default.
Maybe I'll try disabling batching entirely and see whether it helps.
s
Batching shouldn't be enabled by default anymore; it's only enabled if you explicitly enable it on the model.
(it's enabled in the configuration, but that shouldn't apply to the model unless you saved the model with `batchable=True`)
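For reference, the docs example looks roughly like this. A minimal sketch, assuming the PyTorch framework module and a hypothetical model name, with `model` being your trained module; the `signatures` argument is the part that matters:

```python
import bentoml

# Saving a model with a batchable signature is what opts it into
# adaptive batching on the server side.
bentoml.pytorch.save_model(
    "my_model",  # hypothetical model name
    model,
    signatures={
        "__call__": {
            "batchable": True,
            "batch_dim": 0,  # inputs are batched along the first dimension
        }
    },
)
```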
j
Oh, I did save the model with `batchable=True`, as I just copied the model saving code from the documentation. If this is the case, do I need to re-save the model without that line, or will disabling it from the configuration YAML do?
s
For testing, disabling batching in the server should be fine.
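Something like this in a `bentoml_configuration.yaml` should do it (a sketch assuming BentoML 1.x runner configuration keys; check the configuration docs for your version):

```yaml
# bentoml_configuration.yaml
runners:
  batching:
    enabled: false
    # max_latency_ms: 10000  # alternatively, tune this instead of disabling
```

Then point the server at it via the `BENTOML_CONFIG` environment variable, e.g. `BENTOML_CONFIG=bentoml_configuration.yaml bentoml serve service:svc` (where `service:svc` is a placeholder for your service).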
j
The inference time has stabilized after disabling batching. Thanks for the pointer! If I want to deploy to production, should I save the model without the batchable signature?
s
Awesome! You can always disable batching on a per-deployment basis or lower the `max-latency` option, but if you want to disable it globally without having to think about it, setting `batchable=False` is probably the easiest option.
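i.e., something like this (same assumptions as the earlier snippet):

```python
import bentoml

# Re-saving with batchable=False (or simply omitting `signatures`,
# since batchable defaults to False) disables batching for this model
# regardless of server configuration.
bentoml.pytorch.save_model(
    "my_model",  # hypothetical model name
    model,
    signatures={"__call__": {"batchable": False}},
)
```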
j
Aight got it, thanks for your help!