# ask-for-help
Yes, BentoML initializes the model in memory so online inference requests can be served with minimal latency. Is it because you want to host many models and unload the ones not actively being used?
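The trade-off being described, loading the model into memory once at service startup rather than on each request, can be sketched in plain Python (bentoml itself is omitted; `DummyModel` and `Service` are hypothetical stand-ins, not BentoML APIs):

```python
import time


class DummyModel:
    """Hypothetical stand-in for a model with an expensive load step."""

    def __init__(self):
        time.sleep(0.01)  # simulate slow weight loading from disk

    def predict(self, x):
        return x * 2


class Service:
    """Model is initialized once at startup, as in the default described above."""

    def __init__(self):
        # Loading happens here, once, so it stays off the request hot path.
        self.model = DummyModel()

    def handle_request(self, x):
        # Each request reuses the in-memory model: low latency, but the
        # memory is held even while the model sits idle.
        return self.model.predict(x)


svc = Service()
print(svc.handle_request(21))
```

Hosting many models this way means each one holds its memory permanently, which is why unloading idle models becomes attractive at scale.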