# ask-for-help
z
Can I manually add or remove Bento Runners on a Bento Service so that I can manually schedule GPU resources?
x
Are you referring to dynamically loading/unloading runners while the bento api-server is running? That is, transitioning from a static inference DAG to a dynamic inference DAG? Our in-development yatai-serverless can achieve this to some extent.
z
Thank you very much for your response. I do want to dynamically load and unload Bento Runners, but I am not sure that what I need is a dynamic inference graph. In my business scenario, a Bento Service has many Bento Runners, and the service selects which Runner handles a request based on a request parameter: the model name. So what I am really asking is how to dynamically load and unload Runners on the GPU; it seems that I do not need an inference graph.
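For illustration only, the pattern described here (route each request by model name, load a model on first use, and unload the least recently used one when capacity is reached) can be sketched in plain Python. This does not use any BentoML or yatai-serverless API; `ModelPool`, `handle_request`, the loader callables, and the `unload` hook are all hypothetical names, and a real setup would put actual GPU load/free logic behind them.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict, List


class ModelPool:
    """Hypothetical pool that keeps at most `capacity` models loaded.

    When a new model is requested and the pool is full, the least
    recently used model is unloaded first.
    """

    def __init__(
        self,
        loaders: Dict[str, Callable[[], Any]],
        capacity: int,
        unload: Callable[[str, Any], None] = lambda name, model: None,
    ) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._loaders = loaders    # model name -> function that loads it
        self._capacity = capacity  # max number of models resident at once
        self._unload = unload      # assumed hook that frees a model's resources
        self._loaded: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, name: str) -> Any:
        if name not in self._loaders:
            raise KeyError(f"unknown model: {name}")
        if name in self._loaded:
            # Already resident: mark as most recently used.
            self._loaded.move_to_end(name)
            return self._loaded[name]
        # Make room by evicting the least recently used models.
        while len(self._loaded) >= self._capacity:
            old_name, old_model = self._loaded.popitem(last=False)
            self._unload(old_name, old_model)
        model = self._loaders[name]()
        self._loaded[name] = model
        return model

    def loaded_names(self) -> List[str]:
        # Ordered from least to most recently used.
        return list(self._loaded)


def handle_request(pool: ModelPool, model_name: str, payload: Any) -> Any:
    """Dispatch a request to the model selected by its name."""
    model = pool.get(model_name)
    return model(payload)
```

A pool with `capacity=2` and three registered loaders would keep only the two most recently requested models resident. This sketch is single-threaded; concurrent requests would need locking around `get`.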
x
Sorry, I made the problem more complicated by using the term 'inference graph'. Based on your description, it seems that yatai-serverless completely meets your needs. When there is no need for a runner, there will be no replicas occupying hardware resources.
z
Thank you for providing additional information. I have briefly reviewed some of the yatai documentation. Could you kindly explain the relationship between yatai-serverless and yatai? If you could provide documentation/materials on yatai-serverless, that would be even better. Our team is considering migrating to this solution after conducting research and discussions.
x
The content and images in this architecture document can explain the relationship between Yatai and yatai-serverless very well. Of course, you need to replace yatai-deployment with yatai-serverless (you can think of yatai-deployment as a non-serverless version of yatai-serverless). https://docs.bentoml.org/projects/yatai/en/latest/concepts/architecture.html
z
Thank you very much for your help. I will check out this document and try out your suggestions.
h
Hi, I'm just wondering whether yatai-serverless is ready to use. I can't find any information about it anywhere in the yatai docs.