I have multiple bento containers, each needs to use...
# ask-for-help
I have multiple bento containers, and each one needs to use a GPU, but together they need more GPU resources than I have. How can I tell the other bentos to offload their models from the GPU when a request comes in?