# ask-for-help
j
This is just a semi-educated guess, but you may need to specify in your configuration which GPUs you want a runner to run on:
```yaml
runners:
  pytorch_mnist:
    resources:
      nvidia.com/gpu: [0, 1]
```
πŸ‘ 1
I'm not sure if there's an "all" option
t
Unfortunately I tried forcing the runner GPU allocation in the configuration, but BentoML (correctly?) identifies only one GPU: `Error: [bentoml-cli] serve failed: GPU device index in [0, 1] is greater than the system available: [0]`. But I would like to specify the multi-instance GPU "partitions".
j
ohhhh gotcha, there's one physical GPU partitioned into 2 logical GPUs. That's cool, didn't know you could do that.
πŸ‘ 2
t
Yes, this is a feature that comes with the NVIDIA Ampere architecture. The official documentation describes it as a way to securely partition the GPU "into up to seven separate GPU Instances for CUDA applications, providing multiple users with separate GPU resources for optimal GPU utilization".
I have a node pool in my k8s cluster that is configured with such a partition, so I would like to exploit the full potential of this optimisation with my BentoML service 🙂
j
That's super cool. I know partial GPU usage is on the roadmap for Bento, but I don't know if this is what they were indicating.
c
cc @Jiang @larme (shenyang) any suggestions?
j
Hi. Yeah, it is a valid usage. Here's the official guide to set it up: https://aws.amazon.com/blogs/containers/utilizing-nvidia-multi-instance-gpu-mig-in-a[…]n-ec2-p4d-instances-on-amazon-elastic-kubernetes-service-eks/ My personal recommendation is to keep an eye on:
1. Step 3: make sure the NVIDIA device plugin is properly set up with MIG enabled.
2. Use something like `nvidia.com/mig-1g.5gb: 1` instead of `nvidia.com/gpu` in the resources limit.
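For reference, here is a minimal sketch of how that second point could look in a Kubernetes container spec, assuming the NVIDIA device plugin is set up with a MIG strategy that advertises `nvidia.com/mig-1g.5gb` as an extended resource; the deployment name, labels, and image below are placeholders, not anything from the thread:
```yaml
# Hypothetical Deployment for a BentoML service on a MIG-enabled node pool.
# Assumes the NVIDIA device plugin exposes MIG slices as nvidia.com/mig-1g.5gb.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pytorch-mnist-bento        # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pytorch-mnist-bento
  template:
    metadata:
      labels:
        app: pytorch-mnist-bento
    spec:
      containers:
        - name: bento
          image: my-registry/pytorch-mnist-bento:latest   # placeholder image
          resources:
            limits:
              nvidia.com/mig-1g.5gb: 1   # request one MIG slice instead of nvidia.com/gpu
```
With a limit like this, the scheduler should place the pod on a node that has free MIG slices, and the container only sees that single GPU instance.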
πŸ™ 1
a
Hi @Thomas Jacquemin, has this problem been solved yet?
j
Hi @Thomas Jacquemin, I will mark this thread as resolved for now.