# ask-for-help
Shouldn’t be that slow. Does `torch.cuda.is_available()` still return `False`?
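A quick way to check, assuming a standard PyTorch install:

```python
import torch

# Basic sanity check that PyTorch was built with CUDA support
# and can actually see a GPU.
print(torch.cuda.is_available())          # should be True
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # GPU model name
    print(torch.version.cuda)             # CUDA version the wheel targets
```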
If you run `bentoml serve`, you should see VRAM usage in the output of `nvidia-smi`; otherwise the model is not using the GPU.
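You can also verify it from Python; something like this (using a toy `nn.Linear` as a stand-in for whatever model the service actually loads) should report a `cuda` device and nonzero allocated VRAM:

```python
import torch
import torch.nn as nn

# Toy stand-in model; in the real service this would be whatever
# model `bentoml serve` loads. `nn.Linear` is just for illustration.
model = nn.Linear(1024, 1024).to("cuda")

# Parameters should report a cuda device, and allocated VRAM should be nonzero.
print(next(model.parameters()).device)            # expected: cuda:0
print(torch.cuda.memory_allocated() / 1e6, "MB")  # allocated VRAM
```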
It’s using CUDA now, I fixed that.
My GPU/VRAM usage is like 100%.
I have to set `PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:768` or lower, otherwise I get memory errors.
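For example (the variable has to be set before CUDA is initialized, so either export it in the shell before launching `bentoml serve` or set it at the very top of the script):

```python
import os

# The allocator config must be set before CUDA is initialized,
# so set it before importing torch (or export it in the shell
# before launching `bentoml serve`).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:768"

import torch  # CUDA will now pick up the allocator setting

print(torch.cuda.is_available())
```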