# ask-for-help
This largely depends on the runnable implementation; most of the built-in framework runners should not have this problem. How much memory usage are you observing?
Have you confirmed that the model is loaded on the GPU?
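For reference, a quick check along these lines can confirm device placement (a rough sketch; the `model.onnx` path and the `Linear` module are placeholders, and it assumes `onnxruntime-gpu` plus a CUDA build of PyTorch):
```python
import onnxruntime as ort
import torch

# ONNX Runtime only runs on the GPU if the CUDA execution provider is
# actually active; otherwise it silently falls back to CPU.
session = ort.InferenceSession(
    "model.onnx",  # placeholder path
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())  # "CUDAExecutionProvider" should be listed first

# PyTorch: check which device the module's parameters actually live on.
model = torch.nn.Linear(4, 4).to("cuda")  # stand-in for your real model
print(next(model.parameters()).device)    # expect "cuda:0"
```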
Hi, thanks for the fast response. I'm using the built-in ONNX runner for all of these models, and all of them were correctly loaded on the GPU.
I also see that the RAM used by each runner process is almost the same.
OK, I just found that it might be related to a custom runner that I was importing but not using. After removing the import, each runner uses around 1.2G of RAM (instead of the 2.6G seen in the screenshot). Is 1.2G a normal number?
Yes, it is normal for most large models. cc @Aaron Pham
I investigated a bit more, and I think the high RAM usage is caused by having both ONNX and PyTorch models in the same service:
• If I only keep the ONNX models, each runner process takes around 1.7G.
• If I only keep the PyTorch models, each runner process takes around 1.8G.
• If I keep both (or just `import torch` somewhere in the code), the ONNX runner processes go up to 3G. Interestingly, the PyTorch runners still use 1.8G.
Does this make sense? Is there a way to avoid this (i.e., avoid importing torch) in the ONNX runners?
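For reference, the per-process cost of merely importing torch can be measured with something like this (a rough sketch using psutil; the exact numbers depend on the torch build and the platform):
```python
import os

import psutil


def rss_mb() -> float:
    """Resident set size of the current process, in MiB."""
    return psutil.Process(os.getpid()).memory_info().rss / 1024**2


before = rss_mb()
import torch  # noqa: E402  -- importing torch maps large shared libraries
after = rss_mb()

print(f"RSS before importing torch: {before:.0f} MiB")
print(f"RSS after importing torch:  {after:.0f} MiB")
print(f"Overhead of the import:     {after - before:.0f} MiB")
```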
Can you send an example of your runnable implementation?
Hi @Aaron Pham, it is not affected by my runnable (I removed it). I presume it is caused by having PyTorch and ONNX runners loaded in the same service, as I said in my previous message.
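One common workaround for this kind of situation (not confirmed as the official BentoML answer, just the standard lazy-import pattern) is to keep `import torch` out of module scope, so processes that only serve the ONNX models never load the PyTorch libraries. A minimal sketch, with placeholder loader functions:
```python
# Sketch of a service module that avoids a top-level "import torch", so the
# ONNX runner processes never map the PyTorch shared libraries.
import onnxruntime as ort


def load_onnx_session(path: str) -> ort.InferenceSession:
    # Only onnxruntime is imported on this code path.
    return ort.InferenceSession(
        path, providers=["CUDAExecutionProvider", "CPUExecutionProvider"]
    )


def load_torch_model(path: str):
    # Deferred import: torch is only loaded in the processes that actually
    # call this function (i.e. the PyTorch runners).
    import torch

    model = torch.jit.load(path)  # placeholder for however the model is loaded
    model.eval()
    return model
```
Whether this actually helps depends on how the runner processes are spawned and on what else the service module imports at the top level.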