Matthieu Vanhoutte
10/03/2022, 1:36 PM
I can run batch_size=1 inference, but the problem is when I try to do adaptive-batching inference. I tried to save the model with cloudpickle using the following code:
import ctranslate2
import transformers

model_path = "/home/matthieu/Deployment/CTranslate2/opus-mt-fr-en/default"
model_id = "Helsinki-NLP/opus-mt-fr-en"  # Hugging Face id of the source model

translator = ctranslate2.Translator(model_path, device="cuda")
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

saved_model = bentoml.picklable_model.save_model(
    "CT2_default_opus-mt-fr-en",  # model name in the local model store
    translator,                   # model instance being saved
    signatures={                  # model signatures for runner inference
        "__call__": {
            "batchable": True,
        }
    },
)
print(f"Model saved: {saved_model}")
but got the following error:
converting 'CT2_default_opus_mt_fr_en' to lowercase: 'ct2_default_opus_mt_fr_en'
Traceback (most recent call last):
File "/home/matthieu/Code/Python/CTranslate2/MarianMT.py", line 125, in
saved_model = bentoml.picklable_model.save_model(
File "/home/matthieu/anaconda3/envs/ctranslate2_py3.8/lib/python3.8/site-packages/bentoml/_internal/frameworks/picklable.py", line 145, in save_model
cloudpickle.dump(model, f)
File "/home/matthieu/anaconda3/envs/ctranslate2_py3.8/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 55, in dump
CloudPickler(
File "/home/matthieu/anaconda3/envs/ctranslate2_py3.8/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 632, in dump
return Pickler.dump(self, obj)
TypeError: cannot pickle 'ctranslate2.translator.Translator' object
Do you have any idea how I could enable adaptive batching with the CTranslate2 framework?
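The `TypeError` occurs because `ctranslate2.Translator` holds a native (C++) handle that cloudpickle cannot serialize. One generic workaround pattern, shown as a sketch that is not BentoML- or CTranslate2-specific (the `PicklableTranslator` class and its `_load` helper below are hypothetical stand-ins), is to pickle only the constructor arguments and rebuild the native object after unpickling:

```python
import pickle

class PicklableTranslator:
    """Wrap an unpicklable native object: pickle only the constructor
    arguments and rebuild the handle on load."""

    def __init__(self, model_path, device="cpu"):
        self.model_path = model_path
        self.device = device
        self._handle = self._load()

    def _load(self):
        # Hypothetical stand-in for the real native object, e.g.
        # ctranslate2.Translator(self.model_path, device=self.device).
        return ("native-translator", self.model_path, self.device)

    def __getstate__(self):
        # Drop the native handle so only plain attributes get pickled.
        state = self.__dict__.copy()
        state["_handle"] = None
        return state

    def __setstate__(self, state):
        # Restore plain attributes, then rebuild the native handle.
        self.__dict__.update(state)
        self._handle = self._load()

wrapper = PicklableTranslator("/path/to/model", device="cuda")
restored = pickle.loads(pickle.dumps(wrapper))
```

A wrapper like this could then be passed to `bentoml.picklable_model.save_model` in place of the raw translator; another option is to construct the translator lazily inside a custom BentoML runner rather than pickling it at all.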