# ask-for-help
Hi community, I would like to use the CTranslate2 framework to enable fast transformer inference with a converted MarianMT model for translation (https://opennmt.net/CTranslate2/guides/transformers.html#marianmt). I managed to use a custom runner to do `batch_size=1` inference, but the problem appears when I try to do adaptive-batching inference. I tried to save the model with `cloudpickle` using the following code:
```python
import bentoml
import ctranslate2
import transformers

model_path = "/home/matthieu/Deployment/CTranslate2/opus-mt-fr-en/default"
model_id = "Helsinki-NLP/opus-mt-fr-en"  # original Hugging Face model, used for the tokenizer

translator = ctranslate2.Translator(model_path, device="cuda")
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

saved_model = bentoml.picklable_model.save_model(
    "CT2_default_opus-mt-fr-en",  # model name in the local model store
    translator,  # model instance being saved
    signatures={  # model signatures for runner inference
        "__call__": {
            "batchable": True,
        }
    },
)

print(f"Model saved: {saved_model}")
```
but I got the following error:
```
converting 'CT2_default_opus_mt_fr_en' to lowercase: 'ct2_default_opus_mt_fr_en'
Traceback (most recent call last):
  File "/home/matthieu/Code/Python/CTranslate2/MarianMT.py", line 125, in <module>
    saved_model = bentoml.picklable_model.save_model(
  File "/home/matthieu/anaconda3/envs/ctranslate2_py3.8/lib/python3.8/site-packages/bentoml/_internal/frameworks/picklable.py", line 145, in save_model
    cloudpickle.dump(model, f)
  File "/home/matthieu/anaconda3/envs/ctranslate2_py3.8/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 55, in dump
    CloudPickler(
  File "/home/matthieu/anaconda3/envs/ctranslate2_py3.8/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 632, in dump
    return Pickler.dump(self, obj)
TypeError: cannot pickle 'ctranslate2.translator.Translator' object
```
From the traceback, it looks like `ctranslate2.Translator` wraps native (GPU) state that cloudpickle cannot serialize. Do you have any idea how I could enable adaptive batching with the CTranslate2 framework?
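In case it helps frame the question, this is the workaround direction I am considering: since the `Translator` cannot be pickled, skip the model store entirely and construct it inside a custom Runnable's `__init__`, then mark the inference method as batchable so BentoML's adaptive batching still applies. A minimal, untested sketch, assuming the BentoML 1.x custom Runnable API (the class name, method name, and batching limits are illustrative; the tokenization follows the CTranslate2 MarianMT guide):
```python
import bentoml
import ctranslate2
import transformers
from typing import List

model_path = "/home/matthieu/Deployment/CTranslate2/opus-mt-fr-en/default"
model_id = "Helsinki-NLP/opus-mt-fr-en"

class CT2Translator(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("nvidia.com/gpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        # Load the converted model here instead of pickling it:
        # the native Translator object never has to be serialized.
        self.translator = ctranslate2.Translator(model_path, device="cuda")
        self.tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

    @bentoml.Runnable.method(batchable=True, batch_dim=0)
    def translate(self, texts: List[str]) -> List[str]:
        # Tokenize the whole adaptive batch and translate it in one call.
        sources = [
            self.tokenizer.convert_ids_to_tokens(self.tokenizer.encode(t))
            for t in texts
        ]
        results = self.translator.translate_batch(sources)
        return [
            self.tokenizer.decode(
                self.tokenizer.convert_tokens_to_ids(r.hypotheses[0])
            )
            for r in results
        ]

runner = bentoml.Runner(CT2Translator, name="ct2_opus_mt_fr_en", max_batch_size=32)
svc = bentoml.Service("ct2_translation", runners=[runner])
```
If I understand the docs correctly, `bentoml serve` should then coalesce concurrent requests into a single `translate_batch` call along `batch_dim=0`. Does this look like the right approach?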