# ask-for-help
Hi community, I would like to use the CTranslate2 framework to enable fast transformer inference with a converted MarianMT model for translation (https://opennmt.net/CTranslate2/guides/transformers.html#marianmt). I managed to use a custom runner to do `batch_size=1` inference, but the problem appears when I try to do adaptive-batching inference. I tried to save the model with `cloudpickle` using the following code:
```python
import bentoml
import ctranslate2
import transformers

model_path = "/home/matthieu/Deployment/CTranslate2/opus-mt-fr-en/default"
model_id = "Helsinki-NLP/opus-mt-fr-en"  # original Hugging Face model, used for the tokenizer

translator = ctranslate2.Translator(model_path, device="cuda")
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

saved_model = bentoml.picklable_model.save_model(
    "CT2_default_opus-mt-fr-en",  # model name in the local model store
    translator,  # model instance being saved
    signatures={  # model signatures for runner inference
        "__call__": {
            "batchable": True,
        }
    },
)

print(f"Model saved: {saved_model}")
```
but I got the following error:
```
converting 'CT2_default_opus_mt_fr_en' to lowercase: 'ct2_default_opus_mt_fr_en'
Traceback (most recent call last):
  File "/home/matthieu/Code/Python/CTranslate2/MarianMT.py", line 125, in <module>
    saved_model = bentoml.picklable_model.save_model(
  File "/home/matthieu/anaconda3/envs/ctranslate2_py3.8/lib/python3.8/site-packages/bentoml/_internal/frameworks/picklable.py", line 145, in save_model
    cloudpickle.dump(model, f)
  File "/home/matthieu/anaconda3/envs/ctranslate2_py3.8/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 55, in dump
    CloudPickler(
  File "/home/matthieu/anaconda3/envs/ctranslate2_py3.8/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 632, in dump
    return Pickler.dump(self, obj)
TypeError: cannot pickle 'ctranslate2.translator.Translator' object
```
From the traceback, it looks like `ctranslate2.Translator` wraps native (GPU) state that cloudpickle cannot serialize. Do you have any idea how I could enable adaptive batching with the CTranslate2 framework?
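In case it helps frame the question, this is the workaround direction I am considering: since the `Translator` cannot be pickled, skip the model store entirely and construct it inside a custom Runnable's `__init__`, then mark the inference method as batchable so BentoML's adaptive batching still applies. A minimal, untested sketch, assuming the BentoML 1.x custom Runnable API (the class name, method name, and batching limits are illustrative; the tokenization follows the CTranslate2 MarianMT guide):
```python
import bentoml
import ctranslate2
import transformers
from typing import List

model_path = "/home/matthieu/Deployment/CTranslate2/opus-mt-fr-en/default"
model_id = "Helsinki-NLP/opus-mt-fr-en"

class CT2Translator(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("nvidia.com/gpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        # Load the converted model here instead of pickling it:
        # the native Translator object never has to be serialized.
        self.translator = ctranslate2.Translator(model_path, device="cuda")
        self.tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

    @bentoml.Runnable.method(batchable=True, batch_dim=0)
    def translate(self, texts: List[str]) -> List[str]:
        # Tokenize the whole adaptive batch and translate it in one call.
        sources = [
            self.tokenizer.convert_ids_to_tokens(self.tokenizer.encode(t))
            for t in texts
        ]
        results = self.translator.translate_batch(sources)
        return [
            self.tokenizer.decode(
                self.tokenizer.convert_tokens_to_ids(r.hypotheses[0])
            )
            for r in results
        ]

runner = bentoml.Runner(CT2Translator, name="ct2_opus_mt_fr_en", max_batch_size=32)
svc = bentoml.Service("ct2_translation", runners=[runner])
```
If I understand the docs correctly, `bentoml serve` should then coalesce concurrent requests into a single `translate_batch` call along `batch_dim=0`. Does this look like the right approach?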