Slackbot (03/16/2023, 3:08 PM):

Malte (03/16/2023, 3:57 PM):
custom runner with 🤗 optimum onnx runtime. Or are there better alternatives?
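For context, a minimal sketch of the custom-runner route Malte is asking about, assuming BentoML 1.x's Runnable API and optimum's ONNX Runtime classes; the class name, method name, and runner name below are illustrative, not from the thread:

import bentoml
from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForCausalLM

class ORTTextGenerationRunnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        # export=True converts the checkpoint to ONNX on load (recent optimum versions)
        model = ORTModelForCausalLM.from_pretrained("gpt2", export=True)
        tokenizer = AutoTokenizer.from_pretrained("gpt2")
        # transformers.pipeline accepts an instantiated ORT model in place of a PyTorch one
        self.pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

    @bentoml.Runnable.method(batchable=False)
    def generate(self, prompt: str) -> str:
        return self.pipe(prompt)[0]["generated_text"]

runner = bentoml.Runner(ORTTextGenerationRunnable, name="ort_gpt2")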
Sean (03/16/2023, 11:58 PM):
bentoml.optimum in the future.

Aaron Pham (03/17/2023, 3:07 AM):

Malte (03/17/2023, 9:32 AM):
bentoml.transformers. So far so good. Everything gets saved properly in the folder. It only fails at runtime when trying to load the model. This is expected, since 🤗 optimum has its own classes for handling the models, e.g. ORTModelForCausalLM instead of AutoModelForCausalLM. That's why the error trace below occurs.
It should be enough to add from optimum.pipelines import ORT_SUPPORTED_TASKS to the already present SUPPORTED_TASKS from the regular pipeline, and to load optimum in bento's transformers.py load_model?! I would be happy to supply a PR for that 🥳.
2023-03-17T10:12:10+0100 [ERROR] [dev_api_server:completion] Traceback (most recent call last):
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 290, in init_local
    self._init(LocalRunnerRef)
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 137, in _init
    object_setattr(self, "_runner_handle", handle_class(self))
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/bentoml/_internal/runner/runner_handle/local.py", line 24, in __init__
    self._runnable = runner.runnable_class(**runner.runnable_init_params)  # type: ignore
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/bentoml/_internal/frameworks/transformers.py", line 474, in __init__
    self.pipeline = load_model(bento_model, **kwargs)
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/bentoml/_internal/frameworks/transformers.py", line 235, in load_model
    return transformers.pipeline(task=task, model=bento_model.path, **extra_kwargs)
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/transformers/pipelines/__init__.py", line 776, in pipeline
    framework, model = infer_framework_load_model(
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/transformers/pipelines/base.py", line 271, in infer_framework_load_model
    raise ValueError(f"Could not load model {model} with any of the following classes: {class_tuple}.")
ValueError: Could not load model /Users/malte/bentoml/bentos/completion/iml4dpweuo6u6mto/models/text-generation-pipeline-gpt2/yxjg2owea6qeimto with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>, <class 'transformers.models.gpt2.modeling_gpt2.GPT2LMHeadModel'>).
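For reference, a minimal sketch of the loading path Malte is proposing, assuming optimum's documented ORT classes: load the ONNX export with ORTModelForCausalLM and hand it to transformers.pipeline, instead of letting AutoModelForCausalLM try (and fail) to read it. The model path here is a placeholder:

from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForCausalLM

model_path = "/path/to/onnx/model"  # placeholder for the saved bento model folder

# ORTModelForCausalLM knows how to read an ONNX export; AutoModelForCausalLM does not
model = ORTModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("Hello, world")[0]["generated_text"])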
Malte (03/17/2023, 9:43 AM):

Aaron Pham (03/20/2023, 3:02 AM):
save_pretrained and from_pretrained) so loading custom models should now work.
Haven't given this much thought, but I think for optimum what we can do is have a bentoml.optimum that basically uses the same logic as transformers, similar to bentoml.diffusers.
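A hypothetical sketch of what that could look like if bentoml.optimum mirrored bentoml.transformers; no such module exists at the time of this thread, so the save_model/load_model calls below are speculative:

import bentoml  # bentoml.optimum below is speculative, not a real BentoML module
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForCausalLM
from optimum.pipelines import pipeline

model = ORTModelForCausalLM.from_pretrained("gpt2", export=True)
tokenizer = AutoTokenizer.from_pretrained("gpt2")
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, accelerator="ort")

# mirroring the bentoml.transformers save/load flow (hypothetical API)
bento_model = bentoml.optimum.save_model("onnx-gpt2", pipe)
pipe = bentoml.optimum.load_model("onnx-gpt2:latest")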
Malte (03/21/2023, 11:24 AM):
optimum does follow the 🤗 API closely.
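To illustrate how closely optimum mirrors the 🤗 transformers API, switching to ONNX Runtime is essentially a one-class swap (a sketch using the stock gpt2 checkpoint):

from transformers import AutoModelForCausalLM
from optimum.onnxruntime import ORTModelForCausalLM

pt_model = AutoModelForCausalLM.from_pretrained("gpt2")               # eager PyTorch
ort_model = ORTModelForCausalLM.from_pretrained("gpt2", export=True)  # ONNX Runtime export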