# ask-for-help
m
My favoured approach is to create a custom runner with 🤗 Optimum ONNX Runtime. Or are there better alternatives?
s
Using a custom runner is currently the best approach to using 🤗 Optimum. We may add deeper integration through `bentoml.optimum` in the future.
🙏 1
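(For context, a minimal sketch of what such a custom runner could look like, assuming the ONNX model has already been exported to a local directory; the directory name, class name, and service wiring are illustrative, not an official BentoML/Optimum integration.)

```python
import bentoml
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer, pipeline

MODEL_DIR = "gpt2-onnx"  # placeholder: directory containing the exported ONNX model


class ORTTextGenerationRunnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        # Load the ONNX export with Optimum's ORT class rather than AutoModelForCausalLM.
        model = ORTModelForCausalLM.from_pretrained(MODEL_DIR)
        tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
        self.pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

    @bentoml.Runnable.method(batchable=False)
    def generate(self, prompt: str) -> str:
        return self.pipe(prompt)[0]["generated_text"]


runner = bentoml.Runner(ORTTextGenerationRunnable, name="ort_gpt2")
svc = bentoml.Service("completion", runners=[runner])
```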
a
Hi there, are there any errors that you are currently running into?
m
Thanks for the input! A custom runner does work flawlessly. On the error: even without it, I was able to import the ONNX model into the Bento model store as a `bentoml.transformers` model. So far so good, everything gets saved properly in the folder. It only fails at runtime when trying to load the model. This is expected, since 🤗 Optimum has its own classes for handling the models, e.g. `ORTModelForCausalLM` instead of `AutoModelForCausalLM`. That's why the error trace below occurs. It should be enough to add `ORT_SUPPORTED_TASKS` (via `from optimum.pipelines import ORT_SUPPORTED_TASKS`) to the already present `SUPPORTED_TASKS` from the regular pipeline, and to load Optimum in bento's `transformers.py` `load_model`?! I would be happy to supply a PR for that 🥳.
2023-03-17T10:12:10+0100 [ERROR] [dev_api_server:completion] Traceback (most recent call last):
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 290, in init_local
    self._init(LocalRunnerRef)
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 137, in _init
    object_setattr(self, "_runner_handle", handle_class(self))
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/bentoml/_internal/runner/runner_handle/local.py", line 24, in __init__
    self._runnable = runner.runnable_class(**runner.runnable_init_params)  # type: ignore
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/bentoml/_internal/frameworks/transformers.py", line 474, in __init__
    self.pipeline = load_model(bento_model, **kwargs)
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/bentoml/_internal/frameworks/transformers.py", line 235, in load_model
    return transformers.pipeline(task=task, model=bento_model.path, **extra_kwargs)
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/transformers/pipelines/__init__.py", line 776, in pipeline
    framework, model = infer_framework_load_model(
  File "/Users/malte/miniconda3/envs/bento/lib/python3.10/site-packages/transformers/pipelines/base.py", line 271, in infer_framework_load_model
    raise ValueError(f"Could not load model {model} with any of the following classes: {class_tuple}.")
ValueError: Could not load model /Users/malte/bentoml/bentos/completion/iml4dpweuo6u6mto/models/text-generation-pipeline-gpt2/yxjg2owea6qeimto with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>, <class 'transformers.models.gpt2.modeling_gpt2.GPT2LMHeadModel'>).
I think that the ability to use the 🤗 ONNX implementation via Optimum would be a very valuable addition, as it significantly reduces the complexity of using production-ready models. And this IMHO fits the 🍱 vision perfectly.
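(To make the failure mode above concrete: a rough sketch of the two load paths, with a placeholder path; the commented-out call mirrors what a plain `transformers.pipeline(task, model=path)` load does, while the working path uses Optimum's ORT class explicitly.)

```python
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer, pipeline

MODEL_PATH = "path/to/onnx-model"  # placeholder for the model directory in the store

# Loading via the plain pipeline only tries the vanilla classes
# (AutoModelForCausalLM, GPT2LMHeadModel) and cannot read the ONNX export,
# which raises the ValueError shown in the traceback above:
# pipe = pipeline(task="text-generation", model=MODEL_PATH)

# The Optimum route: instantiate the ORT model first, then build the pipeline.
model = ORTModelForCausalLM.from_pretrained(MODEL_PATH)
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
```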
a
Hi @Malte, https://github.com/bentoml/BentoML/pull/3684 will add support for saving any arbitrary transformers model that follows their spec (i.e. implementing `save_pretrained` and `from_pretrained`), so loading custom models should now work. I haven't given this much thought yet, but I think for Optimum what we could do is have a `bentoml.optimum` that basically reuses the same logic as `transformers`, similar to `bentoml.diffusers`.
❤️ 1
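(A short sketch of why the spec mentioned above matters here: Optimum's ORT models implement the same `save_pretrained`/`from_pretrained` contract as regular transformers models. Note that depending on the Optimum version, the export flag is `export=True` or the older `from_transformers=True`.)

```python
from optimum.onnxruntime import ORTModelForCausalLM

# Export GPT-2 to ONNX (flag name varies by Optimum version, see note above).
ort_model = ORTModelForCausalLM.from_pretrained("gpt2", export=True)

# Same persistence contract as any transformers model.
ort_model.save_pretrained("gpt2-onnx")
reloaded = ORTModelForCausalLM.from_pretrained("gpt2-onnx")
```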
m
Thanks a lot! I will have a look 🙂. And yes, this should work, as it is in principle just another import, since `optimum` does follow the 🤗 API closely.