# ask-for-help
c
cc @larme (shenyang)
a
can you pass the tokenizer data as partial kwargs?
l
Hi Suhas, could you upgrade BentoML to the latest version with pip install -U bentoml? The latest BentoML shouldn't have this limitation.
s
I have version 1.0.13 and I still get this issue.
l
Hi Suhas, that part of the code should only be invoked when the model's input is described as a tensor type. Maybe there's a hidden bug. We would appreciate it if you could provide a minimal example that reproduces this behaviour. Thanks!
s
For example: you can convert a bert-base model to ONNX, create an ONNX runner with the CUDA provider, and load the tokenizer as below
Copy code
from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
tokenizer_data = tokenizer.tokenize("I have a new GPU!")
Then initialise the model locally and, for inference, use the ONNX runner: onnx_runner.run.run(tokenizer_data)
l
Could you try this?
Copy code
tokens = list(tokenizer("this is a sample", return_tensors="np").values())
runner.run.run(*tokens)
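For context: tokenizer.tokenize() only returns a list of token strings, while the ONNX model expects the numeric arrays (input_ids, token_type_ids, attention_mask) that calling the tokenizer with return_tensors="np" produces. Here is a rough end-to-end sketch, assuming the model was already exported to ONNX (e.g. with python -m transformers.onnx --model=bert-base-uncased onnx/); the model name "bert-base-uncased-onnx" is just a placeholder:
Copy code
import bentoml
import onnx
from transformers import BertTokenizer

# Save the exported ONNX model into the BentoML model store
onnx_model = onnx.load("onnx/model.onnx")
bentoml.onnx.save_model("bert-base-uncased-onnx", onnx_model)

# Build a runner; init_local() is only for local debugging.
# Inside a Service, BentoML manages the runner lifecycle for you.
runner = bentoml.onnx.get("bert-base-uncased-onnx:latest").to_runner()
runner.init_local()

# return_tensors="np" yields numpy arrays, which the runner
# passes to the ONNX session as positional inputs
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
tokens = list(tokenizer("this is a sample", return_tensors="np").values())
outputs = runner.run.run(*tokens)
I believe that with onnxruntime-gpu installed, the CUDA execution provider is picked up automatically when the runner is scheduled on a GPU.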
s
Thanks, it works!
l
Thanks for the feedback. We will add a BERT/transformers ONNX example to our docs.
s
Perfect, providing an example would help new developers. Thanks for looking into it.