# ask-for-help
j
Would you mind sharing the service.py?
f
Sure 🙂 Here's the service.py:
```python
import bentoml

from bentoml.io import Text, Image
from PIL.Image import Image as PILImage

processor_runner = bentoml.transformers.get(
    "blip2_opt_2-7-b_processor:dcplr3pe4cqglgv3"
).to_runner()
model_runner = bentoml.transformers.get(
    "blip2_opt_2-7-b_model:dge4qo7e4cvr7gv3"
).to_runner()

svc = bentoml.Service("image_generation", runners=[processor_runner, model_runner])


@svc.api(input=Image(), output=Text())
def generate_speech(img: PILImage):
    inputs = processor_runner.run(images=[img], return_tensors="pt")
    generated_caption_ids = model_runner.generate.run(**inputs)
    generated_text = processor_runner.batch_decode.run(
        generated_caption_ids, skip_special_tokens=True
    )
    return generated_text[0]
```
And here's the bentofile.yaml:
```yaml
service: "blip2_image_captioning_service:svc"
labels:
  owner: dwts
  stage: dev
include:
  - "*.py"
python:
  extra_index_url:
    - "<https://download.pytorch.org/whl/cu116>"
  packages:
    - "torch==1.13.1+cu116"
    - "torchvision==0.14.1+cu116"
    - "transformers==4.29"
    - "Pillow==9.2.0"
docker:
  distro: debian
  python_version: "3.10"
  cuda_version: "11.6.2"
```
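And I build/serve it the standard way, nothing custom (assuming the bentofile.yaml sits next to service.py; the bento tag follows the service name):
```
bentoml build
bentoml serve image_generation:latest
```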
@Jiang Can you spot any obvious mistakes I made 🙂? I also had a glance at the BentoML source code, and I think all pre-trained transformers models are treated as transformers pipelines, which do accept the `device` kwarg. However, plain pretrained models have to be transferred to a GPU via `.to('cuda')`. Is this maybe the issue, or did I get it wrong?
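To make the distinction concrete, here's a minimal sketch outside BentoML (model name taken from my save script below):
```python
from transformers import Blip2ForConditionalGeneration, pipeline

# A transformers pipeline accepts a `device` kwarg at construction time:
pipe = pipeline("image-to-text", model="Salesforce/blip2-opt-2.7b", device=0)

# A plain pretrained model has no such kwarg; the model (and its input
# tensors) must be moved to the GPU explicitly:
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")
model = model.to("cuda")  # this is the step I suspect is missing
```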
j
Yes. Would you mind sharing the save_model script as well? Is that model object a pipeline or a pretrained model?
f
Sure, here's the script:
```python
from transformers import Blip2Processor, Blip2ForConditionalGeneration
import bentoml

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")

bentoml.transformers.save_model("blip2_opt_2-7-b_processor", processor)
bentoml.transformers.save_model("blip2_opt_2-7-b_model", model,
                                signatures={
                                    "generate": {
                                        "batchable": True,
                                        "batch_dim": 0,
                                    }
                                })
```
It's only a pretrained model, not a pipeline 🙂
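For what it's worth, a quick in-process smoke test of the saved models would look roughly like this (untested sketch mirroring my service.py; `init_local()` is debug-only, and `example.jpg` is a placeholder):
```python
import bentoml
from PIL import Image

processor_runner = bentoml.transformers.get("blip2_opt_2-7-b_processor:latest").to_runner()
model_runner = bentoml.transformers.get("blip2_opt_2-7-b_model:latest").to_runner()

# Debug-only: run the runners in-process instead of in separate workers.
processor_runner.init_local()
model_runner.init_local()

img = Image.open("example.jpg")
inputs = processor_runner.run(images=[img], return_tensors="pt")
ids = model_runner.generate.run(**inputs)
print(processor_runner.batch_decode.run(ids, skip_special_tokens=True)[0])
```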
j
Got it. I'm investigating it
πŸ™ 1
f
Thank you very much! I appreciate it.
Hi @Jiang, did you find anything that looks suspicious? If it's indeed the bug I mentioned, I can create a GitHub issue and try to fix it.
j
@Flo I believe we will be able to ship an update tomorrow. It should be a bug that was introduced recently.
Hi @Flo, this PR fixes that issue: https://github.com/bentoml/BentoML/pull/3882
You can wait for our upcoming release, or just install from the main branch.
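If you want to try the fix right away, installing straight from GitHub should work:
```
pip install -U git+https://github.com/bentoml/BentoML.git@main
```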