# ask-for-help
j
Would you mind sharing the service.py?
f
Sure 🙂 Here's the service.py:
```python
import bentoml

from bentoml.io import Text, Image
from PIL.Image import Image as PILImage

processor_runner = bentoml.transformers.get(
    "blip2_opt_2-7-b_processor:dcplr3pe4cqglgv3"
).to_runner()
model_runner = bentoml.transformers.get(
    "blip2_opt_2-7-b_model:dge4qo7e4cvr7gv3"
).to_runner()

svc = bentoml.Service("image_generation", runners=[processor_runner, model_runner])


@svc.api(input=Image(), output=Text())
def generate_speech(img: PILImage):
    inputs = processor_runner.run(images=[img], return_tensors="pt")
    generated_caption_ids = model_runner.generate.run(**inputs)
    generated_text = processor_runner.batch_decode.run(
        generated_caption_ids, skip_special_tokens=True
    )
    return generated_text[0]
```
And here's the bentofile.yaml:
```yaml
service: "blip2_image_captioning_service:svc"
labels:
  owner: dwts
  stage: dev
include:
  - "*.py"
python:
  extra_index_url:
    - "<https://download.pytorch.org/whl/cu116>"
  packages:
    - "torch==1.13.1+cu116"
    - "torchvision==0.14.1+cu116"
    - "transformers==4.29"
    - "Pillow==9.2.0"
docker:
  distro: debian
  python_version: "3.10"
  cuda_version: "11.6.2"
```
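And I build/serve it the standard way, nothing custom (assuming the bentofile.yaml sits next to service.py; the bento tag follows the service name):
```
bentoml build
bentoml serve image_generation:latest
```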
@Jiang Can you spot any obvious mistakes I made 🙂? I also had a glance at the BentoML source code, and I think all pre-trained transformers models are treated as transformers pipelines, which do accept the `device` kwarg. However, plain pretrained models have to be transferred to a GPU via `.to('cuda')`. Is this maybe the issue, or did I get it wrong?
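To make the distinction concrete, here's a minimal sketch outside BentoML (model name taken from my save script below):
```python
from transformers import Blip2ForConditionalGeneration, pipeline

# A transformers pipeline accepts a `device` kwarg at construction time:
pipe = pipeline("image-to-text", model="Salesforce/blip2-opt-2.7b", device=0)

# A plain pretrained model has no such kwarg; the model (and its input
# tensors) must be moved to the GPU explicitly:
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")
model = model.to("cuda")  # this is the step I suspect is missing
```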
j
Yes. Would you mind sharing the save_model script as well? Is that model object a pipeline or a pretrained model?
f
Sure, here's the script:
```python
from transformers import Blip2Processor, Blip2ForConditionalGeneration
import bentoml

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")

bentoml.transformers.save_model("blip2_opt_2-7-b_processor", processor)
bentoml.transformers.save_model("blip2_opt_2-7-b_model", model,
                                signatures={
                                    "generate": {
                                        "batchable": True,
                                        "batch_dim": 0,
                                    }
                                })
```
It's only a pretrained model, not a pipeline 🙂
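For what it's worth, a quick in-process smoke test of the saved models would look roughly like this (untested sketch mirroring my service.py; `init_local()` is debug-only, and `example.jpg` is a placeholder):
```python
import bentoml
from PIL import Image

processor_runner = bentoml.transformers.get("blip2_opt_2-7-b_processor:latest").to_runner()
model_runner = bentoml.transformers.get("blip2_opt_2-7-b_model:latest").to_runner()

# Debug-only: run the runners in-process instead of in separate workers.
processor_runner.init_local()
model_runner.init_local()

img = Image.open("example.jpg")
inputs = processor_runner.run(images=[img], return_tensors="pt")
ids = model_runner.generate.run(**inputs)
print(processor_runner.batch_decode.run(ids, skip_special_tokens=True)[0])
```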
j
Got it. I'm investigating it
πŸ™ 1
f
Thank you very much! I appreciate it.
Hi @Jiang, did you find anything that looks suspicious? If it's indeed the bug I mentioned, I can create a GitHub issue and try to fix it.
j
@Flo I believe we will be able to ship an update tomorrow. It should be a bug that was introduced recently.
Hi @Flo, this PR fixes that issue: https://github.com/bentoml/BentoML/pull/3882
You can wait for our upcoming release, or just install from the main branch.
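If you want to try the fix right away, installing straight from GitHub should work:
```
pip install -U git+https://github.com/bentoml/BentoML.git@main
```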