Slackbot
05/15/2023, 2:49 PM
Jiang
05/16/2023, 6:06 AM
Flo
05/16/2023, 8:19 AM
import bentoml
from bentoml.io import Text, Image
from PIL.Image import Image as PILImage
processor_runner = bentoml.transformers.get(
    "blip2_opt_2-7-b_processor:dcplr3pe4cqglgv3"
).to_runner()
model_runner = bentoml.transformers.get(
    "blip2_opt_2-7-b_model:dge4qo7e4cvr7gv3"
).to_runner()

svc = bentoml.Service("image_generation", runners=[processor_runner, model_runner])


@svc.api(input=Image(), output=Text())
def generate_speech(img: PILImage):
    # Preprocess the image, generate caption token ids, then decode them to text.
    inputs = processor_runner.run(images=[img], return_tensors="pt")
    generated_caption_ids = model_runner.generate.run(**inputs)
    generated_text = processor_runner.batch_decode.run(
        generated_caption_ids, skip_special_tokens=True
    )
    return generated_text[0]
bentofile.yaml
service: "blip2_image_captioning_service:svc"
labels:
owner: dwts
stage: dev
include:
- "*.py"
python:
extra_index_url:
- "<https://download.pytorch.org/whl/cu116>"
packages:
- "torch==1.13.1+cu116"
- "torchvision==0.14.1+cu116"
- transformers==4.29
- Pillow==9.2.0
docker:
distro: debian
python_version: "3.10"
cuda_version: "11.6.2"
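As a quick way to exercise the service above, here is a minimal client sketch; it assumes the service is running locally on BentoML's default port 3000, that the route defaults to the API function name (generate_speech), and uses a placeholder image path:

import requests

# Send one image to the captioning endpoint; the path and port are assumptions.
with open("test.jpg", "rb") as f:
    resp = requests.post(
        "http://localhost:3000/generate_speech",
        headers={"Content-Type": "image/jpeg"},
        data=f.read(),
    )
print(resp.text)  # the generated caption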
Flo
05/17/2023, 8:38 AM
device kwarg. However, transformer models have to be transferred to a GPU via .to('cuda'). Is this maybe the issue, or did I get it wrong?
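If the missing .to('cuda') call is indeed the problem, one possible workaround is a custom runnable that loads the saved artifacts and moves the model onto the GPU itself. This is only a sketch: it assumes BentoML's custom Runnable API, that bentoml.transformers.load_model hands back the saved processor and model objects, and the class/runner names here are made up:

import bentoml
import torch

class Blip2CaptionRunnable(bentoml.Runnable):
    # Allow scheduling on a GPU if one is available, otherwise fall back to CPU.
    SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        # Load the artifacts saved earlier and move the model onto the device explicitly.
        self.processor = bentoml.transformers.load_model("blip2_opt_2-7-b_processor:latest")
        self.model = bentoml.transformers.load_model("blip2_opt_2-7-b_model:latest")
        self.model.to(self.device)

    @bentoml.Runnable.method(batchable=False)
    def caption(self, img):
        # Keep the input tensors on the same device as the model.
        inputs = self.processor(images=[img], return_tensors="pt").to(self.device)
        generated_ids = self.model.generate(**inputs)
        return self.processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

caption_runner = bentoml.Runner(Blip2CaptionRunnable, name="blip2_caption_runner")

The service would then call caption_runner.caption.run(img) instead of chaining the processor and model runners.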
Jiang
05/17/2023, 9:15 AM
Flo
05/17/2023, 9:39 AM
from transformers import Blip2Processor, Blip2ForConditionalGeneration
import bentoml
processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")
bentoml.transformers.save_model("blip2_opt_2-7-b_processor", processor)
bentoml.transformers.save_model("blip2_opt_2-7-b_model", model,
signatures={
"generate": {
"batchable": True,
"batch_dim": 0,
}
})
It's only a pretrained model, not a pipeline 🙂
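For contrast with the "not a pipeline" point, the processor and model could instead be wrapped in a single transformers pipeline and saved as one BentoML model. This is only a sketch and assumes the image-to-text pipeline covers BLIP-2 in the pinned transformers version:

import bentoml
from transformers import pipeline

# Bundle model + processor into one image-captioning pipeline and save it as a
# single BentoML model (assumes "image-to-text" supports BLIP-2 in this version).
captioner = pipeline("image-to-text", model="Salesforce/blip2-opt-2.7b")
bentoml.transformers.save_model("blip2_image_captioner", captioner)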
Jiang
05/17/2023, 9:49 AM
Flo
05/17/2023, 9:53 AM
Flo
05/22/2023, 8:08 AM
Jiang
05/23/2023, 3:22 AM
Jiang
05/24/2023, 5:32 AM
Jiang
05/24/2023, 5:32 AM