Slackbot
02/07/2023, 10:21 PM

Aaron Pham
02/08/2023, 3:20 PM

Suhas
02/08/2023, 5:02 PM
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
pipeline(task="text2text-generation", model=model, tokenizer=tokenizer, device=device)
Further, I call the runner for inference:
runner = bentoml.transformers.get("model:latest").to_runner()
service = bentoml.Service("model", runners=[runner])

@service.api(input=bentoml.io.Text(), output=bentoml.io.Text())
def predict(text: str) -> str:
    return runner.run(text)  # e.g. runner.run("its an example")
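For `bentoml.transformers.get("model:latest")` to resolve, the pipeline has to be saved into the local BentoML model store first. A minimal sketch of that step, assuming BentoML 1.x and an illustrative model choice ("t5-small"); the tag "model" matches the snippet above:

```python
import bentoml
from transformers import pipeline

# Build the pipeline, then save it to the BentoML model store under
# the tag "model", so the service can load it with get("model:latest").
# "t5-small" is an illustrative assumption, not the model from the chat.
pipe = pipeline(task="text2text-generation", model="t5-small")
bentoml.transformers.save_model("model", pipe)
```

Once saved, the service file can be started with `bentoml serve service:service`.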
Aaron Pham
02/09/2023, 5:40 AM

Suhas
02/09/2023, 8:19 AM