# ask-for-help
Chaoyu
02/23/2023, 9:13 PM
Hi Amar, I assume you mean Nvidia TensorRT?
Chaoyu
02/23/2023, 9:15 PM
Yes, it is supported in the ongoing BentoML & Triton integration work, where users can use the TensorRT backend in Triton
Chaoyu
02/23/2023, 9:16 PM
BentoML itself also supports TensorRT: you can use TensorRT's Python API to load and run a model via a custom runner
Chaoyu
02/23/2023, 9:16 PM
https://docs.bentoml.org/en/latest/concepts/runner.html#custom-runner
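A rough sketch of what the custom-runner approach described above could look like, assuming the BentoML 1.x `Runnable` API. The engine path, tensor handling, and runner name are placeholders, and actually running this requires a GPU plus the `tensorrt` package; it is an illustration, not BentoML's official TensorRT integration:

```python
# Hypothetical sketch: wrapping a serialized TensorRT engine in a BentoML
# custom runner. "model.engine" and the method body details are assumptions.
import bentoml
import numpy as np


class TensorRTRunnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("nvidia.com/gpu",)  # run on GPU workers
    SUPPORTS_CPU_MULTI_THREADING = False

    def __init__(self):
        import tensorrt as trt  # imported lazily so the class loads without TRT

        logger = trt.Logger(trt.Logger.WARNING)
        with open("model.engine", "rb") as f:  # placeholder engine path
            self.engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
        self.context = self.engine.create_execution_context()

    @bentoml.Runnable.method(batchable=False)
    def predict(self, input_array: np.ndarray) -> np.ndarray:
        # Allocate device buffers, copy the input to the GPU, execute the
        # context, and copy the output back. Buffer management is elided
        # here; see TensorRT's Python samples for the full pattern.
        raise NotImplementedError


trt_runner = bentoml.Runner(TensorRTRunnable, name="trt_runner")
svc = bentoml.Service("trt_service", runners=[trt_runner])
```

The runner is then used from a service API like any built-in runner, e.g. `trt_runner.predict.run(arr)`.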
Amar Ramesh Kamat
02/23/2023, 10:00 PM
@Chaoyu
I meant Triton Inference Server
Chaoyu
02/23/2023, 10:20 PM
Got it, a beta version is coming out in the next release
Amar Ramesh Kamat
02/24/2023, 10:12 AM
Thanks
@Chaoyu
! Any idea when we can expect the next release?