# ask-for-help
Hi Ilya, internally BentoML can handle batches in a consistent order. However, are you sending hundreds of thousands of images over HTTP as one request? That may result in a huge payload and degrade I/O performance. For the second question, yes, BentoML can combine models into a single API endpoint. You just need to convert each model into its own runner and call the runners inside the API endpoint, like:
```python
@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
async def classify(input_series: np.ndarray) -> np.ndarray:
    # runner1 and runner2 are assumed to be defined elsewhere
    # and registered with the service (svc).
    output1 = await runner1.async_run(input_series)
    output2 = await runner2.async_run(output1)
    return output2
```
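For context, the chaining pattern in that endpoint can be illustrated without BentoML at all. Below is a minimal plain-asyncio sketch; `stage1` and `stage2` are hypothetical stand-ins for the two runners, each just a trivial NumPy transform:

```python
import asyncio
import numpy as np

# Stand-ins for runner1/runner2: each "stage" is a trivial numpy
# transform, purely to illustrate chaining two async calls.
async def stage1(x: np.ndarray) -> np.ndarray:
    return x * 2.0

async def stage2(x: np.ndarray) -> np.ndarray:
    return x + 1.0

async def classify(input_series: np.ndarray) -> np.ndarray:
    # Same shape as the BentoML endpoint above: the output of the
    # first stage feeds the second.
    output1 = await stage1(input_series)
    output2 = await stage2(output1)
    return output2

result = asyncio.run(classify(np.array([1.0, 2.0])))
print(result)  # [3. 5.]
```

The point is only the control flow: each awaited call completes before its output is passed on, so results stay in order regardless of how the stages are implemented.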
Thanks a lot, that's what we thought. Regarding the single API endpoint, that would mean using BentoML runners, while we plan on exporting the models as separate Docker containers.