# support

Yakir Saadia

09/22/2022, 8:46 AM
Hello everyone, I am using BentoML v0.13 on 4 GPU-based servers, each with 32 GB RAM, 8 CPUs, and a 16 GB GPU. I am experiencing slow batch formation (it takes time to start handling the requests), and it doesn't form large batches. I am testing it under a load of 1600 requests at 20 requests per second, using BentoML with 6 workers, a max batch size of 30, and a max latency of 1000000. Has anyone else experienced this? Does anyone have an improvement for it? My servers are underutilized at the moment.
👀 1
🍱 1
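For context, in BentoML 0.13 the micro-batching parameters live on the API decorator. A minimal sketch of a service configured as described above (the service class, artifact, and payload shape are hypothetical; the batching parameters match the values in this thread):

```python
# Sketch of a BentoML 0.13 service with micro-batching enabled.
# Service/artifact names and payload shape are hypothetical.
from typing import List

import bentoml
from bentoml.adapters import JsonInput
from bentoml.frameworks.sklearn import SklearnModelArtifact


@bentoml.env(infer_pip_packages=True)
@bentoml.artifacts([SklearnModelArtifact("model")])
class MyService(bentoml.BentoService):
    @bentoml.api(
        input=JsonInput(),
        batch=True,               # handler receives a list of requests
        mb_max_batch_size=30,     # cap micro-batches at 30 requests
        mb_max_latency=1000000,   # max queueing latency (ms)
    )
    def predict(self, inputs: List[dict]):
        # one prediction per request in the micro-batch
        features = [d["features"] for d in inputs]  # hypothetical payload key
        return self.artifacts.model.predict(features).tolist()
```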

Bo

09/23/2022, 4:48 PM
hello @Yakir Saadia I suggest you move to BentoML v1.0 or a later version. Are there any constraints that prevent you from doing that? Feel free to DM me.
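For reference, in BentoML 1.0 adaptive batching is enabled per model signature at save time, and the batch limits move into the runner configuration. A minimal sketch (the framework, model name, and config values are illustrative):

```python
# Sketch: enabling adaptive batching when saving a model in BentoML 1.0.
# The framework (sklearn) and model name are illustrative.
import bentoml
from sklearn.linear_model import LogisticRegression

model = LogisticRegression().fit([[0.0], [1.0]], [0, 1])

bentoml.sklearn.save_model(
    "demo_model",
    model,
    signatures={"predict": {"batchable": True, "batch_dim": 0}},
)

# Runtime batch limits then go in bentoml_configuration.yaml, e.g.:
#   runners:
#     batching:
#       enabled: true
#       max_batch_size: 30
#       max_latency_ms: 10000
```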

Sean

09/25/2022, 7:51 AM
What batch sizes do you see? Is it able to serve all the requests successfully?

Yakir Saadia

09/25/2022, 10:03 AM
It serves all the requests successfully, but it never forms a batch larger than 3.

Sean

09/25/2022, 10:26 AM
Your requests are most likely very fast, and the server is capable of handling them individually without batching. You can try increasing the throughput to push your server closer to its limit.
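One way to push the server harder is to raise the number of in-flight requests rather than holding a fixed request rate; a minimal load-generation sketch (the endpoint URL and payload are hypothetical):

```python
# Minimal load-generation sketch: 1600 requests with 100 in flight
# at a time, instead of a fixed 20 requests/second.
# The endpoint URL and payload are hypothetical.
import concurrent.futures

import requests

URL = "http://localhost:5000/predict"
PAYLOAD = {"features": [1.0, 2.0, 3.0]}

def send_one(_):
    return requests.post(URL, json=PAYLOAD).status_code

with concurrent.futures.ThreadPoolExecutor(max_workers=100) as pool:
    codes = list(pool.map(send_one, range(1600)))

# summary of response codes, e.g. {200: 1600}
print({code: codes.count(code) for code in set(codes)})
```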

Yakir Saadia

09/25/2022, 10:30 AM
But the server takes too long to start handling the requests, so I don't get the requests per second I should be getting.

Sean

09/26/2022, 12:09 AM
Do you mean the server takes time to warm up? The model may take some time to load into memory. You may want to set up some warm-up requests to get the server into a ready state.
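A warm-up can be as simple as retrying a dummy request until the server answers, so the model is loaded before real traffic arrives; a minimal sketch (the URL and payload are hypothetical):

```python
# Sketch: send warm-up requests until the server responds, so the
# model is loaded into memory before real traffic arrives.
# The endpoint URL and dummy payload are hypothetical.
import time

import requests

URL = "http://localhost:5000/predict"
DUMMY = {"features": [0.0, 0.0, 0.0]}

for _ in range(10):
    try:
        requests.post(URL, json=DUMMY, timeout=5).raise_for_status()
        print("server is warm")
        break
    except Exception:
        time.sleep(2)  # not ready yet; retry
```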

Yakir Saadia

09/27/2022, 1:29 PM
But it doesn't only happen when the server first starts; it happens throughout the lifetime of the app.

Bo

09/28/2022, 2:48 AM
@Yakir Saadia can you share more information on this finding?
What's your configuration for the max latency and max batch size?
@Yakir Saadia actually, do you have time for office hours? We can have a more productive meeting over Zoom.

Yakir Saadia

09/28/2022, 9:34 PM
@Bo As I mentioned in the main message of this thread, the max batch size is 30 and the max latency is 1000000. I would be happy to have a Zoom call about it.