Ting Chen
04/07/2025, 1:36 AMBentoML supports adaptive batching, a dynamic request dispatching mechanism that intelligently groups multiple requests for more efficient processing. It continuously adjusts batch size and window based on real-time traffic patterns.
The reason I ask is that I set up a load test and logged the actual batch size received by the inference function, but I don't see any dynamic batch sizes coming in. So I am wondering whether my setup is incorrect or I have misunderstood this feature. Can anyone help me with this? 🙏
Anthony Guselnikov
04/07/2025, 7:17 PM
Anthony Guselnikov
04/07/2025, 7:18 PM
@bentoml.service(
timeout=300,
traffic={"timeout": 300},
)
Egmont
04/09/2025, 5:37 AM
List, is there any sort of indexing for identifying the output (e.g. which element in the list should go to which request as the response)? Or does it rely on the order of the list? Thank you!
Pang Jin Hui
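On the indexing question: to the best of my knowledge the dispatcher maps outputs back to requests purely by position, i.e. the i-th element of the returned list answers the i-th input of the merged batch, so the handler must preserve order and return exactly one output per input. A rough pure-Python sketch of that contract (run_batched and upper_handler are illustrative names, not BentoML internals):

```python
def run_batched(handler, queued_inputs):
    """Merge queued inputs, call the handler once, and route outputs
    back to each request strictly by list position."""
    outputs = handler(queued_inputs)
    if len(outputs) != len(queued_inputs):
        raise ValueError("handler must return one output per input, in order")
    # The response for request i is outputs[i].
    return {i: out for i, out in enumerate(outputs)}

def upper_handler(batch):
    # Example handler: must preserve input order in its output list.
    return [s.upper() for s in batch]

responses = run_batched(upper_handler, ["a", "b", "c"])
# responses[0] == "A", responses[2] == "C"
```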
04/09/2025, 10:52 AM
Minh Huy Vũ Nguyễn
04/10/2025, 3:30 AM
Antonio Bevilacqua
04/14/2025, 9:17 AM
Arnault Chazareix
04/14/2025, 11:25 AM
TokenizeService
that takes as input a path / key to a tokenizer.
I would like my main service to depend on
TokenizeService("my_tokenizer_1")
and TokenizeService("my_tokenizer_2")
Thanks for your help and for bentoml 🙂
Rahul Rawat
04/14/2025, 12:08 PM
import bentoml
from dextor.src.api_service import Summarization

if __name__ == "__main__":
    bentoml.serve(Summarization)
Shani Rosen
04/17/2025, 2:00 PM
@bentoml.service(
traffic={"timeout": 10, "workers": 1},
)
then
@bentoml.api
def test(self):
    time.sleep(30)
    print("Hello")
    return
I do get a timeout after 10 seconds, but on the server it looks like the process is still running: after 30 seconds I see the "Hello". Does anyone know why that happens, and how I can kill the process in the case of a timeout?
Using bentoml 1.4.10
Thanks!
Eddie Dunn
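This matches how server-side timeouts usually behave: the timeout cancels the wait on the request, not the worker actually executing the blocking handler, so a time.sleep body runs to completion anyway. A small asyncio sketch of that effect (illustrative only, not BentoML's actual internals):

```python
import asyncio
import time

def blocking_handler(log: list) -> str:
    # Stands in for the API body: blocks, then records that it finished.
    time.sleep(0.3)
    log.append("Hello")
    return "done"

async def serve_once() -> list:
    log: list = []
    loop = asyncio.get_running_loop()
    task = loop.run_in_executor(None, blocking_handler, log)
    try:
        # The "request timeout": gives up waiting after 0.05s, but the
        # worker thread itself is never interrupted.
        await asyncio.wait_for(asyncio.shield(task), timeout=0.05)
    except asyncio.TimeoutError:
        print("request timed out")
    # The handler still finishes and "Hello" still lands in the log.
    await task
    return log

print(asyncio.run(serve_once()))
```

Truly killing the work on timeout requires cooperative cancellation inside the handler, or running it in a separate process that can be terminated.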
04/17/2025, 2:58 PM
Sachin Kumar
04/17/2025, 2:58 PM
Jerry Harrow
04/17/2025, 5:44 PM
Takuo Sato
04/18/2025, 5:16 AM
Idan
04/18/2025, 9:29 AM
Neural Manacle
04/20/2025, 6:03 AM
Guy Melul
04/24/2025, 12:38 PM
Pierre Lecerf
04/28/2025, 8:21 AMcargo
and co. at build time, which is not part of the default Debian image BentoML uses, nor of many other base images for that matter.
How do you deal with these? Do you systematically override the Dockerfile template?
Simply using system_packages ends up bloating the final image, right?
Oded Valtzer
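One way to get build-time toolchains without hand-editing the generated Dockerfile is BentoML's docker options in bentofile.yaml; base_image and setup_script are documented fields, though the exact values below are hypothetical:

```yaml
service: "service:MyService"   # hypothetical import path
docker:
  # Start from an image that already ships the toolchain you need...
  base_image: "rust:1.77-slim-bookworm"
  # ...or install it in a build-time script, cleaning package caches in the
  # same layer so the final image stays lean.
  setup_script: "./setup_build_deps.sh"
```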
04/28/2025, 8:24 PM
Zuyang Liu
04/29/2025, 4:35 AM
image=us-central1-docker.pkg.dev/bentoml-prod/bc-mt-guc1/bento-images:yatai.org-strella--gcp-us-central-1.summarization.4ah7mwrewkyuweut.nomodels.s3
time=2025-04-29T04:32:07.876Z level=INFO source=main.go:1355 msg="bento layer has been pushed to image registry" context-path=/workspace/buildcontext
bucket=bc-mt-guc1-infra
image=us-central1-docker.pkg.dev/bentoml-prod/bc-mt-guc1/bento-images:yatai.org-strella--gcp-us-central-1.summarization.4ah7mwrewkyuweut.nomodels.s3
time=2025-04-29T04:32:07.876Z level=INFO source=main.go:1357 msg="pushing image metadata to image registry..." context-path=/workspace/buildcontext
bucket=bc-mt-guc1-infra
image=us-central1-docker.pkg.dev/bentoml-prod/bc-mt-guc1/bento-images:yatai.org-strella--gcp-us-central-1.summarization.4ah7mwrewkyuweut.nomodels.s3
time=2025-04-29T04:32:08.036Z level=INFO source=main.go:1362 msg="image metadata has been pushed to image registry" context-path=/workspace/buildcontext
bucket=bc-mt-guc1-infra
image=us-central1-docker.pkg.dev/bentoml-prod/bc-mt-guc1/bento-images:yatai.org-strella--gcp-us-central-1.summarization.4ah7mwrewkyuweut.nomodels.s3
time=2025-04-29T04:32:08.036Z level=INFO source=main.go:1364 msg="successfully pushed image to image registry" context-path=/workspace/buildcontext
bucket=bc-mt-guc1-infra
image=us-central1-docker.pkg.dev/bentoml-prod/bc-mt-guc1/bento-images:yatai.org-strella--gcp-us-central-1.summarization.4ah7mwrewkyuweut.nomodels.s3
+ exit 0
Matěj Šmíd
04/29/2025, 9:27 PM
bentoml build
I end up with:
shutil.Error: Destination path '/home/matej/bentoml/tmp/tmptvzmstoibentoml_bento_equivision/env/python/wheels/mmdet-3.3.0.tar.gz' already exists
The sdist is made twice and the build fails on the second round: https://gist.github.com/smidm/2c1f914ce5289b999a213dd747110f1f
Any clue how to diagnose this?
Matěj Šmíd
04/30/2025, 7:26 PMLiu Muzhou
05/07/2025, 1:25 AMKevin Cui (Black-Hole)
05/07/2025, 3:47 AMVincent Lu
05/07/2025, 4:03 AMVincent Lu
05/07/2025, 4:04 AMChris
05/07/2025, 5:27 AMKevin Cui (Black-Hole)
05/07/2025, 1:31 PM
bentoml.models.HuggingFaceModel currently does not support setting repo_type?
In our scenario, we need to use the lukbl/LaTeX-OCR model (repo_type="space"), but currently there is no way to set it. My current approach is to manually upload the model using bentoml models push so that the service can access it.
Kevin Cui (Black-Hole)
05/08/2025, 6:36 AM
Arnault Chazareix
05/09/2025, 2:48 PM
BENTO is the serving target, it can be the import as:
- the import path of a 'bentoml.Service' instance
- a tag to a Bento in local Bento store
- a folder containing a valid 'bentofile.yaml' build file with a 'service' field, which provides the import path of a 'bentoml.Service' instance
- a path to a built Bento (for internal & debug use only)
Serve from a bentoml.Service instance source code (for development use only): 'bentoml serve fraud_detector.py:svc'
What is the risk of serving from an import path to a bentoml.Service when the app is properly containerized with a Dockerfile?
Thanks for your help