# ask-for-help
  • Ting Chen
    04/07/2025, 1:36 AM
    Hello team 👋, I have a question about the Adaptive Batching feature. Based on the official description, I understand it this way: say some requests arrive each carrying a 1-element array; adaptive batching would combine several of them (into a 2- or 3-element array, say) before processing. Or, given a request with a 5-element array, it might distribute it together with the 2- and 3-element requests into separate batches, depending on real-time traffic load and the settings (max batch size and max latency). Is that how it dynamically works, or am I misunderstanding adaptive batching? The description I'm referring to:
    Copy code
    BentoML supports adaptive batching, a dynamic request dispatching mechanism that intelligently groups multiple requests for more efficient processing. It continuously adjusts batch size and window based on real-time traffic patterns.
    The reason I ask is that I set up a load test and logged the actual batch sizes received by the inference function, but I never see any dynamic batch size come in. So I'm wondering whether some setup is incorrect or I've misunderstood this feature. Can anyone help me with this? 🙏
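    For reference, a minimal sketch of the kind of batchable endpoint being described, written against the BentoML 1.x service API (the names and values are illustrative, not taken from this thread):
    Copy code
    import bentoml

    @bentoml.service
    class Embedder:
        # batchable=True lets the dispatcher merge concurrent requests into one call;
        # max_batch_size caps how large a merged batch may grow, and max_latency_ms
        # bounds how long the dispatcher may wait for more requests before flushing.
        @bentoml.api(batchable=True, max_batch_size=32, max_latency_ms=50)
        def encode(self, inputs: list[str]) -> list[list[float]]:
            print(f"actual batch size: {len(inputs)}")  # log what the dispatcher delivered
            return [[0.0, 0.0, 0.0] for _ in inputs]    # placeholder inference
    Note that batches larger than 1 can only form when requests overlap in time: a load test that sends requests strictly sequentially (waiting for each response) will always observe a batch size of 1.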
  • Anthony Guselnikov
    04/07/2025, 7:17 PM
    Hi, I have this curious issue where BentoML reverts the timeout to 60 seconds after the first API call:
    2025-04-07T14:55:42-0400 [INFO] [entry_service:SentenceTransformers:1] 127.0.0.1:51128 (scheme=http,method=POST,path=/encode,type=application/json,length=201092) (status=200,type=application/json,length=3425048) 143723.858ms (trace=ee7d3af3e7d7d5e92254467826ea6f50,span=38faf12024132b83,sampled=0,service.name=SentenceTransformers)
    ..... <-- after this call, all subsequent calls fail with bentoml.exceptions.ServiceUnavailable: process is overloaded
    2025-04-07T14:57:06-0400 [INFO] [entry_service:SentenceTransformers:1] 127.0.0.1:51302 (scheme=http,method=POST,path=/encode,type=application/json,length=201092) (status=503,type=application/json,length=74) 60017.688ms (trace=4f3ac9e760c18b55a3be7000bacf5600,span=2413b6e7cb2c4293,sampled=0,service.name=SentenceTransformers)
    .....
    bentoml.exceptions.ServiceUnavailable: process is overloaded
    2025-04-07T15:01:55-0400 [INFO] [entry_service:SentenceTransformers:1] 127.0.0.1:51596 (scheme=http,method=POST,path=/encode,type=application/json,length=201092) (status=503,type=application/json,length=74) 60005.034ms (trace=22be9ba69e6af428f8e7586232f5390c,span=4749414bffed68e4,sampled=0,service.name=SentenceTransformers)
  • Anthony Guselnikov
    04/07/2025, 7:18 PM
    I specify the timeout to be 300 seconds in multiple places: bentoml serve --timeout 300, as well as:
    Copy code
    @bentoml.service(
        timeout=300,
        traffic={"timeout": 300},
    )
  • Egmont
    04/09/2025, 5:37 AM
    Hi, I have a question about adaptive batching: I'd like to understand how BentoML manages the mapping between requests and responses. The docs say the input and output of a batchable API should be a List — is there any sort of indexing for identifying which element of the output list goes back to which request, or does it rely on the order of the list? Thank you!
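    Illustratively, the contract for a batchable API is positional (a sketch against the 1.x service API): the dispatcher concatenates elements from several requests into one list and routes outputs back by index, so the handler must preserve order and return exactly one output per input.
    Copy code
    import bentoml

    @bentoml.service
    class Scorer:
        @bentoml.api(batchable=True)
        def score(self, texts: list[str]) -> list[float]:
            # outputs[i] is routed back to the request that contributed texts[i],
            # so order must be preserved and input/output lengths must match.
            return [float(len(t)) for t in texts]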
  • Pang Jin Hui
    04/09/2025, 10:52 AM
    Hi, I have a question about restricting a Path input to multiple content types in a service API input spec, for example both PDF and image. How can I do that?
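    One possible direction, sketched below: bentoml.validators.ContentType pins a field to a single MIME type, so for a "PDF or image" field a fallback is to accept any file and validate manually (the allowed-suffix set is illustrative):
    Copy code
    from pathlib import Path
    from typing import Annotated

    import bentoml
    from bentoml.validators import ContentType

    @bentoml.service
    class DocService:
        # Single-type restriction via an annotation:
        @bentoml.api
        def ingest_pdf(self, doc: Annotated[Path, ContentType("application/pdf")]) -> str:
            return f"accepted {doc.name}"

        # Multiple-type restriction via manual validation:
        @bentoml.api
        def ingest(self, doc: Path) -> str:
            if doc.suffix.lower() not in {".pdf", ".png", ".jpg", ".jpeg"}:
                raise bentoml.exceptions.InvalidArgument(f"unsupported file type: {doc.suffix}")
            return f"accepted {doc.name}"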
  • Minh Huy Vũ Nguyễn
    04/10/2025, 3:30 AM
    Hi, I am working on a project using BentoML's adaptive batching. Is there a way to force an API to wait for a "min_latency_ms" when the first request arrives, even when the service's worker is available?
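    There is no obvious built-in knob for a minimum wait, but a rough, framework-agnostic sketch of enforcing one in front of the model call could look like this (plain asyncio; all names are made up):
    Copy code
    import asyncio

    class MinWaitBatcher:
        """Collect items for at least `min_wait_s` after the first arrival, then flush."""

        def __init__(self, min_wait_s: float = 0.05, max_size: int = 32):
            self.min_wait_s = min_wait_s
            self.max_size = max_size
            self._queue: asyncio.Queue = asyncio.Queue()

        async def submit(self, item):
            fut = asyncio.get_running_loop().create_future()
            await self._queue.put((item, fut))
            return await fut  # resolves once the batch has been processed

        async def run(self, process_batch):
            # Run as a background task: waits for a first item, then holds the
            # batch open until the minimum window elapses or the batch is full.
            while True:
                batch = [await self._queue.get()]
                deadline = asyncio.get_running_loop().time() + self.min_wait_s
                while len(batch) < self.max_size:
                    remaining = deadline - asyncio.get_running_loop().time()
                    if remaining <= 0:
                        break
                    try:
                        batch.append(await asyncio.wait_for(self._queue.get(), remaining))
                    except asyncio.TimeoutError:
                        break
                results = process_batch([item for item, _ in batch])
                for (_, fut), result in zip(batch, results):
                    fut.set_result(result)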
  • Antonio Bevilacqua
    04/14/2025, 9:17 AM
    Hi, I am working on a project that aims to automate BentoML packaging, so I need to be able to condition build dependencies from "outside". I was wondering if there is a way to set dependency specifiers through build options. In other words, I want to define different sets of optional dependencies in pyproject.toml and be able to switch between them using an env variable. Is there a way to do this that I am not seeing? Thanks!
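    One workaround sketch: pick the optional-dependency set with an environment variable and materialize it as a requirements file before building. The BENTO_EXTRA variable and extra names are hypothetical, and this assumes the bentofile points python.requirements_txt at requirements.txt:
    Copy code
    import os
    import subprocess

    # Choose an extra declared under [project.optional-dependencies] in pyproject.toml.
    extra = os.environ.get("BENTO_EXTRA", "cpu")  # hypothetical env variable

    # Install the project itself with the chosen extra.
    with open("requirements.txt", "w") as f:
        f.write(f".[{extra}]\n")

    subprocess.run(["bentoml", "build"], check=True)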
  • Arnault Chazareix
    04/14/2025, 11:25 AM
    Hi! I have a standardized service that I would like to re-use. Is there a way to do this without recreating / inheriting from the base class and re-adding the decorator? Example: a TokenizeService that takes a path / key to a tokenizer as input. I would like my main service to depend on TokenizeService("my_tokenizer_1") and TokenizeService("my_tokenizer_2"). Thanks for your help and for BentoML 🙂
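    A factory function may be the lightest way to get this today: build and decorate a fresh class per parameter (a sketch against the 1.x class-based API; the tokenizer loading is stubbed out):
    Copy code
    import bentoml

    def make_tokenize_service(tokenizer_name: str):
        """Create a distinct bentoml.Service class bound to one tokenizer."""

        @bentoml.service
        class TokenizeService:
            def __init__(self):
                self.tokenizer_name = tokenizer_name  # captured from the factory argument
                # self.tokenizer = load_tokenizer(tokenizer_name)  # hypothetical loader

            @bentoml.api
            def tokenize(self, text: str) -> list[str]:
                return text.split()  # placeholder for real tokenization

        return TokenizeService

    TokenizerA = make_tokenize_service("my_tokenizer_1")
    TokenizerB = make_tokenize_service("my_tokenizer_2")
    One caveat: both generated classes share the class name TokenizeService, so if both are used as dependencies of the same service they may need to be given distinct names.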
  • Rahul Rawat
    04/14/2025, 12:08 PM
    Hey!!! I am building an API using BentoML for production, but the docs only show running it via the CLI. Is there a way (with a code snippet) to run it from code itself? That would help when debugging. Thanks in advance!
    Copy code
    import bentoml
    from dextor.src.api_service import Summarization
    
    if __name__ == "__main__":
        bentoml.serve(Summarization)
  • Shani Rosen
    04/17/2025, 2:00 PM
    Hey everyone 🙂 I'm trying to use the timeout functionality and I've seen some interesting behavior. I'm using:
    Copy code
    @bentoml.service(
        traffic={"timeout": 10, "workers": 1},
    )
    then
    Copy code
    @bentoml.api
    def test(self):
        time.sleep(30)
        print("Hello")
        return
    I do indeed get a timeout after 10 seconds, but on the server it looks like the process is still running: after 30 seconds I see the "Hello". Does anyone know why that happens / how I can kill the process on timeout? Using bentoml 1.4.10. Thanks!
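    A plain-Python illustration of why this can happen: the timeout fails the request, but a blocking time.sleep running in a worker thread cannot be interrupted from outside, so the handler keeps running to completion anyway (a sketch of the general mechanism, not a statement about BentoML internals):
    Copy code
    import threading
    import time

    def handler():
        time.sleep(30)  # blocking call; Python offers no way to kill it externally
        print("Hello")  # still runs even though the caller gave up long ago

    t = threading.Thread(target=handler)
    t.start()
    t.join(timeout=10)          # the caller times out here...
    print("caller timed out")   # ...but the worker thread runs to completion
    Work that should stop at the deadline has to be cooperative, e.g. periodically checking elapsed time or a cancellation flag inside the handler.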
  • Eddie Dunn
    04/17/2025, 2:58 PM
    Hello! I just submitted a ticket and saw this channel. What is the typical response time for tickets submitted through the web form?
  • Sachin Kumar
    04/17/2025, 2:58 PM
    Hello, I am looking to build an IT infrastructure monitoring and observability platform with ITSM and other features. If anyone is interested, please message me personally.
  • Jerry Harrow
    04/17/2025, 5:44 PM
    Any details on the fix for CVE https://github.com/bentoml/BentoML/security/advisories/GHSA-7v4r-c989-xh26? Can you point to the change made in 1.4.8 that resolved it? I'm looking to mitigate that CVE in our use of bentoml; we are stuck on bentoml==1.1.11 due to our use of openllm==0.4.44, so we need to figure out a workaround.
  • Takuo Sato
    04/18/2025, 5:16 AM
    Dear everyone, hi. I want to run an LLM from OpenLLM's default repository on a Kubernetes cluster. The instructions at https://cheatsheet.md/llm-leaderboard/openllm are wrong, at least right now, since OpenLLM doesn't have a build command:
    ubuntu@ip-10-0-21-129:~$ openllm build gemma3 --model-id gemma3/gemma3:1b
    Usage: openllm [OPTIONS] COMMAND [ARGS]...
    Try 'openllm -h' for help.
    Error: No such command 'build'.
    What should I do? I'd be glad for any answer.
  • Idan
    04/18/2025, 9:29 AM
    I signed up on BentoCloud to give it a go with a custom setup, using the $10 USD credits. I left yesterday when my image had failed to build (so there were no active deployments), but when I logged on today, my credits were gone and I can't continue iterating. Why was the account charged if nothing was running?
  • Neural Manacle
    04/20/2025, 6:03 AM
    👋 Hello, team! I'm noishey. I'm building a tool that generates gradients from a text prompt. I've been searching for the right open-source model, as this tool is built lean. Stable Horde is an option, but at peak times it fails to deliver. I'm wondering if there is a light open-source model that can be hit for free. Any info will help. My DMs are open 🍰
  • Guy Melul
    04/24/2025, 12:38 PM
    Hi Bento team, I was wondering if Docker support is planned for ComfyPack any time soon? I'm interested in deploying a few workloads on our on-prem K8s.
  • Pierre Lecerf
    04/28/2025, 8:21 AM
    Hello 👋 I'm wondering how you typically deal with transitive dependencies that rely on Rust. I often have projects that depend on libraries with, let's say, poor dependency hygiene, which pull in transitive dependencies that haven't been maintained for years and are missing wheels for "recent" Python versions. As a result they require cargo and company at build time, which is not part of the default debian image BentoML uses (nor of many other base images, for that matter). How do you deal with these? Do you systematically override the Dockerfile template? Simply using system_packages ends up bloating the final image, right?
  • Oded Valtzer
    04/28/2025, 8:24 PM
    Hi, we have a failing deployment that's stuck in a CrashLoopBackOff. Redeploying doesn't work either, and we can't see anything bad in the logs. How can we debug this?
  • Zuyang Liu
    04/29/2025, 4:35 AM
    Following the hello-world example, I'm making a deployment, but it's stuck at "image building" indefinitely. Any idea why? Meanwhile, in the local terminal I'm getting:
    Copy code
    image=us-central1-docker.pkg.dev/bentoml-prod/bc-mt-guc1/bento-images:yatai.org-strella--gcp-us-central-1.summarization.4ah7mwrewkyuweut.nomodels.s3
    time=2025-04-29T04:32:07.876Z level=INFO source=main.go:1355 msg="bento layer has been pushed to image registry" context-path=/workspace/buildcontext 
    bucket=bc-mt-guc1-infra 
    image=us-central1-docker.pkg.dev/bentoml-prod/bc-mt-guc1/bento-images:yatai.org-strella--gcp-us-central-1.summarization.4ah7mwrewkyuweut.nomodels.s3
    time=2025-04-29T04:32:07.876Z level=INFO source=main.go:1357 msg="pushing image metadata to image registry..." context-path=/workspace/buildcontext 
    bucket=bc-mt-guc1-infra 
    image=us-central1-docker.pkg.dev/bentoml-prod/bc-mt-guc1/bento-images:yatai.org-strella--gcp-us-central-1.summarization.4ah7mwrewkyuweut.nomodels.s3
    time=2025-04-29T04:32:08.036Z level=INFO source=main.go:1362 msg="image metadata has been pushed to image registry" context-path=/workspace/buildcontext 
    bucket=bc-mt-guc1-infra 
    image=us-central1-docker.pkg.dev/bentoml-prod/bc-mt-guc1/bento-images:yatai.org-strella--gcp-us-central-1.summarization.4ah7mwrewkyuweut.nomodels.s3
    time=2025-04-29T04:32:08.036Z level=INFO source=main.go:1364 msg="successfully pushed image to image registry" context-path=/workspace/buildcontext 
    bucket=bc-mt-guc1-infra 
    image=us-central1-docker.pkg.dev/bentoml-prod/bc-mt-guc1/bento-images:yatai.org-strella--gcp-us-central-1.summarization.4ah7mwrewkyuweut.nomodels.s3
    + exit 0
  • Matěj Šmíd
    04/29/2025, 9:27 PM
    On bentoml build I end up with:
    Copy code
    shutil.Error: Destination path '/home/matej/bentoml/tmp/tmptvzmstoibentoml_bento_equivision/env/python/wheels/mmdet-3.3.0.tar.gz' already exists
    The sdist is built twice and it fails on the second round: https://gist.github.com/smidm/2c1f914ce5289b999a213dd747110f1f Any clue how to diagnose this?
  • Matěj Šmíd
    04/30/2025, 7:26 PM
    I advanced a bit further; now I have issues with packages with compiled extensions. The packages mainly come from git repos. uv seems to make source tarballs out of the git repos and just installs the Python sources without building them. The wheels directory seems to be no longer supported. How should I proceed with packages that have compiled extensions?
  • Liu Muzhou
    05/07/2025, 1:25 AM
    Hi, I just want to know whether the https://github.com/bentoml/BentoSGLang example can support autoscaling?
  • Kevin Cui (Black-Hole)
    05/07/2025, 3:47 AM
    How does BentoML charge after deployment? Does billing start as soon as a service is deployed (for both GPU and CPU), or is there no charge while it sits unused after deployment, with billing only starting when GPU or CPU resources are actually utilized?
  • Vincent Lu
    05/07/2025, 4:03 AM
    I added input and output nodes to my Comfy workflow and then tried deploying it, but I got an error. Where can I find what the pattern mismatch is?
  • Vincent Lu
    05/07/2025, 4:04 AM
    Screenshot 2025-05-07 at 12.02.25 AM.png
  • Chris
    05/07/2025, 5:27 AM
    Hello! I set up a simple image service with the "@service" and "@api" decorators. Is there a way, during bentoml serve, to get the current bento tag of the service? I want to store a result.json file and put the service tag in it 🙂
  • Kevin Cui (Black-Hole)
    05/07/2025, 1:31 PM
    Is it expected behavior that bentoml.models.HuggingFaceModel currently does not support setting repo_type? In our scenario we need to use the lukbl/LaTeX-OCR model (repo_type="space"), but currently there is no way to set it. My current approach is to upload the model manually using bentoml models push so that the service can access it.
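    The manual workaround described might look roughly like this; huggingface_hub's snapshot_download does accept repo_type, while the model name and copy step are illustrative:
    Copy code
    import shutil

    import bentoml
    from huggingface_hub import snapshot_download

    # Download the Space repo manually, since HuggingFaceModel doesn't expose repo_type.
    src = snapshot_download(repo_id="lukbl/LaTeX-OCR", repo_type="space")

    # Package the files as a BentoML model; it can then be uploaded with `bentoml models push`.
    with bentoml.models.create("latex-ocr") as model:
        shutil.copytree(src, model.path, dirs_exist_ok=True)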
  • Kevin Cui (Black-Hole)
    05/08/2025, 6:36 AM
    I noticed a strange issue: when my instance count is 0, the first request I send always returns "input parameter is missing." However, once there is at least one instance, the problem does not occur. After the first request, the instance count scales up from 0 to 1; the second request is sent immediately after the first completes (at which point the instance has not scaled down yet, so there is still one instance).
  • Arnault Chazareix
    05/09/2025, 2:48 PM
    Hi 🙂 I see that bentoml serve asks for a built bento rather than a path or other such inputs:
    Copy code
    BENTO is the serving target, it can be the import as:
    - the import path of a 'bentoml.Service' instance
    - a tag to a Bento in local Bento store
    - a folder containing a valid 'bentofile.yaml' build file with a 'service' field, which provides the import path of a 'bentoml.Service' instance
    - a path to a built Bento (for internal & debug use only)
    
    Serve from a bentoml.Service instance source code (for development use only): 'bentoml serve fraud_detector.py:svc'
    What is the risk of serving from an import path to a bentoml.Service when the app is properly containerized in a Dockerfile? Thanks for your help.