GitHub
06/06/2025, 3:33 AM
Legacy APIs are being moved to bentoml.legacy while keeping the references in bentoml for a while. Any reference to bentoml.<legacy_api> will emit a deprecation warning to let users migrate ASAP.
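A minimal sketch of how such a forwarding shim could work, assuming a PEP 562 module-level __getattr__ in bentoml/__init__.py; the relocated names and the mechanism here are illustrative guesses, not BentoML's actual implementation:

# bentoml/__init__.py -- illustrative sketch only
import importlib
import warnings

_LEGACY_NAMES = {"Service", "Runner"}  # hypothetical set of relocated legacy APIs

def __getattr__(name: str):
    # Forward old top-level names to bentoml.legacy and warn the caller.
    if name in _LEGACY_NAMES:
        warnings.warn(
            f"bentoml.{name} is deprecated; import it from bentoml.legacy instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(importlib.import_module("bentoml.legacy"), name)
    raise AttributeError(f"module 'bentoml' has no attribute {name!r}")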
bentoml/BentoML
GitHub
06/09/2025, 12:59 AM
<https://github.com/bentoml/BentoML/tree/main|main>
by frostming
<https://github.com/bentoml/BentoML/commit/17a160cd059141852b390795f8f39da40deeda93|17a160cd>
- refactor: move legacy APIs to a separate module (#5381)
bentoml/BentoML
GitHub
06/13/2025, 12:23 AM
<https://github.com/bentoml/BentoML/tree/main|main>
by frostming
<https://github.com/bentoml/BentoML/commit/913549cc2595f690fcfd31025cfb1969f30dce7e|913549cc>
- feat: support custom service start command (#5382)
bentoml/BentoML
GitHub
06/13/2025, 6:44 AM
<https://github.com/bentoml/BentoML/tree/main|main>
by frostming
<https://github.com/bentoml/BentoML/commit/409a9d8829ab9f4ca02f1852bf21ec66b2ad82c4|409a9d88>
- fix: better way to set service name (#5383)
bentoml/BentoML
GitHub
06/17/2025, 9:59 PM
<https://github.com/bentoml/BentoML/tree/main|main>
by aarnphm
<https://github.com/bentoml/BentoML/commit/dce5d3adbb79dd9f5beafdc21b11557f1db911d4|dce5d3ad>
- chore(config): export accelerator literal type (#5384)
bentoml/BentoML
GitHub
06/19/2025, 5:09 AM
<https://github.com/bentoml/BentoML/tree/main|main>
by aarnphm
<https://github.com/bentoml/BentoML/commit/0fc57118d2a3b6bc9de8c286fe71cab0d97de9b1|0fc57118>
- fix: accept bento type as the bento argument for deployment APIs (#5385)
bentoml/BentoML
GitHub
06/19/2025, 3:22 PM
[2025/06/19 14:19:35] [error] [input:prometheus_scrape:prometheus_scrape.0] error decoding Prometheus Text format
The issue seems to come from the order of the histogram metrics:
all the _sum keys are at the beginning of the metric, followed by the _bucket and _count keys.
### To reproduce
1. Deploy a basic BentoML container with metrics enabled
2. Install fluent-bit (brew install fluent-bit on macOS)
3. Create a basic configuration: fluent-bit.conf
[SERVICE]
    Flush            2
    Log_level        debug
    Daemon           off
    HTTP_Server      on
    HTTP_Listen      0.0.0.0
    HTTP_PORT        2020
[INPUT]
    Name             prometheus_scrape
    Tag              local_metrics
    Scrape_interval  2s
    Host             localhost
    Port             8080
    Metrics_path     /test-metrics.txt
[OUTPUT]
    Name             stdout
    Match            *
    Format           json_lines
4. Create a test-metrics.txt file with the content of the metrics below
5. Launch a basic HTTP server: python3 -m http.server 8080
6. Launch fluent-bit: fluent-bit -c fluent-bit.conf
Content of the test-metrics.txt file (working):
# HELP prediction_time_seconds Time taken for predictions
# TYPE prediction_time_seconds histogram
prediction_time_seconds_sum{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs"} 56.312395095825195
prediction_time_seconds_sum{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs"} 2.419936180114746
prediction_time_seconds_sum{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs"} 0.5229167938232422
prediction_time_seconds_sum{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs"} 4.157390356063843
prediction_time_seconds_sum{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs"} 8.153648376464844
prediction_time_seconds_sum{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs"} 0.32573604583740234
prediction_time_seconds_sum{company_id="e3874ca4-3ea0-46d7-8e8c-359065b0fab9",endpoint="predict_process_collection_and_costs"} 1.031454086303711
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="0.5"} 219.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="1.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="2.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="5.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="10.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="30.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="60.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="+Inf"} 220.0
prediction_time_seconds_count{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs"} 220.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="0.5"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="1.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="2.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="5.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="10.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="30.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="60.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="+Inf"} 8.0
prediction_time_seconds_count{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs"} 8.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="0.5"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="1.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="2.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="5.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="10.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="30.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="60.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="+Inf"} 2.0
prediction_time_seconds_count{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs"} 2.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="0.5"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="1.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="2.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="5.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="10.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="30.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="60.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoin…
bentoml/BentoML
GitHub
06/20/2025, 12:23 AM
<https://github.com/bentoml/BentoML/tree/main|main>
by aarnphm
<https://github.com/bentoml/BentoML/commit/d606ffc2ba37a895e66e65d65165f7b21201d97f|d606ffc2>
- chore: return early python_packages (#5387)
bentoml/BentoML
GitHub
06/23/2025, 7:52 AM
<https://github.com/bentoml/BentoML/tree/main|main>
by Sherlock113
<https://github.com/bentoml/BentoML/commit/66ec5cfe430e063cdf208af87bed7bec18fe7fec|66ec5cfe>
- docs: Update adaptive batching example (#5388)
bentoml/BentoML
GitHub
06/23/2025, 3:27 PM
<https://github.com/bentoml/BentoML/tree/main|main>
by aarnphm
<https://github.com/bentoml/BentoML/commit/da137d23c98016941f2566d7ff6268e787f646e7|da137d23>
- feat: reading bento args from a YAML file (#5389)
bentoml/BentoML
GitHub
06/24/2025, 7:33 AM
<https://github.com/bentoml/BentoML/tree/main|main>
by Sherlock113
<https://github.com/bentoml/BentoML/commit/94a3728d1db5b66f4e2231dec3608d8ddec0f2e4|94a3728d>
- docs: Add --arg-file flag (#5391)
bentoml/BentoML
GitHub
06/24/2025, 9:33 PM
<https://github.com/bentoml/BentoML/tree/main|main>
by aarnphm
<https://github.com/bentoml/BentoML/commit/1b81ddd7da6b41fb3466a2f80ce0aa8cea6bb251|1b81ddd7>
- fix: Adjust keras version comparison (#5392)
bentoml/BentoML
GitHub
06/25/2025, 2:49 PM
…self to access the service
return f"Hello {self.name}"
However, bigger FastAPI applications can be modularized into multiple routers, as described here.
I'm not sure if it's possible, but it would be nice to bind a BentoML service to a specific router. Something like:
import bentoml
from fastapi import APIRouter, Depends, HTTPException

from my_app.auth import get_token_header

router = APIRouter(
    prefix="/inference",
    tags=["inference"],
    dependencies=[Depends(get_token_header)],
)

@bentoml.service
@bentoml.asgi_app_router
class MyService:
    name = "MyService"

    @router.get('/hello')
    def hello(self):  # Inside service class, use self to access the service
        return f"Hello {self.name}"
Then, you add it to the main app like this:
from fastapi import Depends, FastAPI
from my_app.routers import inference
app = FastAPI()
app.include_router(inference.router)
Thanks!
### Motivation
With @bentoml.asgi_app it is currently possible to integrate BentoML with ASGI applications. However, it is difficult to modularize the code and bind BentoML services to specific routers.
### Other
No response
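(For context, not part of the request: a rough sketch of what already works today, mounting a FastAPI app that includes a router onto a service via the existing @bentoml.asgi_app decorator. The path argument and decorator order are assumptions from memory of the docs, and the router handlers here cannot access self, which is exactly the gap the proposal above addresses.)

import bentoml
from fastapi import APIRouter, FastAPI

# Hypothetical router module, mirroring the proposal above.
router = APIRouter(prefix="/inference", tags=["inference"])

@router.get("/hello")
def hello() -> str:
    # No access to the BentoML service instance (self) from here today.
    return "Hello from the inference router"

fastapi_app = FastAPI()
fastapi_app.include_router(router)

@bentoml.asgi_app(fastapi_app, path="/v1")  # assumed signature: app plus mount path
@bentoml.service
class MyService:
    @bentoml.api
    def predict(self, text: str) -> str:
        return text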
bentoml/BentoML
GitHub
06/25/2025, 11:23 PM
<https://github.com/bentoml/BentoML/tree/main|main>
by sauyon
<https://github.com/bentoml/BentoML/commit/ad0d7142db572a65c1af1651690a4141f15908ab|ad0d7142>
- chore: update AWS cloudformation template (#5394)
bentoml/BentoML
GitHub
06/26/2025, 7:07 AM
bento: jtest:ums23fazbcsawiru
name: jtest-lp3b
access_authorization: false
secrets: []
envs: []
services:
  Jtest:
    instance_type: cpu.small
    envs: []
    scaling:
      min_replicas: 0
      max_replicas: 1
      policy:
        scale_up_stabilization_window: 0
        scale_down_stabilization_window: 600
    config_overrides:
      traffic:
        timeout: 60
        external_queue: false
    deployment_strategy: RollingUpdate
  A:
    instance_type: cpu.small
    envs: []
    scaling:
      min_replicas: 0
      max_replicas: 1
      policy:
        scale_up_stabilization_window: 0
        scale_down_stabilization_window: 600
    config_overrides:
      traffic:
        timeout: 60
        external_queue: false
    deployment_strategy: RollingUpdate
cluster: default
canary:
  route_type: header
  route_by: X-Header
  versions:
    A:
      bento: jtest:fglup3qfm6hseiru
      weight: 50
      services:
        Jtest:
          instance_type: cpu.small
          envs: []
          scaling:
            min_replicas: 1
            max_replicas: 1
            policy:
              scale_up_stabilization_window: 0
              scale_down_stabilization_window: 60
          config_overrides:
            traffic:
              timeout: 60
              external_queue: false
          deployment_strategy: RollingUpdate
        A:
          instance_type: cpu.small
          envs: []
          scaling:
            min_replicas: 1
            max_replicas: 1
            policy:
              scale_up_stabilization_window: 0
              scale_down_stabilization_window: 60
          config_overrides:
            traffic:
              timeout: 60
              external_queue: false
          deployment_strategy: RollingUpdate
Fixes #(issue)
## Before submitting:
• Does the Pull Request follow the Conventional Commits naming specification? Here is GitHub's guide on how to create a pull request.
• Does the code follow BentoML's code style, and has the pre-commit run -a script passed (instructions)?
• Did you read through the contribution guidelines and follow the development guidelines?
• Did your changes require updates to the documentation? Have you updated those accordingly? Here are the documentation guidelines and tips on writing docs.
• Did you write tests to cover your changes?
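(Usage note, not part of the PR: a config like the one above would typically be applied through the deployment API. A hedged sketch follows, assuming bentoml.deployment.create accepts a config file path; the config_file parameter name is an assumption to verify against the deployment API docs.)

import bentoml

# Sketch: create a BentoCloud deployment from the canary config above.
# config_file is an assumed parameter name; check the deployment API docs.
deployment = bentoml.deployment.create(config_file="deployment.yaml")
print(deployment)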
bentoml/BentoML
GitHub
06/26/2025, 9:31 AM
<https://github.com/bentoml/BentoML/tree/main|main>
by aarnphm
<https://github.com/bentoml/BentoML/commit/b3f2a1cbc7ba52a0d9a2caa58193d8a12ec661a3|b3f2a1cb>
- docs: Add canary deployment (#5390)
bentoml/BentoML
GitHub
06/27/2025, 7:00 AM
<https://github.com/bentoml/BentoML/tree/main|main>
by Sherlock113
<https://github.com/bentoml/BentoML/commit/ac41a6ff35ef5aa768ee22e31f671804db3447aa|ac41a6ff>
- docs: Add BentoML Sandboxes (#5396)
bentoml/BentoML