Sherlock Xu
02/20/2025, 1:38 PM
◦ bentoml code command for creating a Codespace
◦ Auto-sync of local changes to the cloud environment
◦ Access to a variety of powerful cloud GPUs
◦ Real-time logs and debugging through the cloud dashboard
◦ Elimination of dependency headaches and consistent dev and prod environments
🐍 New Python SDK for runtime configurations
◦ Added bentoml.images.PythonImage for defining the Bento runtime environment in Python instead of using bentofile.yaml or pyproject.toml
◦ Added support for customizing runtime configurations (e.g., Python version, system packages, and dependencies) directly in the service.py file
◦ Introduced a context-sensitive run() method for running custom build commands
◦ Backward compatible with existing bentofile.yaml and pyproject.toml configurations
⚡ Accelerated model loading
◦ Implemented build-time model downloads and parallel loading of model weights using safetensors to reduce cold start time and improve scaling performance. See the documentation to learn more.
◦ Added bentoml.models.HuggingFaceModel for loading models from Hugging Face. It supports private model repositories and custom endpoints
◦ Added bentoml.models.BentoModel for loading models from BentoCloud and the Model Store
🌍 External deployment dependencies
◦ Extended bentoml.depends() to support external deployments
◦ Added support for calling BentoCloud Deployments via name or URL
◦ Added support for calling self-hosted HTTP AI services outside BentoCloud
⚠️ Legacy Service API deprecation
◦ The legacy bentoml.Service API (with runners) is now officially deprecated and is scheduled for removal in a future release. We recommend using the @bentoml.service decorator instead.
Note that:
• 1.4 remains fully compatible with Bentos created by 1.3.
• The BentoML documentation has been updated with examples and guides for 1.4.
🙏 As always, we appreciate your continued support!