Sean
02/20/2024, 4:00 PM
We are excited to announce BentoML v1.2, the biggest release since the launch of v1.0. This release includes improvements from all the learning and feedback from our community over the past year. We invite you to read our release blog post for a comprehensive overview of the new features and the motivations behind their development.
Here are a few key points to note before we delve into the new features:
• v1.2 ensures complete backward compatibility, meaning that Bentos built with v1.1 will continue to function seamlessly with this release.
• We remain committed to supporting v1.1. Critical bug fixes and security updates will be backported to the v1.1 branch.
• BentoML documentation has been updated with examples and guides for v1.2, and more guides are being added every week.
• BentoCloud is fully equipped to handle deployments from both v1.1 and v1.2 releases of BentoML.
⛏️ Introduced a simplified service SDK to empower developers with greater control and flexibility.
• Simplified the service and API interfaces as plain Python classes, allowing developers to add custom logic and use third-party libraries with ease.
• Introduced the @bentoml.service and @bentoml.api decorators to customize the behaviors of services and APIs.
• Moved configuration from YAML files into the @bentoml.service decorator, next to the class definition.
• See the vLLM example, which demonstrates the flexibility of the service API by initializing a vLLM AsyncEngine in the service constructor and running inference with continuous batching in the service API.
🔭 Revamped IO descriptors with more familiar input and output types.
• Enabled the use of Pythonic types directly, without the need for additional IO descriptor definitions or decorations.
• Integrated with Pydantic to leverage its robust validation capabilities and wide array of supported types.
• Expanded support to ML and Generative AI specific IO types.
📦 Updated model saving and loading API to be more generic to enable integration with more ML frameworks.
• Allowed flexible saving and loading of models using the bentoml.models.create API instead of framework-specific APIs such as bentoml.pytorch.save_model and bentoml.tensorflow.save_model.
🚚 Streamlined the deployment workflow to allow more rapid development iterations and a faster time to production.
• Enabled direct deployment to production through CLI and Python API from Git projects.
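From the command line, the deployment flow might look like the following sketch; the repository URL is a placeholder, and a BentoCloud account is assumed:

```shell
# Clone a Git project containing a BentoML service definition (URL is a placeholder).
git clone https://github.com/example/my-service.git
cd my-service

# Authenticate with BentoCloud once, then deploy straight from the project directory.
bentoml cloud login
bentoml deploy .
```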
🎨 Improved API development experience with generated web UI and rich Python client.
• All bentos are now accompanied by a custom-generated UI in the BentoCloud Playground, tailored to their API definitions.
• BentoClient offers a Pythonic way to invoke the service endpoint, allowing parameters to be supplied in native Python format while the client efficiently handles the necessary serialization, ensuring compatibility and performance.
🎭 We’ve learned that the best way to showcase what BentoML can do is not through dry, conceptual documentation but through real-world examples. Check out our current list of examples, and we’ll continue to publish new ones to the gallery as exciting new models are released.
• BentoVLLM
• BentoControlNet
• BentoSDXLTurbo
• BentoWhisperX
• BentoXTTS
• BentoCLIP
🙏 Thank you for your continued support!