# ai-ml-everything

    Arihant Hirawat

    10/20/2024, 6:27 AM
    Hi BentoML Community, we’ve developed an HTTP API boilerplate for serving ML models with BentoML. 🚀 Through our experience deploying multiple models as API services, we identified several common tasks that are frequently needed. To streamline the process, we decided to create an open-source boilerplate that simplifies deploying any ML model as an HTTP API server. This boilerplate comes with the following features included:
    • 📂 Project structure
    • 🔐 JWT authentication
    • ☁️ Model download from S3
    • ✅ Unit tests
    • 📝 Structured logging
    • 📊 Monitoring
    • 🔗 DynamoDB integration
    • 🔄 CI/CD workflow for building and deploying
    • 🔒 Essential security measures for public API exposure
    We hope this helps others in the community by reducing setup time and improving efficiency when deploying models. Feel free to check it out and contribute! 🙌 Repo: https://github.com/infraspecdev/bentoml-template
    🎉 6
    👍 4
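Of the features listed above, JWT authentication is the easiest to illustrate with the standard library alone. A minimal HS256 sign/verify sketch, assuming shared-secret auth; this is illustrative and not code from the linked repo:

```python
import base64
import hashlib
import hmac
import json
import time

def b64url_encode(data: bytes) -> str:
    # JWT uses unpadded base64url
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(data: str) -> bytes:
    # Restore the padding stripped during encoding
    return base64.urlsafe_b64decode(data + "=" * (-len(data) % 4))

def sign_jwt_hs256(claims: dict, secret: bytes) -> str:
    """Build a signed header.payload.signature token."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(sig)}"

def verify_jwt_hs256(token: str, secret: bytes) -> dict:
    """Check the signature and expiry, then return the claims."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("invalid signature")
    claims = json.loads(b64url_decode(payload_b64))
    if "exp" in claims and claims["exp"] < time.time():
        raise ValueError("token expired")
    return claims
```

In a setup like the boilerplate's, a check like `verify_jwt_hs256` would run in middleware before a request reaches the model endpoint.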

    Sherlock Xu

    10/23/2024, 12:39 AM
    Hello everyone! If you want to create an LLM agent app, read our blog post to see how you can build one with LangGraph and BentoML https://www.bentoml.com/blog/deploying-a-langgraph-agent-application-with-an-open-source-model

    Syed Sadath

    10/31/2024, 2:45 PM
    Hi, I just stepped into BentoML. My use case is building RAGs. Is there any blog post that covers all the general features BentoML offers?

    Sherlock Xu

    11/04/2024, 12:42 AM
    Hi everyone! See this article to explore some top open-source embedding models https://www.bentoml.com/blog/a-guide-to-open-source-embedding-models

    Akshat Sharma

    11/16/2024, 7:14 AM
    🚀 Exploring the Impact of AI & Cybersecurity on Digital Payment Systems 💳🤖 In today’s fast-evolving digital landscape, the integration of AI and robust cybersecurity is transforming how we think about digital payments. From enhanced fraud prevention to seamless transactions, these technologies are reshaping the future of financial systems. Check out my latest blog where I dive into the crucial role AI and cybersecurity play in securing digital payment ecosystems and ensuring a safer online transaction experience. 🔐💡 👉 https://medium.com/@akshat111111/the-impact-of-ai-and-cybersecurity-on-digital-payment-systems-3c93f1a2c35a

    Ritabrata Maiti

    11/18/2024, 6:27 AM
    I’ve been working on AnyModal, a framework for integrating different data types (like images and audio) with LLMs. Existing tools felt too limited or task-specific, so I wanted something more flexible. AnyModal makes it easy to combine modalities with minimal setup—whether it’s LaTeX OCR, image captioning, or chest X-ray interpretation. You can plug in models like ViT for image inputs, project them into a token space for your LLM, and handle tasks like visual question answering or audio captioning. It’s still a work in progress, so feedback or contributions would be great. GitHub: https://github.com/ritabratamaiti/AnyModal
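    For intuition, the "project them into a token space" step can be sketched in plain Python. A minimal sketch, assuming a learned linear layer maps each vision-encoder patch feature into the LLM's embedding space; dimensions and names are illustrative, not AnyModal's actual API:

```python
def linear_projection(features, weights, bias):
    """Map one feature vector (length d_in) into the LLM's embedding
    space (length d_out) via a linear layer: out = W @ f + b."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]

def project_patches(patch_features, weights, bias):
    """One pseudo-token per image patch; the projected vectors are then
    interleaved with the text-token embeddings fed to the LLM."""
    return [linear_projection(f, weights, bias) for f in patch_features]
```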

    Sherlock Xu

    11/19/2024, 12:10 PM
    Hello everyone! Check out our blog post to build a multi-agent app with CrewAI and BentoML: https://www.bentoml.com/blog/building-a-multi-agent-system-with-crewai-and-bentoml

    Sherlock Xu

    11/25/2024, 1:14 PM
    Hello everyone! Read our tutorial to serve AI21’s Jamba 1.5 Mini: https://www.bentoml.com/blog/deploying-ai21-jamba-1-5-mini-with-bentoml. Note that you can also self-host it with OpenLLM!

    Toke Emil Heldbo Reines

    11/27/2024, 12:39 PM
    How do you guys handle quality assurance of your models before shipping to production? We track metrics in MLflow and register promising models in the model registry. We then use those models in the BentoML service when building and containerizing. The build/containerization itself is handled by Jenkins, which builds, containerizes, and pushes to AWS ECR. I'd like to have a QA step either before pushing, or before staging the containers for production. I imagine having some curated data samples with exact expected outputs, plus some larger dataset with overall metrics like AUC and average precision where I expect the model to perform well overall, maybe even some warning mechanism showing where any new model performs worse or better than previous models. What have you built to handle this step or quality-test the final built model/bento?
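    The gate described here can be written generically. A minimal sketch, assuming the candidate model is exposed as a plain `predict` callable that returns a score; function names and thresholds are illustrative, not MLflow, Jenkins, or BentoML APIs:

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank statistic: the fraction
    of (positive, negative) pairs ranked correctly, ties counting 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def run_qa_gate(predict, curated_cases, eval_set, min_auc, champion_auc=None):
    """Return (passed, report). Fails fast on curated samples with exact
    expected outputs, then checks aggregate metrics on a larger eval set."""
    report = {}
    # 1) Curated samples: any mismatch against the exact expected output blocks the release
    report["curated_failures"] = [
        (x, expected, predict(x))
        for x, expected in curated_cases
        if predict(x) != expected
    ]
    if report["curated_failures"]:
        return False, report
    # 2) Aggregate metric threshold on a held-out set (both classes assumed present)
    labels = [y for _, y in eval_set]
    scores = [predict(x) for x, _ in eval_set]
    report["auc"] = auc(labels, scores)
    if report["auc"] < min_auc:
        return False, report
    # 3) Optional regression warning against the current production model
    if champion_auc is not None and report["auc"] < champion_auc:
        report["warning"] = "candidate underperforms current champion"
    return True, report
```

In a Jenkins pipeline, a check like this could run as a stage between building the bento and pushing the container to ECR, failing the build when the gate returns False.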

    Sherlock Xu

    12/03/2024, 4:54 AM
    Hi everyone! See this case study https://www.bentoml.com/blog/neurolabs-faster-time-to-market-and-save-cost-with-bentoml to learn how BentoML helps Neurolabs accelerate its AI journey 🚀

    Sherlock Xu

    12/06/2024, 6:56 AM
    Happy Friday everyone! Read our blog post to see how our new feature, BentoML Codespaces, solves challenges in developing AI applications and speeds up your iteration cycle 🚀 https://www.bentoml.com/blog/accelerate-ai-application-development-with-bentoml-codespaces

    Sherlock Xu

    12/18/2024, 2:17 AM
    Hi everyone! Are you working with ComfyUI workflows? 🚀 Convert them into production-ready APIs with our new project comfy-pack! Check out the full post to see how it works! https://www.bentoml.com/blog/comfy-pack-serving-comfyui-workflows-as-apis
    ❤️ 2

    naga venkata satish kumar seethepalli

    12/18/2024, 7:20 PM
    Hi, is there any timeline for Yatai 2.0?
    👀 3

    Alex

    12/19/2024, 12:47 PM
    Hello, colleagues! I have a question about the vLLM serving example: https://docs.bentoml.com/en/latest/examples/vllm.html. Actually, I do not understand when this setup makes sense. vLLM itself has many features and an advanced design, e.g. batched inference, a request queue, etc. So when is wrapping it in BentoML justified?

    Sherlock Xu

    01/03/2025, 1:59 AM
    👋 Happy new year, everyone! Check out a curated list of popular ComfyUI custom nodes and answers to FAQs https://www.bentoml.com/blog/a-guide-to-comfyui-custom-nodes

    Sherlock Xu

    01/10/2025, 5:36 AM
    Hi everyone! See our blog post with Twilio https://www.twilio.com/en-us/blog/voice-application-conversationrelay-bentoml and learn how you can build a voice AI application with ease 🚀

    Sherlock Xu

    01/16/2025, 8:49 AM
    Hello everyone! See this blog post to learn about structured decoding in vLLM https://www.bentoml.com/blog/structured-decoding-in-vllm-a-gentle-introduction

    Sherlock Xu

    01/21/2025, 4:00 AM
    Hi everyone! Check out our latest blog post to deploy ColPali with BentoML https://www.bentoml.com/blog/deploying-colpali-with-bentoml. Ideal for use cases like large-scale document retrieval.
    👍 2

    Sherlock Xu

    02/08/2025, 1:40 AM
    Happy Friday everyone! 🔍 See our new blog post comparing BentoML and Vertex AI https://www.bentoml.com/blog/comparison-between-vertex-ai-and-bentoml
    💯 1

    Sherlock Xu

    02/15/2025, 12:42 AM
    Hi everyone! 🚀 See our new blog post if you are looking for private DeepSeek deployment https://www.bentoml.com/blog/secure-and-private-deepseek-deployment-with-bentoml

    Sherlock Xu

    02/28/2025, 8:29 AM
    Happy Friday everyone! 🚀 Read our new blog post https://www.bentoml.com/blog/building-ml-pipelines-with-mlflow-and-bentoml to build a seamless ML workflow from experimentation to production with MLflow and BentoML.

    Sherlock Xu

    03/07/2025, 4:46 AM
    Hi everyone! Are you confused about the different versions of DeepSeek? Read our blog post to find the right one for your use case https://www.bentoml.com/blog/the-complete-guide-to-deepseek-models-from-v3-to-r1-and-beyond

    Sherlock Xu

    03/18/2025, 9:15 AM
    Hi everyone! Is your AI infrastructure slowing you down? 👀 We have identified 6 common pitfalls (https://www.bentoml.com/blog/6-infrastructure-pitfalls-slowing-down-your-ai-progress) that keep AI teams stuck and explain how BentoML fixes them.

    Noman Saleem

    04/10/2025, 1:08 PM
    Hi, I am sending an audio file (.wav/.mp3) as form data from Postman with `file` as the key. I am unable to get the audio in the endpoint and process it. Need help. BentoML version: 1.4.8. Here is the code:

```python
@bentoml.api
def transcribe_audio(self, file) -> dict:
    audio, _ = librosa.load(file.file, sr=16000)
    input_values = self.audio_processor(audio, return_tensors="pt", padding="longest").input_values

    # Perform transcription
    with torch.no_grad():
        logits = self.audio_model(input_values).logits
    predicted_ids = torch.argmax(logits, dim=-1)
    transcription = self.audio_processor.decode(predicted_ids[0])

    return {"transcription": transcription}
```

    Error:

```
Traceback (most recent call last):
  File "C:\Users\User\miniconda3\envs\consforc_website\Lib\site-packages\_bentoml_impl\server\app.py", line 604, in api_endpoint_wrapper
    resp = await self.api_endpoint(name, request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\miniconda3\envs\consforc_website\Lib\site-packages\_bentoml_impl\server\app.py", line 668, in api_endpoint
    input_data = await method.input_spec.from_http_request(request, serde)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\miniconda3\envs\consforc_website\Lib\site-packages\_bentoml_sdk\io_models.py", line 213, in from_http_request
    return await serde.parse_request(request, t.cast(t.Type[IODescriptor], cls))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\miniconda3\envs\consforc_website\Lib\site-packages\_bentoml_impl\serde.py", line 224, in parse_request
    data[k] = json.loads(v)
              ^^^^^^^^^^^^^
  File "C:\Users\User\miniconda3\envs\consforc_website\Lib\json\__init__.py", line 341, in loads
    s = s.decode(detect_encoding(s), 'surrogatepass')
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 4: invalid start byte
```

    Also, `--reload` does not work on Windows.
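    A possible fix, offered as an untested sketch: in the BentoML 1.x SDK, a parameter without a type annotation is parsed as JSON, which is why the raw audio bytes end up in `json.loads` and fail to decode. Annotating the parameter as `pathlib.Path` should make BentoML treat it as a file upload instead. The class name `Transcriber` and the setup of `audio_processor`/`audio_model` are assumed from the snippet above:

```python
from pathlib import Path

import bentoml
import librosa
import torch

@bentoml.service
class Transcriber:
    # audio_processor / audio_model would be set up in __init__,
    # as in the original snippet

    @bentoml.api
    def transcribe_audio(self, file: Path) -> dict:
        # With a Path annotation, BentoML saves the uploaded file to a
        # temporary location and passes its path, so librosa can read it
        # directly (no .file attribute needed).
        audio, _ = librosa.load(file, sr=16000)
        input_values = self.audio_processor(
            audio, return_tensors="pt", padding="longest"
        ).input_values
        with torch.no_grad():
            logits = self.audio_model(input_values).logits
        predicted_ids = torch.argmax(logits, dim=-1)
        return {"transcription": self.audio_processor.decode(predicted_ids[0])}
```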

    Sherlock Xu

    04/21/2025, 1:09 AM
    Hello everyone! Are you working to tackle cold starts for LLMs? Read about our journey in our latest post https://www.bentoml.com/blog/cold-starting-llms-on-kubernetes-in-under-30-seconds

    Sherlock Xu

    04/25/2025, 6:16 AM
    Happy Friday! 🚀 See how Yext slashed time-to-market and compute costs with BentoML: https://www.bentoml.com/blog/accelerating-ai-innovation-at-yext-with-bentoml
    🙌 1

    Sherlock Xu

    04/29/2025, 12:53 PM
    Hello everyone! Read our latest post https://www.bentoml.com/blog/how-to-beat-the-gpu-cap-theorem-in-ai-inference to learn about the GPU CAP Theorem and how BentoML can help enterprises solve it 🚀
    👀 1

    Sherlock Xu

    05/09/2025, 6:23 AM
    Happy Friday! Read our post to learn how to deploy and scale Phi-4-reasoning https://www.bentoml.com/blog/deploying-phi-4-reasoning-with-bentoml

    Sherlock Xu

    06/12/2025, 12:47 AM
    Hi everyone! Read our blog post to learn the latest tech about distributed LLM inference https://www.bentoml.com/blog/the-shift-to-distributed-llm-inference

    Xiuyu Yang

    06/23/2025, 2:21 PM
    Hi, everyone! We're delighted to share our latest work applying an interleaved autoregression scheme in MLLMs to self-driving simulation. Welcome to our project page: https://orangesodahub.github.io/InfGen/, and feel free to star our GitHub repo: https://github.com/OrangeSodahub/infgen. Code and models are coming soon!
    🏁 1
    🍱 1
    👀 2