# openai-realtime
  • w

    wonderful-cat-44048

    04/09/2025, 9:44 AM
    Hey LiveKit team, I want to ask a few things. 1 - Does the multimodal agent support the Silero VAD developed by LiveKit? OpenAI's VAD performs poorly, so I'd like to use Silero VAD. Is that released for the multimodal agent for Node.js? 2 - Same goes for Krisp noise cancellation: does the multimodal agent support Krisp for Node.js? Please answer so that I can implement these features in my project, and if possible, please point me to resources or a guide on how to implement these two things.
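    For reference, here is a minimal sketch of how both pieces are wired in the Python agents 1.x API (Silero VAD on the session, the Krisp-powered BVC filter on the room input); whether the Node.js multimodal agent exposes the same hooks is exactly the open question above, so treat this as the Python-side pattern rather than a Node answer.

    from livekit import agents
    from livekit.agents import Agent, AgentSession, RoomInputOptions
    from livekit.plugins import noise_cancellation, openai, silero


    async def entrypoint(ctx: agents.JobContext):
        await ctx.connect()
        session = AgentSession(
            vad=silero.VAD.load(),                # Silero VAD instead of the server-side VAD
            llm=openai.realtime.RealtimeModel(),  # OpenAI speech-to-speech model
        )
        await session.start(
            room=ctx.room,
            agent=Agent(instructions="You are a helpful voice assistant."),
            room_input_options=RoomInputOptions(
                # Krisp-powered background voice cancellation (LiveKit Cloud)
                noise_cancellation=noise_cancellation.BVC(),
            ),
        )


    if __name__ == "__main__":
        agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))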
  • w

    wonderful-cat-44048

    04/10/2025, 10:27 AM
    One more thing I want to ask: does the VoicePipelineAgent for Node.js support all the things that are present in the Python one? For instance, are Silero VAD, turn detection, and Krisp noise cancellation available in Node.js as well, or does the Node.js one have fewer features compared to the Python one? Please clarify.
  • s

    straight-continent-20682

    04/10/2025, 10:28 PM
    Hey team! We are prototyping a new agentic voice feature and have set up a dev sandbox. While trying it out, we noticed that the agent is responding to and receiving transcriptions from other customers. For example, I see completely different long sentences abruptly appear in the middle of what the voice agent responds, sometimes in other languages. I'm assuming it's a bug, but wanted to give a heads-up here.
  • f

    fancy-glass-72302

    04/12/2025, 7:02 PM
    Hi Team, I'm currently working on a project that uses the following LiveKit-related dependencies:
    • @livekit/agents: 0.7.2
    • livekit-client: 2.9.8
    • livekit-server-sdk: 1.2.7
    I'm encountering a runtime error when trying to run my application. The error occurs in the agent.ts file, where it appears that @livekit/agents@0.7.2 expects @livekit/rtc-node@^0.13.4, but the installed version is 0.8.1, which does not export ParticipantKind. This version mismatch is causing the application to fail. Could you please advise on the correct set of compatible versions for @livekit/agents, livekit-client, and @livekit/rtc-node that should be used together? Alternatively, if there's a known workaround for this specific import error, I would greatly appreciate your guidance. Thank you for your assistance.
  • f

    freezing-balloon-99666

    04/14/2025, 1:37 AM
    Hi Team, I am currently working on a project that uses livekit-agents and livekit-client. I'm facing an issue where my frontend is unable to connect to the same LiveKit URL that I've configured in Render. The connection is not going through from the client side, and I'm wondering if this could be due to a firewall or network configuration issue on the server or platform side. Could you please help check whether any inbound/outbound rules or firewall settings might be blocking the connection? Any guidance on required ports or allowed origins for LiveKit would also be helpful.
  • b

    big-minister-46086

    04/16/2025, 5:53 PM
    Hey Team, can I connect LiveKit agents to my Supabase vector database? If so, do you have any demo for connecting a vector database? Thanks, Eyal
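    In case it helps while waiting for an official demo, here is a minimal sketch of one way to expose a Supabase/pgvector lookup to an agent as a tool. The match_documents RPC, the content column, and the embedding model are assumptions (the usual pgvector setup from Supabase's own tutorials), not a LiveKit-provided integration.

    import os
    from openai import OpenAI
    from supabase import create_client
    from livekit.agents import Agent, RunContext, function_tool

    supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])
    oai = OpenAI()


    class DocsAgent(Agent):
        def __init__(self) -> None:
            super().__init__(instructions="Answer questions, using lookup_docs for facts.")

        @function_tool()
        async def lookup_docs(self, context: RunContext, query: str) -> str:
            """Search the knowledge base for passages relevant to the query."""
            # Embed the query, then call the (assumed) pgvector match function.
            # Both calls are synchronous; offload them in production code.
            emb = oai.embeddings.create(model="text-embedding-3-small", input=query)
            rows = supabase.rpc(
                "match_documents",  # hypothetical SQL function over a pgvector column
                {"query_embedding": emb.data[0].embedding, "match_count": 5},
            ).execute()
            return "\n\n".join(row["content"] for row in rows.data)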
  • t

    thankful-ice-32653

    04/17/2025, 8:41 AM
    Hi Team, I’m currently working on integrating OpenAI Realtime and have implemented dynamic instruction handling from the frontend. Everything works perfectly in my local development environment. However, after deploying the project to Netlify, I’m encountering the following issue:
    The WebSocket connection fails during initialization, and the room connection is aborted before it is established.
    I suspect it may be related to environment-specific configurations or token handling during deployment. Would appreciate your support in resolving this.
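    One thing worth double-checking in cases like this: the access token has to be minted server-side, wherever LIVEKIT_API_KEY and LIVEKIT_API_SECRET are actually set as deployment environment variables, and the client must use the matching wss:// project URL. A minimal sketch with the Python server SDK (the Node livekit-server-sdk has an equivalent AccessToken):

    import os
    from livekit import api


    def create_join_token(identity: str, room: str) -> str:
        token = (
            api.AccessToken(os.environ["LIVEKIT_API_KEY"], os.environ["LIVEKIT_API_SECRET"])
            .with_identity(identity)
            .with_grants(api.VideoGrants(room_join=True, room=room))
        )
        return token.to_jwt()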
  • c

    calm-waitress-37716

    04/27/2025, 12:22 PM
    Hi guys, I'm currently exploring how to make an AI tutor with speech-to-speech (S2S):
    1. The user will enter a topic name they want to learn.
    2. They will also enter an approximate time they want to learn it for.
    3. First, use the ChatGPT gpt-4o text API to create a lesson plan based on the topic and learning time.
    4. Then feed this lesson plan to the voice API, which will have a multi-turn conversation with the user to teach it, after the user presses a "Proceed with Voice" button.
    5. Give a button to generate a "Revision" text of the discussed topic.
    6. Give one more button called "Start Quiz", which, based on the topic and the AI + user's voice chat transcript, generates a quiz via the gpt-4o text API with 5 or 10 MCQ questions depending on the vastness of the discussed topic.
    7. The user will reply in a text box with the answers, which are sent to gpt-4o as a reply; it grades the answers and provides a result.
    8. One more button will appear after the quiz is graded, which discusses in voice the mistakes the user made during the quiz attempt.
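    Steps 3 and 6 are plain text-API calls; a rough sketch with the OpenAI Python SDK (prompts and function names are illustrative, and the voice portion in between would be the LiveKit agent session):

    from openai import OpenAI

    client = OpenAI()


    def make_lesson_plan(topic: str, minutes: int) -> str:
        """Step 3: draft a lesson plan sized to the requested learning time."""
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": "You are a tutor who writes concise lesson plans."},
                {"role": "user", "content": f"Create a {minutes}-minute lesson plan on {topic}."},
            ],
        )
        return resp.choices[0].message.content


    def make_quiz(topic: str, transcript: str) -> str:
        """Step 6: build 5-10 MCQs from the topic and the voice-chat transcript."""
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": "Write 5-10 multiple-choice questions based on the topic and what was actually discussed."},
                {"role": "user", "content": f"Topic: {topic}\n\nTranscript:\n{transcript}"},
            ],
        )
        return resp.choices[0].message.content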
  • c

    calm-shampoo-47633

    04/28/2025, 11:13 AM
    Hey Team, I'm working on a call assistant project using the LiveKit multimodal agent. I’m trying to use a voice from the ElevenLabs voice library instead of the default voices from OpenAI's real-time API. If anyone knows how to set this up, please let me know!
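    The Realtime API renders audio itself with OpenAI's built-in voices, so the usual way to get an ElevenLabs voice is to switch that part of the stack to an STT → LLM → TTS pipeline. A minimal sketch; the STT and LLM plugin choices here are just examples:

    from livekit.agents import AgentSession
    from livekit.plugins import deepgram, elevenlabs, openai, silero

    session = AgentSession(
        vad=silero.VAD.load(),
        stt=deepgram.STT(model="nova-3"),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=elevenlabs.TTS(voice_id="your-voice-id", model="eleven_flash_v2_5"),  # ElevenLabs voice
    )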
  • a

    acoustic-secretary-82257

    04/30/2025, 5:29 AM
    Greetings, I'm trying to migrate a Python bot from 0.x to 1.x and got most things working, except metrics, which were working before. For some reason it seems I never get a metrics update event, so maybe I'm missing something? Running locally at the moment with:
    • LiveKit 1.8.4 (windows amd64)
    • livekit 1.0.6
    • livekit-agents 1.0.17
    I'm using openai.realtime.RealtimeModel.with_azure and registering with @session.on("metrics_collected"), but it never gets fired, while others like @session.on("conversation_item_added") work fine. Does anyone know what I could be missing here, or whether it actually isn't working? Other than that, great work, team!
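    For anyone comparing notes, this is the 1.x handler wiring the docs describe, which is presumably what's already in place here; if it fires with the plain OpenAI model but not with with_azure, that difference sounds worth filing as a bug.

    from livekit.agents import AgentSession, MetricsCollectedEvent, metrics
    from livekit.plugins import openai

    session = AgentSession(
        llm=openai.realtime.RealtimeModel.with_azure(
            azure_deployment="gpt-4o-realtime-preview",  # illustrative deployment name
        ),
    )

    usage = metrics.UsageCollector()


    @session.on("metrics_collected")
    def _on_metrics(ev: MetricsCollectedEvent) -> None:
        metrics.log_metrics(ev.metrics)  # log each metrics event as it arrives
        usage.collect(ev.metrics)        # aggregate usage totals for a summary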
  • c

    calm-shampoo-47633

    04/30/2025, 6:24 AM
    I'm building an AI call assistant using LiveKit's real-time model and OpenAI's real-time API. I'm trying to reduce latency and integrate ElevenLabs voice directly into the real-time model (not the voice pipeline). If anyone has experience with this or any suggestions, I'd really appreciate your help!
  • s

    some-breakfast-27057

    04/30/2025, 9:44 AM
    Hello Team, we successfully integrated OpenAI Realtime with LiveKit and managed to stream the text and voice simultaneously. But we do not get the timestamp information, which makes it a bit challenging to build lip sync fully synchronised with our 3D mascot. Any chance there is a parameter we missed to get the timestamps too?
  • s

    some-breakfast-27057

    05/01/2025, 1:09 PM
    Hey, we are using OpenAI Realtime voice-to-voice with LiveKit. We are getting the full answer with the "response_done" event, but we are not seeing the text_delta that would allow us to get the chunked stream.
  • c

    calm-shampoo-47633

    05/04/2025, 8:41 AM
    I'm using the realtime model for my AI assistant via the OpenAI Realtime API, so I couldn't directly use STT/TTS. Even though I tried these methods, they didn't work; the system still uses the default voice. I've shared my code below; please let me know the correct way to implement it.
    session = AgentSession(
        llm=openai.realtime.RealtimeModel(
            voice="verse",
            turn_detection=TurnDetection(
                type="server_vad",
                eagerness="auto",
                create_response=True,
                interrupt_response=True,
            ),
        ),
        tts=elevenlabs.TTS(
            api_key=os.getenv("ELEVENLABS_API_KEY"),
            voice_id="xnx6sPTtvU635ocDt2j7",
            model="eleven_flash_v2_5",
        ),
    )
  • b

    better-cartoon-57310

    05/05/2025, 9:39 AM
    Hi, OpenAI Realtime isn't letting me disable interruptions with "interrupt_response": False, and I'm doing it exactly as it says in the docs.
    👀 1
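    For reference, interruptions can be configured in two places in the 1.x Python API, and if the server-side flag alone isn't honoured, the session-level switch may be the more reliable one; a sketch, not a confirmed fix:

    from openai.types.beta.realtime.session import TurnDetection
    from livekit.agents import AgentSession
    from livekit.plugins import openai

    session = AgentSession(
        allow_interruptions=False,  # framework-level: the agent's speech is never cut off
        llm=openai.realtime.RealtimeModel(
            turn_detection=TurnDetection(
                type="server_vad",
                create_response=True,
                interrupt_response=False,  # API-level flag from the docs
            ),
        ),
    )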
  • b

    bitter-bird-28505

    05/10/2025, 5:36 AM
    Hi Team, I'm working on modifying the OpenAI realtime example to show only the LLM responses in the chat, without the speech output. However, when I try disabling audio_output using the room output options, the LLM responses no longer appear in the chat either. Is there a way to disable text-to-speech while still displaying the LLM responses in the chat?
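    A sketch of the combination that should give text without speech, assuming the 1.x room output options: audio off, transcription left on. With a realtime model, the model itself may also need to be put into text output mode rather than simply having its audio track dropped; that part is an assumption worth verifying.

    from livekit.agents import Agent, AgentSession, JobContext, RoomOutputOptions
    from livekit.plugins import openai


    async def entrypoint(ctx: JobContext):
        await ctx.connect()
        session = AgentSession(llm=openai.realtime.RealtimeModel())
        await session.start(
            room=ctx.room,
            agent=Agent(instructions="Reply concisely."),
            room_output_options=RoomOutputOptions(
                audio_enabled=False,         # don't publish an audio track
                transcription_enabled=True,  # keep streaming the agent's text to the room
            ),
        )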
  • c

    clean-byte-14528

    05/10/2025, 7:08 PM
    Hey, I keep running into the following for subsequent tool invocations:
    Copy code
    2025-05-10 21:03:53,733 - ERROR mcp.client.sse - Error in sse_reader: peer closed connection without sending complete message body (incomplete chunked read) {"pid": 9119, "job_id": "AJ_brQrxo6TpyYr"}
    2025-05-10 21:04:06,358 - DEBUG livekit.agents - executing tool {"function": "<tool-name>", "arguments": "{}", "speech_id": "speech_71cd2be10451", "pid": 9119, "job_id": "AJ_brQrxo6TpyYr"}
    2025-05-10 21:04:06,359 - INFO mcp-agent-tools - Invoking tool 'get_net_worth' with args: {} {"pid": 9119, "job_id": "AJ_brQrxo6TpyYr"}
    I am using this as a reference and client to connect to my MCP server: https://github.com/livekit-examples/basic-mcp Does anyone have any experience with this?
  • w

    worried-petabyte-38885

    05/10/2025, 8:44 PM
    Is there a way to enable and disable interruptions in the voice pipeline model? For the VoicePipelineAgent, I'd like to have allow_interruptions enabled for 1 minute, then disabled for 1 minute, then enabled again for 1 minute. @refined-appointment-81829
  • m

    mammoth-cricket-59568

    05/13/2025, 5:14 PM
    Can we use gpt-4o-transcribe as the model for inputAudioTranscription yet? I'm finding whisper-1 to be really poor for transcription 🙂
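    On the Python side the transcription model can be set like this in 1.x; whether the Node plugin accepts the same value for inputAudioTranscription is the open question. Python sketch:

    from openai.types.beta.realtime.session import InputAudioTranscription
    from livekit.plugins import openai

    llm = openai.realtime.RealtimeModel(
        input_audio_transcription=InputAudioTranscription(
            model="gpt-4o-transcribe",  # instead of the default whisper-1
            language="en",
        ),
    )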
  • n

    nutritious-umbrella-303

    05/18/2025, 8:42 AM
    Is it possible to preserve the full voice chat history (context) between agents that use the OpenAI Realtime model in a multi-agent workflow? I want the second agent to have not just the text chat history but also all the nuances of how the user spoke (emotions, pronunciation, etc.).
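    Carrying the text history across a handoff is straightforward in the 1.x Python API (sketch below, names illustrative); the audio-level nuance, however, lives inside the provider's realtime session and is not transferred this way, which is essentially the open question here.

    from livekit.agents import Agent, RunContext, function_tool


    class IntakeAgent(Agent):
        def __init__(self) -> None:
            super().__init__(instructions="Collect the caller's issue, then hand off.")

        @function_tool()
        async def transfer_to_support(self, context: RunContext):
            """Hand the conversation over to the support agent."""
            # Pass the accumulated chat history into the next agent.
            return SupportAgent(chat_ctx=self.chat_ctx), "Transferring you now."


    class SupportAgent(Agent):
        def __init__(self, **kwargs) -> None:
            super().__init__(instructions="You are the support specialist.", **kwargs)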
  • m

    microscopic-tiger-52909

    05/19/2025, 4:14 AM
    @refined-appointment-81829 Hey, do you have an update on 1.0 agents for Node.js? It's been a while since it came out for Python. I'm trying to understand whether Node is going to be updated and maintained as much as Python. Any estimate of when this will be released would be great, as it will help many companies plan. For example, would it be available in 2025 at all? cc @dry-elephant-14928
  • a

    alert-pizza-92492

    05/21/2025, 9:58 PM
    I'm building a voice AI agent, and I've got it built with the realtime API; voice and tools are working perfectly. I can't figure out how to set up a toggle between text and voice chat, essentially like how it operates in the console. Any ideas?
  • n

    nutritious-umbrella-303

    05/22/2025, 11:13 PM
    A 5-second timeout is hardcoded in realtime_session.generate_reply. It could be a problem if generate_reply triggers a function tool that takes time (for example, updating instructions). Could you please make this parameter configurable? Or what would be a better solution for my situation? livekit/plugins/openai/realtime/realtime_model.py:
    Copy code
    @dataclass
    class _CreateResponseHandle:
        instructions: NotGivenOr[str]
        done_fut: asyncio.Future[llm.GenerationCreatedEvent]
        timeout: asyncio.TimerHandle | None = None
    
        def timeout_start(self) -> None:
            if self.timeout or self.done_fut is None or self.done_fut.done():
                return
    
            def _on_timeout() -> None:
                if not self.done_fut.done():
                    self.done_fut.set_exception(llm.RealtimeError("generate_reply timed out."))
    
            #                                                  ↓↓↓
            self.timeout = asyncio.get_event_loop().call_later(5.0, _on_timeout)
            self.done_fut.add_done_callback(lambda _: self.timeout.cancel())
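    Until the timeout is configurable, one way to stay inside that 5-second window without patching the plugin is to have the tool return immediately and do the slow work in a background task, then trigger a follow-up reply when it finishes. A rough sketch (names illustrative, not a confirmed pattern):

    import asyncio

    from livekit.agents import Agent, RunContext, function_tool


    class TutorAgent(Agent):
        def __init__(self) -> None:
            super().__init__(instructions="You are a tutor.")

        @function_tool()
        async def switch_topic(self, context: RunContext, topic: str) -> str:
            """Switch the lesson to a new topic."""

            async def _apply() -> None:
                # The slow part runs after the tool has already returned, so the
                # pending generate_reply handle resolves well inside its window.
                await self.update_instructions(f"You are now teaching {topic}.")
                context.session.generate_reply(
                    instructions=f"Tell the user you are ready to start on {topic}."
                )

            asyncio.create_task(_apply())  # keep a reference to the task in real code
            return f"Switching to {topic} now."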
  • a

    acoustic-secretary-82257

    05/28/2025, 2:16 PM
    I'm currently getting the transcripts from the Realtime API using "conversation_item_added" and then reading the message and role from event.item. This works perfectly fine when using OpenAI as openai.realtime.RealtimeModel(), but when I use Azure OpenAI as openai.realtime.RealtimeModel.with_azure(), I only get events on the bot side and am missing the user transcriptions. Does anyone have an idea why, and how to fix it? I am using v1.x (should be the latest at the moment of writing, but I cannot verify).
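    An untested guess at a workaround, assuming with_azure forwards the same input_audio_transcription option as the base constructor (worth verifying against your plugin version): enable input audio transcription explicitly instead of relying on a default.

    from openai.types.beta.realtime.session import InputAudioTranscription
    from livekit.plugins import openai

    llm = openai.realtime.RealtimeModel.with_azure(
        azure_deployment="gpt-4o-realtime-preview",  # illustrative deployment name
        input_audio_transcription=InputAudioTranscription(model="whisper-1"),  # assumption: same kwarg as the base model
    )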
  • e

    enough-country-47784

    06/01/2025, 2:45 PM
    I get this issue every time the agent transfers to another one. Is there something I'm doing wrong?
    Copy code
    2025-06-01 08:27:07,172 - WARNING livekit.plugins.openai - received text-only response from realtime API 
    2025-06-01 08:27:07,257 - WARNING livekit.plugins.openai - trying to recover from text-only response {"retries": 1}
    2025-06-01 08:27:08,012 - WARNING livekit.plugins.openai - trying to recover from text-only response {"retries": 2}
    2025-06-01 08:27:08,892 - WARNING livekit.plugins.openai - trying to recover from text-only response {"retries": 3}
    2025-06-01 08:27:09,649 - WARNING livekit.plugins.openai - trying to recover from text-only response {"retries": 4}
    2025-06-01 08:27:10,398 - WARNING livekit.plugins.openai - trying to recover from text-only response {"retries": 5}
    2025-06-01 08:27:11,242 - ERROR livekit.plugins.openai - failed to recover from text-only response {"retried_times": 5}
    2025-06-01 08:27:11,892 - ERROR livekit.agents - Error in _realtime_reply_task
    👀 1
  • s

    silly-ice-71657

    06/05/2025, 5:59 AM
    Hello Team, I just started my voice agent using the OpenAI realtime S2S model. My simple code, which is from the voice agent quick start guide, is showing me this error:
    .plugins.openai.realtime.realtime_model.RealtimeModel' error=APIConnectionError('OpenAI S2S connection closed unexpectedly') recoverable=False
    2025-06-05 10:52:09,018 - ERROR livekit.agents - AgentSession is closing due to unrecoverable error
    livekit.agents._exceptions.APIConnectionError: OpenAI S2S connection closed unexpectedly
    2025-06-05 10:52:09,018 - ERROR livekit.plugins.openai - Error in _recv_task
    Traceback (most recent call last):
      File "P:\ALLIED BANK Professional DATA\ABL MUAWIN CUSTOMER_PRODUCTION CODE\TEST_VOICE_AGENT\agent_voice\venv\Lib\site-packages\livekit\agents\utils\log.py", line 16, in async_fn_logs
        return await fn(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
      File "P:\ALLIED BANK Professional DATA\ABL MUAWIN CUSTOMER_PRODUCTION CODE\TEST_VOICE_AGENT\agent_voice\venv\Lib\site-packages\livekit\plugins\openai\realtime\realtime_model.py", line 581, in _recv_task
        raise error
    Exception: OpenAI S2S connection closed unexpectedly
    What is the issue? Why am I facing this error?
    1️⃣ 1
    2️⃣ 1
  • n

    nutritious-umbrella-303

    06/05/2025, 8:08 AM
    OpenAI has just released an update for their realtime 4o model: https://platform.openai.com/docs/models/gpt-4o-realtime-preview "We just released an updated snapshot of our speech-to-speech model, now available as gpt-4o-realtime-preview-2025-06-03 in the Realtime API and gpt-4o-audio-preview-2025-06-03 in the Chat Completions API. This update addresses top pieces of user feedback: the model follows instructions more reliably, handles interruptions better, and makes tool calls more consistently. We'd love to hear what you think, especially if these areas have been frustrating in the past. What else would you like to see improved in our speech-to-speech models? Please share your thoughts in this Dev Community thread."
    🙌 5
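    To try the new snapshot from a LiveKit agent, the model name can be selected via the plugin's model argument (1.x Python sketch):

    from livekit.plugins import openai

    llm = openai.realtime.RealtimeModel(model="gpt-4o-realtime-preview-2025-06-03")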
  • l

    loud-park-66193

    06/19/2025, 7:23 PM
    Hi, is anyone experiencing this issue? The same code base was working this morning:
    Copy code
    - INFO livekit.agents - registered worker {"id": "AW_hdt4bkgLCnTR", "url": "<wss://XXXXXX-XXXX.livekit.cloud>", "region": "France", "protocol": 16}
    - INFO livekit.agents - received job request {"job_id": "AJ_jDwdep5K69Eh", "dispatch_id": "", "room_name": "thread-_+XXXXXXXXXFHvijNdHGBY", "agent_name": "", "resuming": false}
    - INFO livekit.agents - initializing process {"pid": 68620}
    - INFO livekit.agents - process initialized {"pid": 68620, "elapsed_time": 0.9}
    - DEBUG asyncio - Using selector: KqueueSelector {"pid": 68620, "job_id": "AJ_jDwdep5K69Eh"}
    - DEBUG livekit.agents - http_session(): creating a new httpclient ctx {"pid": 68620, "job_id": "AJ_jDwdep5K69Eh"}
    - DEBUG livekit.agents - start reading stream {"participant": "sip_+33662230502", "source": "SOURCE_MICROPHONE", "pid": 68620, "job_id": "AJ_jDwdep5K69Eh"}
    - ERROR livekit.agents - Error in _realtime_reply_task
    ....
    
    livekit.agents.llm.realtime.RealtimeError: generate_reply timed out. {
  • a

    ancient-pager-75817

    06/25/2025, 6:47 PM
    If I'm using Gemini Real-Time or any other real-time API when I'm running agent.py locally, is there a way to interact with the agent with text and not just voice?
  • w

    wooden-morning-31190

    07/02/2025, 8:08 AM
    Do you guys support the new Gemini Live with function calls? I didn't see function calls as an option in the docs.
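    In the 1.x Python framework, tools are declared on the agent and handed to whichever realtime model is configured, Gemini included; whether the newest Gemini Live features are covered by the current plugin release is the real question. A sketch, with a stubbed tool:

    from livekit.agents import Agent, AgentSession, RunContext, function_tool
    from livekit.plugins import google


    class WeatherAgent(Agent):
        def __init__(self) -> None:
            super().__init__(instructions="Use the tool when asked about the weather.")

        @function_tool()
        async def get_weather(self, context: RunContext, city: str) -> str:
            """Look up the current weather for a city."""
            return f"It is sunny in {city}."  # stub


    session = AgentSession(llm=google.beta.realtime.RealtimeModel(voice="Puck"))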