# agents
  • f

    few-oxygen-16360

    11/13/2025, 4:51 PM
    Hey guys! We are self-hosting our workers on Cloud Run, and for almost two weeks now we have been facing issues with the worker status: for some reason the worker gets shut down. We can't figure out whether it is coming from our side or from the LiveKit server directly. cc @refined-appointment-81829
  • f

    famous-finland-27043

    11/13/2025, 5:43 PM
    @refined-appointment-81829 Something I would love help with: I’m using a LiveKit voice agent to run mock interviews. Now I want to add a second mode (called human-led interviews) where two human users can join a video conference room using the same service flow (same pre-interview and post-interview logic). The issue is that when I create this conference room, the voice agent automatically joins as a participant. I am using explicit dispatch for the agent, and my entry code looks like this:
    if __name__ == "__main__":
        agents.cli.run_app(
            agents.WorkerOptions(entrypoint_fnc=entrypoint, agent_name="interview-agent")
        )
    Here is what I tried:
    • passing the agent dispatch config conditionally as part of the token -- the agent was still joining the room.
    • avoiding calling ctx.connect (I noticed that's when the agent joins) -- did not work.
    My current fix:
    • muting the agent and hiding its tile in the GridLayout component -- not ideal for sure.
    What is the best way to conditionally choose whether the agent should join? Thanks :)
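With explicit dispatch (agent_name set in WorkerOptions), the agent should only join rooms that carry a dispatch for that agent name, either via the token's room configuration or an explicit dispatch API call, so the usual approach is to attach the dispatch only in agent-led mode. Below is a minimal sketch of that decision, with plain dicts standing in for livekit-api's RoomConfiguration/RoomAgentDispatch types (the dict shape here is illustrative; check the agent dispatch docs for the real types):

```python
# Sketch: only attach an agent dispatch to the token's room configuration
# when the session is agent-led. Plain dicts stand in for the livekit-api
# RoomConfiguration / RoomAgentDispatch types (illustrative, not the real API).

def build_room_config(mode: str, agent_name: str = "interview-agent") -> dict:
    """Return the room configuration to embed in the access token."""
    config: dict = {"agents": []}
    if mode == "agent-led":
        # Explicit dispatch: the named agent is dispatched when the room is created.
        config["agents"].append({"agent_name": agent_name})
    # For "human-led" rooms the agents list stays empty, so no agent joins.
    return config

agent_led = build_room_config("agent-led")
human_led = build_room_config("human-led")
```

The key point is that with an explicit agent_name the worker should never auto-dispatch; if it still joins, it is worth checking that no token or API call is creating a dispatch for that room.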
  • b

    billions-book-73023

    11/13/2025, 6:09 PM
    Looks like Drax, an open-source speech model, just got released: 32x faster than real time with Whisper-level accuracy. This would be a game changer for the app I am building. @refined-appointment-81829 @rough-gpu-50664 any plans for built-in support for this? https://www.speechtechmag.com/Articles/News/Speech-Technology-News/AiOla-Launches-Drax-Open-Source-Speech-Model-172301.aspx
  • h

    handsome-nest-90659

    11/13/2025, 6:53 PM
    hello, does anyone know what LIVEKIT_REMOTE_EOT_URL in https://github.com/livekit/agents/blob/main/livekit-plugins/livekit-plugins-turn-detector/livekit/plugins/turn_detector/multilingual.py is for? is it possible to run inference in an external endpoint?
  • b

    bumpy-quill-11391

    11/13/2025, 8:57 PM
    @refined-appointment-81829 should we expect to see GPT-5.1 available in LiveKit Inference?
  • s

    steep-balloon-41261

    11/13/2025, 9:42 PM
    This message was deleted.
  • a

    adventurous-waiter-12836

    11/14/2025, 12:20 AM
    Is the landing page demo available for use? We couldn’t find it on GitHub. We are trying to better understand some of the settings (min silence, preemptive generation, semantic turn detection vs VAD, etc.). Is there anywhere we can get these? And if not, can you share them? Our aim is to replicate its performance.
  • n

    nutritious-oyster-4341

    11/14/2025, 2:54 AM
    Hi, since Inworld now supports WebSocket-based TTS, can we add the streaming version of their TTS to LiveKit Agents?
  • l

    loud-battery-93295

    11/14/2025, 5:57 AM
    I’ve built one agent successfully, but I’m noticing a delay of around 6–7 seconds before the audio is generated. Could you please help me understand why this might be happening? I’d like to reduce or ideally eliminate the delay for a smoother real-time experience.
  • f

    faint-afternoon-93060

    11/14/2025, 5:59 AM
    Guys, any update on VAD settings for Bengali voice?
  • s

    strong-furniture-4150

    11/14/2025, 6:13 AM
    Hi everyone, I need the best STT and TTS to pronounce Indian words, and I need an Indian English accent for this. Please suggest the best TTS and STT.
  • l

    loud-battery-93295

    11/14/2025, 6:15 AM
    can anyone fix it
  • f

    fancy-motherboard-28624

    11/14/2025, 6:40 AM
    Can we please add support for this in LiveKit: https://elevenlabs.io/docs/cookbooks/text-to-speech/pronunciation-dictionaries
    ➕ 1
  • c

    crooked-action-53836

    11/14/2025, 7:44 AM
    Hi everyone, has anybody played with realtime models for voice AI instead of STT-LLM-TTS combinations? If yes, did you see any change in performance?
  • e

    echoing-lunch-36548

    11/14/2025, 7:55 AM
    Hey, created a PR supporting the newly launched GPT-5.1 model. Here is the PR: https://github.com/livekit/agents/pull/3928
    🙌 1
  • r

    rough-butcher-91877

    11/14/2025, 10:17 AM
    @dry-elephant-14928, we would like to use GPT-5.1 in our current LiveKit implementation. Can we have a release with GPT-5.1?
  • e

    echoing-photographer-35589

    11/14/2025, 11:54 AM
    Hi there, good to join the community. I am new to this world but willing to progress. So far I have succeeded in integrating my lab using Asterisk + SIP Bridge + Redis + LiveKit for learning purposes. Then I borrowed code from livekit-voice-agent to test the inbound flow; so far all OK. However, now I am trying to implement a more realistic flow: a customer calls to ask for their balance and can be transferred to a fake Technical Support. The inbound call is OK, but the function tool to retrieve the customer profile and transfer the call is not. During troubleshooting I found that participant_identity was not available after room setup, so I figured that perhaps I needed to wait for an event. I reviewed the documentation and added code for the events, but they seem NOT to be triggered. Any idea where to look for event triggers within the agent code? This is the code fragment:
    async def entrypoint(ctx: agents.JobContext):
        await ctx.connect()
        agent = Assistant()
        session = AgentSession(llm=openai.realtime.RealtimeModel(voice="coral"))
        await session.start(
            room=ctx.room,
            agent=agent,
            room_input_options=RoomInputOptions()
        )

        # Handler is NOT async; it kicks off the async logic:
        def on_participant_connected(p):
            participant_identity = getattr(p, "participant", None)
            logger.debug(f"ON_PARTICIPANT_JOINED: participant_identity= {participant_identity}")
            print(f"ON_PARTICIPANT_JOINED: participant_identity= {participant_identity}")
            asyncio.create_task(handle_real_join(participant_identity, p))

        ctx.room.on("participant_connected", on_participant_connected)
        logger.debug(f"AFTER participant_connected | CTX.room attributes: {dir(ctx.room)}")
        # This I can see in the logs, but no participant_identity is shown;
        # the previous loggers do not appear either.

        async def handle_real_join(participant_identity, participant_obj):
            # 1. Profile and update
            profile = get_customer_profile(participant_identity)
            logger.debug(f"HANDLE_REAL_JOIN: participant_identity= {participant_identity} | profile= {profile}")
            print(f"HANDLE_REAL_JOIN: participant_identity= {participant_identity} | profile= {profile}")
            agent.update_profile(profile)

            # 2. Greeting
            bienvenida = (
                f"¡Hola, {profile.get('apellido')}! "
                f"Su saldo actual es de {profile.get('saldo')}. "
                f"Tipo de cliente: {get_priority_label(profile.get('priority'))}. "
                "¿En qué puedo ayudarle hoy?"
            )
            # await session.generate_reply(instructions=bienvenida)

            # 3. ONLY here, or in the conversation after a transfer is requested,
            # execute the tool. If you want a quick test, send the transfer here:
            # profile = get_customer_profile('participant_identity')
            participant = next(iter(ctx.room.remote_participants.values()))
            identity = participant.identity if participant else "unknown"
            msg = await agent.transfer_call(identity, ctx.room.name)
            # await session.generate_reply(instructions=msg)
            # Example usage: if the transfer runs, publish the message using the returned result
            # msg = await agent.transfer_call(ctx.room)
            await session.generate_reply(instructions=msg)
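Two things in the fragment above are worth checking: the identity lives at participant.identity (not getattr(p, "participant", None)), and a SIP caller is often already in the room by the time the handler is registered, so participant_connected never fires for them. A minimal sketch of a pattern covering both, using plain-Python stubs in place of the LiveKit objects (FakeRoom/FakeParticipant are illustrative, not LiveKit APIs):

```python
# Sketch (plain Python, stubs in place of livekit objects): handle
# participants that are ALREADY in the room as well as future joins,
# since a SIP caller is usually present before the handler is registered.
# Note the identity is read from participant.identity.

class FakeParticipant:
    def __init__(self, identity: str):
        self.identity = identity

class FakeRoom:
    """Minimal stand-in for rtc.Room: stores handlers and existing participants."""
    def __init__(self, existing):
        self.remote_participants = {p.identity: p for p in existing}
        self._handlers = {}

    def on(self, event, handler):
        self._handlers.setdefault(event, []).append(handler)

def watch_participants(room, on_join):
    # Cover the race: fire for participants already connected...
    for p in room.remote_participants.values():
        on_join(p)
    # ...and subscribe for future ones.
    room.on("participant_connected", on_join)

seen = []
room = FakeRoom(existing=[FakeParticipant("sip-caller-123")])
watch_participants(room, lambda p: seen.append(p.identity))
```

With the real rtc.Room the same shape applies: iterate ctx.room.remote_participants once after connecting, and keep the participant_connected handler for anyone who joins later.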
  • b

    better-house-57730

    11/14/2025, 12:58 PM
    Hi all, LiveKit team and community 👋 Curious how people are building around tool calling and MCPs. Specifically about giving agents a dedicated and isolated execution environment, letting them write code that calls MCPs or tools via a generated API, in contrast to the traditional MCP/tool definitions passed into the LLM context window and letting the LLM choose. These two articles explain it very well: • https://blog.cloudflare.com/code-mode/ • https://www.anthropic.com/engineering/code-execution-with-mcp The main benefits of this approach: • Huge token savings (90%+ in some cases, according to Anthropic). • Less context bloat for LLMs. • LLMs hallucinate less thanks to shorter contexts, and they are much more versed at writing/reading code than MCP definitions. I have 2 questions: • Is LiveKit's tool implementation more like traditional MCP? (my understanding is yes) • Has anyone in the community started implementing tool calling as code execution?
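On the second question: the pattern in those articles can be prototyped independently of LiveKit. A rough sketch of the core loop, with hypothetical tools and a bare exec() standing in for a real sandbox (use an actual isolate/container in production):

```python
# Sketch of the "code mode" idea: instead of passing tool schemas to the
# LLM and letting it emit one call at a time, expose the tools as a small
# generated API and execute LLM-written code against it in a restricted
# namespace. The tools here are hypothetical; a real sandbox (container,
# V8 isolate, etc.) is assumed in production -- exec() only shows the flow.

def get_weather(city: str) -> str:
    return f"Sunny in {city}"

def send_email(to: str, body: str) -> str:
    return f"sent to {to}"

TOOL_API = {"get_weather": get_weather, "send_email": send_email}

def run_generated_code(code: str) -> dict:
    # The LLM writes ordinary code that composes several tool calls
    # locally, so intermediate results never re-enter the context window.
    namespace = {"__builtins__": {}, **TOOL_API, "result": None}
    exec(code, namespace)
    return namespace["result"]

llm_code = """
weather = get_weather("Paris")
status = send_email("user@example.com", weather)
result = {"weather": weather, "status": status}
"""
out = run_generated_code(llm_code)
```

The token savings come from only returning the final `result` to the model, instead of every intermediate tool response.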
  • l

    lively-minister-89835

    11/14/2025, 1:11 PM
    Hello, I am using the qwen3-32B LLM via the Groq integration, but the tool calling is messed up: the agent speaks the tool call out loud, and it shows up in the transcript. It was working fine with a Gemini model:
    tool_call
    {
      "name": "end_call",
      "arguments": {}
    }
    /tool_call
    This is my end_call tool:
    # imports for this snippet (assuming the livekit-agents 1.x layout)
    from livekit.agents import RunContext, function_tool, get_job_context
    from livekit.api import DeleteRoomRequest, ListEgressRequest, StopEgressRequest

    @function_tool()
    async def end_call(context: RunContext) -> dict:
        """
        Ends the call after user signals goodbye or task completion or
        the user is not interested in the conversation.
        """
        await context.session.generate_reply(
            instructions="Close the call politely based on the conversation history"  # noqa: E501
        )
        # Wait for any current speech to complete
        current = context.session.current_speech
        if current:
            await current.wait_for_playout()
    
        # Delete the LiveKit room to hang up for everyone
        job_ctx = get_job_context()
    
        # Stop active egress jobs
        request = ListEgressRequest(room_name=job_ctx.job.room.name, active=True)
        active = await job_ctx.api.egress.list_egress(request)
        for e in active.items:
            await job_ctx.api.egress.stop_egress(
                StopEgressRequest(egress_id=e.egress_id)
            )
    
        await job_ctx.api.room.delete_room(
            DeleteRoomRequest(room=job_ctx.room.name)
        )
        return {"ended": True}
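Until the model behaves, one stopgap is to scrub leaked tool-call markup from the text before it reaches TTS and the transcript. A plain-Python sketch, with the tag names matching the leak shown above (in Agents 1.x a node override such as tts_node might be the place to hook this in, but verify against the docs):

```python
# Stopgap sketch (plain Python): some models leak tool-call markup like
# "tool_call {...} /tool_call" into their text output. Before handing text
# to TTS / the transcript, strip those spans. The tag names here match the
# leak shown above; adjust the pattern to whatever your model emits.
import re

TOOL_CALL_RE = re.compile(r"tool_call\s*\{.*?\}\s*/tool_call", re.DOTALL)

def strip_leaked_tool_calls(text: str) -> str:
    return TOOL_CALL_RE.sub("", text).strip()

clean = strip_leaked_tool_calls(
    'Goodbye! tool_call { "name": "end_call", "arguments": {} } /tool_call'
)
```

This treats the symptom only; the underlying issue is the model's tool-call formatting with that provider, so it is also worth comparing Groq's recommended tool-use settings for qwen3.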
  • f

    full-guitar-90150

    11/14/2025, 3:25 PM
    Hi everyone, I’m a beginner software developer and recently started working with n8n workflows. I’ve now transitioned to using LiveKit for voice-based agents in a few portfolio projects. In n8n, I was used to building workflows via drag-and-drop and configurations in the browser. Now, as I’m considering self-hosting with LiveKit, I’m wondering: Is it possible to integrate RAG (retrieval-augmented generation) and custom tools? I got a bit confused by the documentation, which is why I’m reaching out to the community for some clarification. Any feedback would be greatly appreciated! Thanks in advance!
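On the RAG side: in LiveKit Agents a tool is essentially a Python function (registered with the function_tool decorator), so a retrieval step is just a tool that queries your index. Here is a LiveKit-free sketch with a toy keyword retriever standing in for a real vector store (everything named here is illustrative):

```python
# Toy retrieval function of the kind you would register as an agent tool.
# A real deployment would swap the keyword scorer for a vector-store query;
# the corpus and function names are illustrative, not a LiveKit API.

DOCS = [
    "LiveKit agents can be self-hosted with Docker.",
    "Turn detection decides when the user has finished speaking.",
    "RAG retrieves documents to ground the LLM's answers.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank docs by naive keyword overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(DOCS, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

hits = retrieve("how does RAG ground the LLM")
```

In an agent, you would call a function like this from inside a tool and feed the retrieved snippets back to the LLM as context before it answers.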
  • a

    able-butcher-80092

    11/14/2025, 4:01 PM
    Hi there. Trying to upload google_credentials.json as described in the docs, but I get
    unable to deploy agent: secret name contains invalid characters, only uppercase and lowercase letters, numerals, hyphens, and underscores are allowed
    My file is google_credentials.json and I'm trying with
    lk agent deploy --secret-mount ./google_credentials.json
    Am I doing something wrong? I've tried with =, ,, ' and so on, to no avail.
  • h

    happy-mouse-590

    11/15/2025, 12:27 AM
    Hi. I am currently self-hosting on AWS ECS Fargate. It's all good apart from one issue: when tasks are scaling down, some active sessions might get disrupted. What is a good workaround for this? Shift to ECS with the EC2 launch type? I want to stay self-hosted, so any advice within self-hosting would be super appreciated!
  • n

    nutritious-oyster-4341

    11/15/2025, 12:45 AM
    Hi, any plans on releasing livekit agents written in golang?
  • n

    nutritious-oyster-4341

    11/15/2025, 12:45 AM
    and also this
  • b

    big-application-86058

    11/15/2025, 5:40 AM
    Hi, may I ask whether the LiveKit playground KITT uses turn detection? If yes, what type does it use? I tried playing recorded audio back to it and got interrupted constantly. Please check here:
    test_playground_kitt.mov
  • b

    big-application-86058

    11/15/2025, 5:44 AM
    To me, it looks like the LiveKit playground KITT either isn't using turn detection or it's broken.
  • p

    proud-magician-4503

    11/15/2025, 9:05 AM
    Hi, can anyone tell me what context.session.agent_state does? I want to update my agent state, as my agent is not working at all in my flow.
  • p

    powerful-activity-81417

    11/15/2025, 10:41 AM
    Does LiveKit STT support OpenAI's gpt-4o-transcribe-diarize model?
  • m

    magnificent-nail-24287

    11/15/2025, 2:21 PM
    Could use some help here. I am trying to solve a larger problem: the child processes spawned by the local worker never get closed on disconnect. Am I misunderstanding something here? Is the intended behavior that these processes are left hanging around despite never being used again? That doesn't strike me as the case, since it warns me that the job is unresponsive. Any help greatly appreciated. On the Python SDK this doesn't happen: child processes exit gracefully when I disconnect from the room. Relevant code in thread, but I have repro'd it with the code in the Voice AI quickstart.
  • s

    square-waiter-62432

    11/15/2025, 2:51 PM
    I am trying to use turn detection with Azure STT. Does it work?
    👍 1