# ask-ai

    strong-pizza-31661

    08/03/2025, 12:54 PM
    why is interruption handling not working for this??
    Copy code
    model = google.beta.realtime.RealtimeModel(
        api_key=os.getenv("GEMINI_API_KEY"),
        model="gemini-live-2.5-flash-preview",
        # modalities=["text"],
        language="tr-TR",
        temperature=0.8,
        # input_audio_transcription=None,
        # realtime_input_config=rt_cfg,  # <- typed object with the attr LK needs
    )

    # Create AgentSession with the multimodal model
    session = AgentSession(
        llm=model,
        # tts=elevenlabs.TTS(
        #     language=LANG_MODE,
        #     voice_id="cgSgspJ2msm6clMCkdW9",
        #     streaming_latency=3,
        # ),
        vad=silero.VAD.load(),
        stt=deepgram.STT(
            model="nova-2-general",
            language=LANG_MODE,
            sample_rate=8000,  # critical
        ),
    )

    boundless-truck-87206

    08/03/2025, 1:09 PM
    RuntimeError: The STT (livekit.plugins.openai.stt.STT) does not support streaming, add a VAD to the AgentTask/VoiceAgent to enable streaming. Or manually wrap your STT in a stt.StreamAdapter {"pid": 38555, "job_id": "AJ_JhL5qz3T5mKA"}
    How do I manually wrap my STT in a stt.StreamAdapter? I don't want to add a VAD.

    mysterious-agent-76276

    08/03/2025, 1:59 PM
    is there a way to use the OpenAI RealtimeModel for STT + LLM, while having a separate model handle the TTS part? Based on my testing, if I set modalities to ["text", "audio"] it overrides my TTS model with the OpenAI model.

    brief-vase-33757

    08/03/2025, 3:17 PM
    sample ecs task definition for agents

    ambitious-ice-35806

    08/03/2025, 3:43 PM
    I don't know why, but sometimes my code works perfectly and sometimes the same code gives me an error. I tried installing the previous versions of all the packages, but it didn't work:
    Copy code
    from dotenv import load_dotenv
    from livekit import agents
    from livekit.agents import AgentSession, Agent, RoomInputOptions
    from livekit.plugins import elevenlabs,google,deepgram,noise_cancellation,silero
    from livekit.plugins.turn_detector.multilingual import MultilingualModel
    from Config import saveQualifiedLead,disqualifyLead,CallbackTime
    from Prompts import SYSTEM_PROMPT
    load_dotenv()
    
    
    class Assistant(Agent):
        def __init__(self) -> None:
            super().__init__(instructions=SYSTEM_PROMPT,
                            stt=deepgram.STT(model="nova-2-phonecall"),
                            llm=google.LLM(model="gemini-2.5-flash-preview-05-20"),
                            tts=elevenlabs.TTS(model="eleven_turbo_v2_5",voice_id="UgBBYS2sOqTuMpoF3BR0"),
                            vad=silero.VAD.load(),
                            turn_detection=MultilingualModel(),
                            tools=[saveQualifiedLead,disqualifyLead,CallbackTime]
                            )
    
    
    async def entrypoint(ctx: agents.JobContext):
        
        
        session = AgentSession(allow_interruptions=False)
    
        await session.start(
            room=ctx.room,
            agent=Assistant(),
            room_input_options=RoomInputOptions(
                audio_enabled=True,
                video_enabled=False,
                noise_cancellation=noise_cancellation.BVC(), 
            ),
        )
        
        await session.generate_reply( instructions="Simply Say 'Hello!! How are you Doing!!' ")
    
        
    
    
    if __name__ == "__main__":
        agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint,agent_name="Umer"))
    ❌ Error:
    Copy code
    2025-08-03 20:39:17,262 - INFO livekit.agents - initializing process {"pid": 3276, "inference": true}
    2025-08-03 20:39:20,104 - DEBUG livekit.agents - initializing inference runner {"runner": "lk_end_of_utterance_multilingual", "pid": 3276, "inference": true}
    2025-08-03 20:39:27,270 - INFO livekit.agents - killing process {"pid": 3276, "inference": true}
    2025-08-03 20:39:27,270 - ERROR livekit.agents - worker failed 
    Traceback (most recent call last):
      File "C:\Python313\Lib\asyncio\tasks.py", line 507, in wait_for
        return await fut
               ^^^^^^^^^
      File "D:\livekit-Agent-development\.venv\Lib\site-packages\livekit\agents\ipc\channel.py", line 47, in arecv_message
        return _read_message(await dplx.recv_bytes(), messages)
                             ^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\livekit-Agent-development\.venv\Lib\site-packages\livekit\agents\utils\aio\duplex_unix.py", line 35, in recv_bytes
        len_bytes = await self._reader.readexactly(4)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Python313\Lib\asyncio\streams.py", line 769, in readexactly
        await self._wait_for_data('readexactly')
      File "C:\Python313\Lib\asyncio\streams.py", line 539, in _wait_for_data
        await self._waiter
    asyncio.exceptions.CancelledError
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "D:\livekit-Agent-development\.venv\Lib\site-packages\livekit\agents\cli\_run.py", line 79, in _worker_run
        await worker.run()
      File "D:\livekit-Agent-development\.venv\Lib\site-packages\livekit\agents\worker.py", line 387, in run
        await self._inference_executor.initialize()
      File "D:\livekit-Agent-development\.venv\Lib\site-packages\livekit\agents\ipc\supervised_proc.py", line 169, in initialize
        init_res = await asyncio.wait_for(
                   ^^^^^^^^^^^^^^^^^^^^^^^
        ...<2 lines>...
        )
        ^
      File "C:\Python313\Lib\asyncio\tasks.py", line 506, in wait_for
        async with timeouts.timeout(timeout):
                   ~~~~~~~~~~~~~~~~^^^^^^^^^
      File "C:\Python313\Lib\asyncio\timeouts.py", line 116, in __aexit__
        raise TimeoutError from exc_val
    TimeoutError
    2025-08-03 20:39:27,277 - ERROR livekit.agents - Error in _read_ipc_task 
    Traceback (most recent call last):
      File "D:\livekit-Agent-development\.venv\Lib\site-packages\livekit\agents\utils\aio\duplex_unix.py", line 35, in recv_bytes
        len_bytes = await self._reader.readexactly(4)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Python313\Lib\asyncio\streams.py", line 767, in readexactly
        raise exceptions.IncompleteReadError(incomplete, n)
    asyncio.exceptions.IncompleteReadError: 0 bytes read on a total of 4 expected bytes
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "D:\livekit-Agent-development\.venv\Lib\site-packages\livekit\agents\utils\log.py", line 16, in async_fn_logs
        return await fn(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\livekit-Agent-development\.venv\Lib\site-packages\livekit\agents\cli\watcher.py", line 120, in _read_ipc_task
        msg = await channel.arecv_message(self._pch, proto.IPC_MESSAGES)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\livekit-Agent-development\.venv\Lib\site-packages\livekit\agents\ipc\channel.py", line 47, in arecv_message
        return _read_message(await dplx.recv_bytes(), messages)
                             ^^^^^^^^^^^^^^^^^^^^^^^
      File "D:\livekit-Agent-development\.venv\Lib\site-packages\livekit\agents\utils\aio\duplex_unix.py", line 43, in recv_bytes
        raise DuplexClosed() from e
    livekit.agents.utils.aio.duplex_unix.DuplexClosed

    boundless-truck-87206

    08/03/2025, 4:45 PM
    How do I manually add messages to the chat from an rpc_method?

    chilly-motorcycle-32290

    08/03/2025, 4:58 PM
    How do I receive messages from backend agents in my React app? Currently I'm using useChat and it only helps with sending messages; I'm not able to receive the output from the LiveKit backend.

    victorious-summer-64766

    08/03/2025, 5:03 PM
    Hey, I am facing an issue while retrieving the audio chunks from LiveKit. Could somebody assist me?

    boundless-truck-87206

    08/03/2025, 5:28 PM
    What does the ConversationItemAddedEvent event do?

    many-hair-70963

    08/03/2025, 5:38 PM
    github for livekit hedra integration

    boundless-truck-87206

    08/03/2025, 5:55 PM
    In the livekit framework who calls the llm_node function of the Agent class passing it the chat_ctx?

    many-fall-81099

    08/03/2025, 9:03 PM
    Do you have any reference code where there are multiple participants and a single agent in the room? Basically I'm confused: I'm creating a LiveKit auth token from my Python backend and the agent connects, but I need to manually connect one more remote participant to the same room. I'm using meet.livekit to visualize the room.

    strong-pizza-31661

    08/03/2025, 9:19 PM
    LiveKit bug: why is the AI not able to understand interruptions? This error seems exclusive to Gemini realtime. Code:
    Copy code
    from dotenv import load_dotenv
    from pathlib import Path
    from livekit import agents
    from livekit.agents.voice import AgentSession, Agent
    from livekit.plugins import (
        openai,
        silero,
        google,
        elevenlabs,
        deepgram,
    )

    # Additional imports for Gemini realtime configuration
    from google.genai import types as genai_types

    # Language mode for TTS/STT components ("tr" for Turkish, "en" for English, etc.)
    LANG_MODE = "tr"

    # Realtime input configuration for the Gemini Live model.
    # This mirrors the configuration used in voiceAgentMultiLogging.py so that
    # automatic activity detection and related sensitivities are aligned across
    # our agents.
    rt_cfg = genai_types.RealtimeInputConfig(
        automatic_activity_detection={
            "disabled": False,
            "start_of_speech_sensitivity": "START_SENSITIVITY_LOW",
            "end_of_speech_sensitivity": "END_SENSITIVITY_LOW",
            "prefix_padding_ms": 20,
            "silence_duration_ms": 100,
        }
    )

    import os

    load_dotenv()

    # Load Pronet agent prompt (English)
    PRONET_PROMPT = Path("pronet_agent_prompt_en.txt").read_text(encoding="utf-8")


    class Assistant(Agent):
        def __init__(self) -> None:
            super().__init__(instructions=PRONET_PROMPT)


    async def entrypoint(ctx: agents.JobContext):
        await ctx.connect()

        model = google.beta.realtime.RealtimeModel(
            api_key=os.getenv("GEMINI_API_KEY"),
            model="gemini-live-2.5-flash-preview",
            modalities=["text"],
            temperature=0.8,
            # input_audio_transcription=None,
            realtime_input_config=rt_cfg,  # <- typed object with the attr LK needs
        )

        # Create AgentSession with the multimodal model
        session = AgentSession(
            llm=model,
            tts=elevenlabs.TTS(
                voice_id="cgSgspJ2msm6clMCkdW9",
                language="en",
                streaming_latency=3,
            ),
        )

        # --------------------------------------------------
        # DATA COLLECTION: track conversation items
        # --------------------------------------------------
        chat_history: list[dict] = []

        @session.on("conversation_item_added")
        def _on_item(ev):
            if hasattr(ev.item, "role"):
                role = ev.item.role
                text_content = getattr(ev.item, "text_content", "") or ""
                chat_history.append({"role": role, "text": text_content})

        # Instantiate your Assistant agent and start the voice session
        agent_instance = Assistant()
        await session.start(
            room=ctx.room,
            agent=agent_instance,
        )

        # --------------------------------------------------
        # OUTPUT COLLECTED DATA
        # --------------------------------------------------
        print("\n=== Conversation Transcript ===")
        for turn in chat_history:
            print(f"{turn['role']}: {turn['text']}")
        print("=== End of Transcript ===\n")


    if __name__ == "__main__":
        agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))

    helpful-salesclerk-77327

    08/04/2025, 12:48 AM
    explain to me the difference between using LangGraph as the LLM versus the session.start agent

    nutritious-policeman-86688

    08/04/2025, 1:34 AM
    How do I write a tool function in Python that hangs up the call?

    abundant-magician-17307

    08/04/2025, 1:39 AM
    How do I use LiveKit together with coturn? Are they deployed in separate environments?

    kind-iron-94532

    08/04/2025, 3:17 AM
    How can I disconnect a SIP call after a fixed maximum call duration? How can I provide a max call duration to the agent?

    ancient-hospital-67205

    08/04/2025, 3:29 AM
    ModuleNotFoundError: No module named 'livekit.agents.testing'

    polite-oil-10264

    08/04/2025, 3:30 AM
    Can I control the TTS to not speak certain words, like skipping the input when it contains numbers and only speaking the text?
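One way to sketch this: filter the text stream before it reaches the TTS. The filtering helpers below are plain Python; hooking them in via an `Agent.tts_node` override is an assumption to verify against your framework version:

```python
import re
from typing import AsyncIterator

def strip_numbers(chunk: str) -> str:
    """Remove digit runs so only the surrounding text is spoken."""
    return re.sub(r"\d+", "", chunk)

async def filtered_text(source: AsyncIterator[str]) -> AsyncIterator[str]:
    # clean each streamed text chunk before it reaches the TTS
    async for chunk in source:
        cleaned = strip_numbers(chunk)
        if cleaned:
            yield cleaned

# inside an Agent subclass (assumed override point):
#     async def tts_node(self, text, model_settings):
#         return super().tts_node(filtered_text(text), model_settings)
```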

    helpful-salesclerk-77327

    08/04/2025, 4:23 AM
    how can I set up the livekit agent to only respond when specifically called upon

    calm-train-17221

    08/04/2025, 5:14 AM
    Is there a way to ask for correction based on the confidence of the STT transcription? Is there even a way to get the confidence value of the STT transcription? I'd like for the agent to ask the user to repeat what they said when the confidence is low enough.
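Some STT plugins do report a score: LiveKit's `SpeechEvent` carries transcription alternatives, and Deepgram's results include a confidence field. A duck-typed sketch of a threshold check (field availability varies by plugin, so verify against yours):

```python
def needs_repeat(speech_event, threshold: float = 0.6) -> bool:
    """True when the top transcription alternative is below the threshold."""
    alt = speech_event.alternatives[0]
    # fall back to full confidence if the plugin doesn't report one
    return getattr(alt, "confidence", 1.0) < threshold

# when True, have the agent ask the user to repeat themselves,
# e.g. via session.say("Sorry, could you say that again?")
```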

    helpful-salesclerk-77327

    08/04/2025, 5:44 AM
    is it possible to have the livekit agent say something if there has been a long pause
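One hedged way to sketch it: a resettable idle timer that you reset whenever the user speaks and that triggers agent speech (e.g. `session.generate_reply(...)`) when it fires. The timer itself is plain asyncio:

```python
import asyncio

class IdleNudger:
    """Calls an async callback after a period with no reset()."""

    def __init__(self, on_idle, idle_seconds: float = 10.0):
        self._on_idle = on_idle
        self._idle_seconds = idle_seconds
        self._task = None

    def reset(self) -> None:
        # cancel any pending timer and start a fresh countdown
        if self._task is not None:
            self._task.cancel()
        self._task = asyncio.ensure_future(self._wait())

    async def _wait(self) -> None:
        await asyncio.sleep(self._idle_seconds)
        await self._on_idle()

# assumed wiring: call reset() from a user-speech event handler, and pass
# on_idle=lambda: session.generate_reply(instructions="Gently check in")
```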

    helpful-salesclerk-77327

    08/04/2025, 5:48 AM
    is it possible to have an ai agent hang up from the console

    nice-fish-21757

    08/04/2025, 5:49 AM
    In livekit agents, how to send reason param in shutdown callback?

    abundant-magician-17307

    08/04/2025, 5:58 AM
    Copy code
    turn_servers:
      - host: <government extranet TURN server IP>
        port: 3478
        protocol: udp
        username: turnuser
        credential: secret2024
    For the coturn service configured in LiveKit, does LiveKit itself establish connections through it, or is it used only by clients?

    future-stone-69754

    08/04/2025, 6:12 AM
    How do we get an AD token?
    Overview
    Azure OpenAI provides OpenAI services hosted on Azure. With LiveKit's Azure OpenAI TTS integration and the Agents framework, you can build voice AI applications that sound realistic and natural. To learn more about TTS and generating agent speech, see Agent speech.
    Quick reference
    This section includes a basic usage example and some reference material. For links to more detailed documentation, see Additional resources.
    Installation
    Support for Azure OpenAI TTS is available in the `openai` plugin. Install the plugin from PyPI:
    Copy code
    pip install "livekit-agents[openai]~=1.0"
    Authentication
    Azure OpenAI TTS requires authentication using an API key or a managed identity. Set the following environment variables in your `.env` file:
    Copy code
    AZURE_OPENAI_API_KEY=<azure-openai-api-key>
    AZURE_OPENAI_AD_TOKEN=<azure-openai-ad-token>
    AZURE_OPENAI_ENDPOINT=<azure-openai-endpoint>
    Usage
    Use Azure OpenAI TTS within an `AgentSession` or as a standalone speech generator. For example, you can use this TTS in the Voice AI quickstart.
    Copy code
    from livekit.plugins import openai
    
    session = AgentSession(
       tts=openai.TTS.with_azure(
          model="gpt-4o-mini-tts",
          voice="coral",
       ),
       # ... llm, stt, etc.
    )

    adamant-sandwich-74490

    08/04/2025, 7:03 AM
    Is it possible, when passing metrics to Langfuse, to mark not only `llm_request`'s as an interaction with an AI model, but also other nodes such as `tts_request` or `stt_request` as such? The goal is to easily track TTS and STT model usage and costs from Langfuse's dashboard. Is there anything that can be done from LiveKit's end to achieve this?

    salmon-motherboard-10077

    08/04/2025, 7:34 AM
    I'm looking for the documentation on creating a modular design for the AI agent setup.

    mysterious-agent-76276

    08/04/2025, 7:37 AM
    Is there a way to make the agent pass a JSON to `on_user_turn_completed`? The idea is that I want the LLM to always return JSON, not text, and I want the agent to be able to access the JSON response.

    salmon-motherboard-10077

    08/04/2025, 7:45 AM
    How can I effectively have multiple interview agents created on the same livekit project?