I’m working on a voice AI agent using LiveKit, and...
# ask-ai
e
I’m working on a voice AI agent using LiveKit, and I’m facing an issue with a feedback event loop during calls. The problem occurs when the agent speaks (e.g., the greeting), and that output audio is picked up as if it were user input, causing the agent to respond to itself and create a loop.

Details:
• The agent plays an initial greeting (e.g., “Hi, how can I help you?”).
• This audio is then fed back into the STT pipeline as if it were coming from the microphone/user.
• This triggers the LLM again, which produces another response, and the cycle repeats.
• The loop continues until manually stopped.

What I’ve tried / considered:
• Checking whether the STT service can be muted or paused while the agent is speaking.
• Considering using server-side VAD (voice activity detection) or silence detection to ignore audio during the agent’s output.
• Looking into echo cancellation or setting up audio tracks to avoid routing agent output back into input processing.

Questions:
1. Does LiveKit have a built-in or recommended way to prevent agent output from being treated as input?
2. Is there an existing feature to programmatically mute/disable the STT pipeline during TTS playback?
3. Are there configuration options for acoustic echo cancellation (AEC) or track separation to avoid this feedback loop?
t
Thanks for reaching out, Vivek. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Hey there! To help you tackle the feedback loop in your voice AI agent, here are some things you can try:
• Mute STT During TTS Playback: Call `set_audio_enabled(False)` to pause audio input while your agent is speaking, then re-enable it once playback is done. LiveKit's Push-to-Talk Agent example shows this in action.
• Track Separation: Use separate audio tracks for input and output so the agent's output isn't routed back into input processing. The EchoTranscriberAgent example is a useful reference for creating local audio tracks for output.
• Acoustic Echo Cancellation: Turning on noise cancellation and acoustic echo cancellation in your session settings can help. Set the `noise_cancellation` option in your session's input configuration.
• Voice Activity Detection (VAD): Adding VAD lets your agent know when to listen and when to speak, which helps avoid processing audio unnecessarily during output.
These strategies should keep your agent’s output from being mistaken for input and stop the feedback loop. If you want me to walk you through any of these or share some examples, just let me know. I’m here to help!
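As a rough illustration of the mute-during-playback idea, here is a minimal sketch that does not import LiveKit at all. It assumes only that the real input object exposes a `set_audio_enabled(bool)` method, as mentioned above; `FakeInput`, `input_muted`, `speak`, and `tail_seconds` are hypothetical names made up for this example, and `speak` merely simulates TTS playback.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import List, Protocol


class AudioInput(Protocol):
    """Anything exposing set_audio_enabled(bool); assumed shape of the real input."""

    def set_audio_enabled(self, enabled: bool) -> None: ...


class FakeInput:
    """Hypothetical stand-in so the sketch runs without LiveKit installed."""

    def __init__(self) -> None:
        self.enabled = True
        self.history: List[bool] = []

    def set_audio_enabled(self, enabled: bool) -> None:
        self.enabled = enabled
        self.history.append(enabled)


@asynccontextmanager
async def input_muted(audio_input: AudioInput, tail_seconds: float = 0.0):
    """Disable audio input for the duration of the block, then re-enable it.

    tail_seconds is an optional grace period so that trailing playback audio
    is not captured right after the agent stops speaking.
    """
    audio_input.set_audio_enabled(False)
    try:
        yield
    finally:
        if tail_seconds > 0:
            await asyncio.sleep(tail_seconds)
        audio_input.set_audio_enabled(True)


async def speak(text: str) -> None:
    """Hypothetical playback; a real agent would await TTS completion here."""
    await asyncio.sleep(0.01)


async def demo() -> List[bool]:
    mic = FakeInput()
    async with input_muted(mic, tail_seconds=0.01):
        await speak("Hi, how can I help you?")
    return mic.history


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The `try`/`finally` matters here: if playback raises or is cancelled, input is still re-enabled, so the agent does not end up permanently deaf. Note that fully muting input also disables barge-in while the agent is talking.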
e
are there any reference docs for it?
t
Hey there! I’ve put together some reference materials and examples to help you avoid feedback loops in your LiveKit voice AI agents:
• Muting STT During TTS Playback: Use the `set_audio_enabled(False)` method to pause audio input processing while the agent is talking. You can see this in action in the Push-to-Talk Agent example.
• Track Separation: Create separate audio tracks for input and output so they don't get mixed. The EchoTranscriberAgent example is a solid guide for this.
• Acoustic Echo Cancellation: Enable noise cancellation and acoustic echo cancellation in your session settings. Options like BVC (background voice cancellation) or Krisp can improve noise suppression.
• Voice Activity Detection (VAD): Adding VAD helps manage turn detection and stops unnecessary audio processing when the agent is speaking. Silero VAD is a plugin worth checking out for this.
• Turn Detection: LiveKit offers turn detection modes (like VAD-based or STT-based) that can help keep the conversation flowing smoothly and prevent feedback loops.
If you want, I’d be happy to share some detailed code examples or help you dive deeper into any of these features. Just let me know!
Sources: LiveKit Agents integrations | LiveKit Docs | AI voice agents | LiveKit Docs | agents/src/pipeline/pipeline_agent.ts | Speech & Audio Processing | livekit/livekit_composite | DeepWiki | livekit-agents/livekit/agents/voice/agent_session.py
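To make the VAD idea concrete, here is a small, dependency-free sketch of gating audio before it reaches STT. It is not Silero VAD and not a LiveKit API: it swaps in a plain RMS energy threshold as a stand-in for a real VAD model, and `rms`, `frames_for_stt`, `agent_speaking`, and the `threshold` value are all hypothetical names and numbers chosen for illustration.

```python
import math
from typing import List, Sequence


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of 16-bit PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def frames_for_stt(
    frames: Sequence[Sequence[int]],
    agent_speaking: Sequence[bool],
    threshold: float = 500.0,
) -> List[Sequence[int]]:
    """Return only the frames that should be forwarded to STT.

    A frame is dropped if the agent was speaking when it was captured
    (likely echo of the agent's own output), or if its energy is below
    the threshold (likely silence or background noise).
    """
    kept: List[Sequence[int]] = []
    for frame, speaking in zip(frames, agent_speaking):
        if speaking:
            continue  # ignore input captured during agent output
        if rms(frame) >= threshold:
            kept.append(frame)
    return kept


if __name__ == "__main__":
    silence = [0] * 160
    loud = [1000, -1000] * 80
    kept = frames_for_stt([loud, silence, loud], [True, False, False])
    print(len(kept))
```

A fixed energy threshold is much cruder than a trained VAD model, so treat it only as a way to see where the gate sits in the pipeline: before STT, and aware of whether the agent is currently speaking.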