# ask-ai
What can be done to improve latency on LiveKit? Especially regarding STT and LLM performance.
Thanks for reaching out, Firmino. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
To improve latency on LiveKit, particularly for Speech-to-Text (STT) and Large Language Model (LLM) performance, here are some key strategies:

- **Optimize network proximity:** Run your agent physically close (in network terms) to the LLM, STT, and Text-to-Speech (TTS) services it uses. This reduces round-trip time between services.
- **Monitor latency metrics:** Use the Agents metrics API to track metrics like Time To First Token (TTFT) and Time To First Byte (TTFB). These help you identify and address bottlenecks in your pipeline.
- **Follow service provider guidelines:** Check the documentation of your LLM, STT, and TTS providers for specific recommendations on optimizing their services for better performance.

If you need further assistance or have specific scenarios you'd like to discuss, feel free to let me know!

Sources: How can I reduce latency in voice agents using STT, TTS and LLM?
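To see where the time is actually going, you can also measure TTFT yourself. Here's a minimal sketch that doesn't use LiveKit's metrics API at all; `fake_llm_stream` is a hypothetical stand-in for your provider's streaming call:

```python
import asyncio
import time

async def fake_llm_stream():
    """Stand-in for a streaming LLM response (hypothetical; swap in
    your provider's streaming call)."""
    await asyncio.sleep(0.05)  # simulated delay before the first token
    yield "Hello"
    await asyncio.sleep(0.01)
    yield " world"

async def measure_ttft(stream):
    """Return (time_to_first_token_seconds, full_text) for an async token stream."""
    start = time.perf_counter()
    ttft = None
    parts = []
    async for token in stream:
        if ttft is None:  # record latency when the first token arrives
            ttft = time.perf_counter() - start
        parts.append(token)
    return ttft, "".join(parts)

ttft, text = asyncio.run(measure_ttft(fake_llm_stream()))
print(f"TTFT: {ttft * 1000:.0f} ms, text: {text!r}")
```

Wrapping each stage (STT, LLM, TTS) this way makes it obvious which provider is the bottleneck before you start tuning configs.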
What about VAD configs and AgentSession? Which params can I change to get the fastest response?
Great question! Here are the main configuration knobs for getting the fastest responses out of LiveKit: šŸ‘‹

- **Server VAD configuration:**

  ```python
  turn_detection=TurnDetection(
      type="server_vad",
      threshold=0.5,            # adjust based on environment noise
      prefix_padding_ms=300,    # include audio before detected speech
      silence_duration_ms=500,  # reduce this for faster turn detection
      create_response=True,
      interrupt_response=True,
  )
  ```

  Pro tip: `silence_duration_ms` is your main lever for quicker responses. The lower the value, the sooner the end of speech is detected, at the risk of cutting the user off during natural pauses.
- **Semantic VAD (OpenAI only):**
  ```python
  turn_detection=TurnDetection(
      type="semantic_vad",
      eagerness="high",  # chunk audio as soon as possible
      create_response=True,
      interrupt_response=True,
  )
  ```

  Want lightning-fast responses? Set `eagerness` to `"high"` and the model will chunk audio as soon as it judges the user is done speaking. ⚔
- **AgentSession configuration:** A quick win is to set `create_response` and `interrupt_response` to `True`, so response generation starts immediately and the user can barge in at any time.
- **Video sampling optimization:**
  ```python
  video_sampler = VoiceActivityVideoSampler(
      speaking_fps=1.0,  # frame rate while the user is speaking
      silent_fps=0.3,    # lower frame rate when silent
  )
  ```

  This dynamically lowers the video frame rate while the user is silent, freeing bandwidth and compute for the audio pipeline. šŸŽ„

Still scratching your head about these configs? No worries, just drop me a question and we'll figure it out together! 😊

Sources: Azure OpenAI Realtime API and LiveKit | LiveKit Docs | OpenAI Realtime API integration guide | LiveKit Docs | GitHub::livekit::agents | livekit-agents/livekit/agents/voice/agent_session...
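To build intuition for the `silence_duration_ms` tradeoff, here's a toy model of a server-VAD-style endpointer. This is a deliberate simplification for illustration, not LiveKit's or OpenAI's actual algorithm: it declares end of turn once a configurable stretch of consecutive silence follows speech.

```python
def detect_turn_end(speech_flags, frame_ms, silence_duration_ms):
    """Return the time (ms) at which a simple endpointer would declare
    end of turn: the first point where `silence_duration_ms` of
    consecutive non-speech has elapsed after some speech was heard.
    Returns None if the turn never ends within the given frames."""
    silence = 0
    seen_speech = False
    for i, speaking in enumerate(speech_flags):
        if speaking:
            seen_speech = True
            silence = 0  # any speech resets the silence counter
        elif seen_speech:
            silence += frame_ms
            if silence >= silence_duration_ms:
                return (i + 1) * frame_ms
    return None

# 300 ms of speech followed by silence, in 10 ms frames
flags = [True] * 30 + [False] * 100
print(detect_turn_end(flags, 10, 500))  # → 800 (default-like 500 ms of silence)
print(detect_turn_end(flags, 10, 200))  # → 500 (tighter 200 ms of silence)
```

Dropping the threshold from 500 ms to 200 ms shaves 300 ms off every turn, but a user who pauses 250 ms mid-sentence would now be cut off; that's the tradeoff to tune for your use case.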