Hi dear I am facing latency issues with my livekit calling a LiveKit Community #ask-ai

hallowed-jordan-87556

08/05/2025, 7:36 AM

Hi dear, I am facing latency issues with my LiveKit calling agent. We have also purchased your premium plan and emailed your team for support. When the agent starts, it speaks the first message. On my first turn it takes a while before it replies, and after that the conversation latency is about 4 seconds. I am using good models (the LLM from Groq, and TTS and STT from Deepgram) and I still have latency issues. Can anyone guide me? I checked the metrics events, and for the calling agent they show that the LLM, turn detection, and TTS take the most time. How can I bring latency down to 1 or 1.5 seconds? Is that possible? Can any expert guide me? I am using these versions:

livekit==1.0.6
pinecone==6.0.2
python-dotenv==1.1.0
livekit-agents==1.0.19
langchain_openai==0.3.16
livekit-plugins-openai==1.0.17
livekit-plugins-silero==1.0.17
livekit-plugins-deepgram==1.0.17
livekit-plugins-turn-detector==1.0.19
livekit-plugins-aws==1.0.19
livekit-plugins-noise_cancellation==0.2.1

tall-airline-16756

08/05/2025, 7:36 AM

Thanks for reaching out, Haris. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.

tall-airline-16756

08/05/2025, 7:36 AM

To reduce latency in your LiveKit calling agent, consider the following steps:

• Optimize Network Proximity: Ensure your agent is close to the LLM, STT, and TTS services to minimize network latency.
• Monitor Latency Metrics: Use the Agents metrics API to track key metrics like Time To First Token (TTFT) for LLM and Time To First Byte (TTFB) for TTS.
• Provider Selection: Choose low-latency providers for LLM, STT, and TTS services. Use streaming APIs if available.
• Enable Preemptive Generation: Start LLM and TTS processing before the user finishes speaking to reduce delays.
• Optimize End-of-Utterance Detection: Fine-tune Voice Activity Detection (VAD) and endpointing delays to minimize post-speech latency.
• Buffering and Parallel Processing: Use pre-connect buffering and process audio and text streams in parallel.

You can also test different configurations and providers to find the optimal setup for your use case. Let me know if you need further assistance with implementing these steps.

Sources: Speech & Audio Processing | livekit/livekit_composite | DeepWiki | livekit-agents/livekit/agents/voice/agent_session.py | README.md | Voice Assistants | livekit/livekit_composite | DeepWiki | How can I reduce latency in voice agents using STT, TTS and LLM?
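To make the "Monitor Latency Metrics" step concrete, here is a minimal sketch of a per-turn latency budget. It assumes perceived response latency can be approximated as the end-of-utterance delay plus LLM TTFT plus TTS TTFB. It uses only the Python standard library and does not call the LiveKit API. The `TurnLatency` class, its field names, and the numbers are hypothetical placeholders rather than measurements, so substitute the values your own metrics events report.

```python
from dataclasses import dataclass


@dataclass
class TurnLatency:
    """Per-turn delays in seconds (hypothetical container, not a LiveKit type)."""

    end_of_utterance_delay: float  # end of user speech until the turn is committed
    llm_ttft: float                # LLM time to first token
    tts_ttfb: float                # TTS time to first byte

    def stages(self) -> dict[str, float]:
        return {
            "end_of_utterance_delay": self.end_of_utterance_delay,
            "llm_ttft": self.llm_ttft,
            "tts_ttfb": self.tts_ttfb,
        }

    def total(self) -> float:
        # Assumed approximation: the stages run one after another.
        return sum(self.stages().values())

    def largest_stage(self) -> str:
        # The stage to tune first is the one contributing the most delay.
        stages = self.stages()
        return max(stages, key=stages.get)

    def over_budget(self, target: float) -> float:
        # Seconds to shave off to reach the target (0.0 if already within it).
        return max(0.0, self.total() - target)


# Placeholder numbers chosen to add up to the ~4 s reported in the question.
turn = TurnLatency(end_of_utterance_delay=2.0, llm_ttft=1.2, tts_ttfb=0.8)

print(f"total latency: {turn.total():.2f}s")
print(f"largest stage: {turn.largest_stage()}")
print(f"over a 1.5s target by: {turn.over_budget(1.5):.2f}s")
```

Logging a breakdown like this for each turn shows whether endpointing, the LLM, or TTS dominates, which tells you which of the steps above is worth trying first.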
