# ask-ai
m
Are there any toggles or parameters available that can help improve latency for the voice bot?
t
Thanks for reaching out, Yashish. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Several toggles and parameters are available in LiveKit to help improve latency for a voice bot:

• Turn detection: adjust `min_endpointing_delay` to control the minimum silence duration for end-of-turn detection. Lowering this value can make the bot more responsive.
• Voice Pipeline Agent (VPA) options: parameters like `minEndpointingDelay`, `preemptiveSynthesis`, and `turnDetector` can be tuned for latency optimization.
• TTS provider-specific settings:
  ◦ ElevenLabs: use `streaming_latency` (0-4), `auto_mode` (for full sentences/phrases), and `chunk_length_schedule` to reduce latency.
  ◦ Rime: set `reduce_latency` to `true` to skip text normalization.
  ◦ Speechify: disable `text_normalization` and `loudness_normalization` to avoid extra processing delays.
• Network proximity: place your agent close to your LLM, STT, and TTS services to minimize network delays.
• Metrics monitoring: use the Agents metrics API to track Time To First Token (TTFT) and Time To First Byte (TTFB) for latency analysis.

These options allow you to fine-tune the voice pipeline for your specific latency requirements.

Sources: VoicePipelineAgent | LiveKit Docs | VPAOptions | LiveKit Agents | livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/tts.p... | Rime integration guide | LiveKit Docs | Speechify TTS integration guide | LiveKit Docs
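As a concrete illustration, a few of these knobs plugged into a LiveKit Agents Python session might look like the sketch below. The parameter names come from the docs cited above; the surrounding plugin choices (Deepgram, OpenAI, ElevenLabs) are stand-ins, and exact signatures vary by plugin version, so verify against your installed release:

```python
from livekit.agents import AgentSession
from livekit.plugins import deepgram, elevenlabs, openai

session = AgentSession(
    stt=deepgram.STT(),
    llm=openai.LLM(model="gpt-4o-mini"),
    tts=elevenlabs.TTS(
        streaming_latency=3,              # 0-4: higher values trade quality for speed
        chunk_length_schedule=[80, 120],  # smaller first chunks start audio sooner
    ),
    # Minimum silence (seconds) before a turn is considered finished;
    # lower values respond faster but may cut off slow speakers.
    min_endpointing_delay=0.3,
)
```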
m
what is fast-preresponse?
t
Fast-preresponse is a technique used in LiveKit voice agents to make conversations feel smooth and responsive while the full answer is still being generated: • A lightweight, fast LLM immediately produces a short verbal acknowledgment, something like "Great point!" or "Hmm, let me think about that", the moment you finish speaking. • At the same time, the main LLM works on the detailed, thoughtful answer. • The quick acknowledgment is not added to the chat context; it only fills the silence, so it never influences the main model's response. • The goal is to make sure you never feel like you're talking into a void while the real answer is on its way. By using this dual-LLM approach, fast-preresponse hides the main model's latency and makes the exchange feel more like chatting with a person than waiting for a computer to respond. Sources: examples/voice_agents/fast-preresponse.py | Live Conversations with AI using ChatGPT and WebRTC
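A condensed sketch of the pattern, modeled on the cited fast-preresponse.py example: the class name and prompt here are illustrative, and exact plugin/API signatures depend on your agents version:

```python
from livekit.agents import Agent
from livekit.agents.llm import ChatContext, ChatMessage
from livekit.plugins import groq

class PreResponseAgent(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful assistant.")
        # Small, fast model used only for the instant acknowledgment;
        # the session's main LLM still produces the real answer.
        self._fast_llm = groq.LLM(model="llama-3.1-8b-instant")

    async def on_user_turn_completed(self, turn_ctx: ChatContext, new_message: ChatMessage) -> None:
        # Ask the fast model for a one-phrase acknowledgment of what the
        # user just said, in parallel with the main response generation.
        fast_ctx = turn_ctx.copy()
        fast_ctx.add_message(
            role="system",
            content="Reply with a very short verbal acknowledgment only.",
        )
        stream = self._fast_llm.chat(chat_ctx=fast_ctx)
        # add_to_chat_ctx=False keeps the filler out of the conversation
        # history, so it never influences the main model's answer.
        self.session.say(stream.to_str_iterable(), add_to_chat_ctx=False)
```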
m
When using this code, the agent speaks twice. How can we overcome this?

```python
class IVRNavigatorAgent(Agent):
    """
    Agent 1: Listens for the IVR and uses the FAST LLM for a quick response.
    """

    def __init__(self, *, fast_llm: llm.LLM):
        # super().__init__(instructions="You are an IVR navigator. Wait for the user to speak.")
        super().__init__(instructions=" ")
        self._fast_llm = fast_llm
        self._request_made = False

    async def on_user_turn_completed(self, turn_ctx: ChatContext, new_message: ChatMessage):
        if self._request_made:
            return
        self._request_made = True

        logger.info("IVR has spoken. Generating a fast initial response.")
        prompt = (
            "You are an AI assistant calling a pharmacy. When the IVR answers, your only job is to say the following sentence exactly: "
            "'Hello, I would like to speak with a human agent to collect some information regarding a patient. Could you please connect me?'"
        )
        initial_message = llm.ChatMessage(role="system", content=[prompt])
        chat_ctx = llm.ChatContext(items=[initial_message])
        stream = self._fast_llm.chat(chat_ctx=chat_ctx)
        await self.session.say(stream.to_str_iterable())
        await self.request_completed()

    @function_tool()
    async def request_completed(self) -> Agent:
        logger.info("Initial request made. Handoff -> WaitingAgent")
        return WaitingAgent(chat_ctx=self.session._chat_ctx, fast_llm=self._fast_llm)
```
t
I don't have the answer you're looking for. You could also try asking your question: • in one of the other Slack channels or • to https://deepwiki.com/livekit/livekit_composite which is trained on all LiveKit source code If you find the answer, please post it here to help others!
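For anyone hitting the same issue: one plausible cause, assuming a recent LiveKit Agents version, is that after `on_user_turn_completed` returns, the session still generates its own LLM reply to the same user turn, on top of the manual `session.say(...)`. Raising `StopResponse` at the end of the hook suppresses that automatic reply. A sketch of the handler with that change (the import path for `StopResponse` may vary by version):

```python
from livekit.agents import StopResponse  # location may differ across releases

class IVRNavigatorAgent(Agent):
    # __init__ and request_completed unchanged from the snippet above

    async def on_user_turn_completed(self, turn_ctx: ChatContext, new_message: ChatMessage) -> None:
        if self._request_made:
            raise StopResponse()  # later turns: stay silent instead of replying
        self._request_made = True

        prompt = (
            "You are an AI assistant calling a pharmacy. When the IVR answers, "
            "your only job is to say the scripted request sentence exactly."
        )
        chat_ctx = llm.ChatContext(items=[llm.ChatMessage(role="system", content=[prompt])])
        stream = self._fast_llm.chat(chat_ctx=chat_ctx)
        await self.session.say(stream.to_str_iterable())
        await self.request_completed()
        # Without this, the session also runs its normal LLM turn for the
        # same user message, which is why the agent speaks twice.
        raise StopResponse()
```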