able-jackal-84140
07/04/2025, 7:30 PM

tall-airline-16756
07/04/2025, 7:30 PM

• STT Node: First, your speech gets converted into text using the stt_node() function - pretty cool, right?
• LLM Node: Then, our smart AI takes that text and thinks up a clever response with the llm_node() function.
• TTS Node: Finally, the AI's response gets turned back into speech using the tts_node() function.
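Conceptually, each node is just a stage that transforms one stream into another, so the three stages compose by chaining. Here's a minimal plain-Python sketch of that idea - not the actual LiveKit API (the real stt_node/llm_node/tts_node are overridable methods on the Agent class); every function below is a hypothetical stand-in:

```python
import asyncio

# Hypothetical stand-ins for the three pipeline stages. Each one is an
# async generator that consumes the previous stage's stream.

async def stt_node(audio_frames):
    # Pretend each "audio frame" decodes directly into a word of text.
    async for frame in audio_frames:
        yield frame.decode()

async def llm_node(text_stream):
    # Toy "LLM": wait for the full transcript, then emit one reply chunk.
    transcript = " ".join([chunk async for chunk in text_stream])
    yield f"You said: {transcript}"

async def tts_node(text_stream):
    # Pretend synthesis: one "audio chunk" (bytes) per text chunk.
    async for chunk in text_stream:
        yield chunk.encode()

async def mic():
    # Fake microphone input: two "frames" of audio.
    for frame in [b"hello", b"world"]:
        yield frame

async def main():
    # speech -> text -> response -> speech, by chaining the stages.
    speech = tts_node(llm_node(stt_node(mic())))
    return [chunk async for chunk in speech]

print(asyncio.run(main()))  # [b'You said: hello world']
```

The real pipeline streams chunks incrementally rather than waiting for a full transcript, but the chaining shape is the same.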
Here's how it all comes together behind the scenes:
1. The system shifts into "thinking mode" - just like when you're pondering a great idea!
2. It runs the AI inference by calling perform_llm_inference() with the agent's llm_node.
3. The AI output gets split into two streams using tee.
4. If audio is enabled, it converts the text to speech by calling perform_tts_inference() with the agent's tts_node.
5. The system waits for a green light before sharing the response.
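The tee in step 3 is the classic fan-out trick: one async stream is duplicated so that two consumers (say, the text transcript output and the TTS stage) each see every chunk. The helper below is a toy illustration of that idea, not LiveKit's actual implementation:

```python
import asyncio

async def tee(source, n=2):
    # Fan one async iterator out to n independent iterators: every item
    # from `source` is pushed onto n queues, one per consumer.
    queues = [asyncio.Queue() for _ in range(n)]

    async def pump():
        async for item in source:
            for q in queues:
                q.put_nowait(item)
        for q in queues:
            q.put_nowait(None)  # sentinel: stream finished

    asyncio.ensure_future(pump())

    async def reader(q):
        while (item := await q.get()) is not None:
            yield item

    return [reader(q) for q in queues]

async def llm_output():
    # Fake LLM token stream.
    for token in ["Hi", " there", "!"]:
        yield token

async def main():
    to_text, to_tts = await tee(llm_output())
    transcript = "".join([t async for t in to_text])  # shown to the user
    tts_input = "".join([t async for t in to_tts])    # fed to synthesis
    return transcript, tts_input

print(asyncio.run(main()))  # ('Hi there!', 'Hi there!')
```

Both consumers get the full token stream without the LLM having to produce it twice, which is exactly why the pipeline tees its output before the TTS stage.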
The best part? This setup is super flexible, so developers can tweak things at any point in the process. How awesome is that?

Sources: Voice pipeline nodes | LiveKit Docs | GitHub::livekit::agents | livekit-agents/livekit/agents/voice/agent_activit...