# ask-ai
w
Is there an easy way to have a TTS on an agent that is only triggered when I call session.say, and is NOT triggered when the LLM gives an output?
t
Thanks for reaching out, Drew. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
To configure text-to-speech (TTS) on an agent so that it triggers only with `session.say` and not with LLM output, you can follow these steps:

1. Use the `session.say()` method directly: explicitly call `session.say("message")` within your agent's logic to trigger TTS for specific messages. This gives you full control over when TTS is activated.
2. Disable LLM output to TTS: avoid methods like `generate_reply()` that automatically send LLM output to TTS. Instead, process the LLM output yourself and use `session.say()` only for the parts you want spoken.
3. Custom LLM output processing: override the `llm_node` method in your agent to intercept and modify the LLM output before it reaches the TTS engine. This lets you control what text is sent to TTS and what is not.
4. Implement a TTS callback: use a callback such as `before_tts_cb` to process and filter the text before it is sent to the TTS engine, so TTS is triggered only for specific text or under certain conditions.

These steps ensure that TTS is triggered only through explicit `session.say()` calls and not by the LLM's automatic responses; a short sketch of the pattern follows below.

Sources: Modifying LLM output before TTS | LiveKit Docs | chain-of-thought-tts/README.mdoc | Agent speech and audio | LiveKit Docs
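Putting steps 1-3 together, here is a minimal sketch, assuming the livekit-agents 1.x Python API; the `SayOnlyAgent` class name and the instructions string are illustrative, not from the docs. The override drains the default LLM stream without yielding anything, so nothing is auto-spoken and the TTS engine only runs when `session.say()` is called:

```python
from collections.abc import AsyncIterable

from livekit.agents import Agent, ModelSettings, llm
from livekit.agents.llm import FunctionTool


class SayOnlyAgent(Agent):
    """Agent whose LLM replies are never forwarded to TTS (illustrative name)."""

    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful assistant.")

    async def llm_node(
        self,
        chat_ctx: llm.ChatContext,
        tools: list[FunctionTool],
        model_settings: ModelSettings,
    ) -> AsyncIterable[llm.ChatChunk]:
        # Drain the default LLM stream so the turn still completes,
        # but yield nothing, so no LLM text ever reaches the TTS engine.
        async for _ in Agent.default.llm_node(self, chat_ctx, tools, model_settings):
            pass
        return
        yield  # unreachable; only marks this method as an async generator
```

With an agent like this running on an `AgentSession`, the only audio produced comes from explicit calls such as `await session.say("One moment while I look that up.")`.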
w
Show me where I pass in the LLM node to the agent
t
Hey there! If you want to pass an LLM node to an agent in LiveKit, here's a simple way to do it:

1. Create an LLM instance:
```python
from livekit.plugins import openai  # livekit-agents OpenAI plugin

llm_node = openai.LLM(
    model="gpt-3.5-turbo",
    api_key="your-api-key",
    temperature=0.7,
    max_tokens=2000
)
```
2. Initialize the Agent with the LLM:
```python
from livekit.agents import Agent
from livekit.plugins import deepgram, openai, silero

agent = Agent(
    instructions="Your instructions here",
    llm=llm_node,
    stt=deepgram.STT(),
    tts=openai.TTS(),
    vad=silero.VAD.load()
)
```
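For context, here is a rough sketch of where that agent typically ends up: started on an `AgentSession` inside a worker entrypoint. This assumes the livekit-agents 1.x `AgentSession`/`cli.run_app` pattern; the greeting text is illustrative.

```python
from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import deepgram, openai, silero


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    # Same pieces as steps 1-2 above, wired into a session.
    agent = Agent(
        instructions="Your instructions here",
        llm=openai.LLM(model="gpt-3.5-turbo"),
    )
    session = AgentSession(
        stt=deepgram.STT(),
        tts=openai.TTS(),
        vad=silero.VAD.load(),
    )

    await session.start(agent=agent, room=ctx.room)

    # Explicit speech via say(), per the first question in this thread
    await session.say("Hi! I'm ready when you are.")


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```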
3. Optional: Customize the LLM Node:
```python
from collections.abc import AsyncIterable

from livekit.agents import Agent, ModelSettings, llm
from livekit.agents.llm import FunctionTool


class MyAgent(Agent):  # your Agent subclass
    async def llm_node(
        self,
        chat_ctx: llm.ChatContext,
        tools: list[FunctionTool],
        model_settings: ModelSettings
    ) -> AsyncIterable[llm.ChatChunk]:
        # Preprocessing logic here (e.g. edit chat_ctx before the LLM call)
        async for chunk in Agent.default.llm_node(self, chat_ctx, tools, model_settings):
            # Postprocessing logic here (runs on each streamed chunk)
            yield chunk
```
And that’s it! Following these steps will help you smoothly integrate and tweak the LLM node for your agent. If you have any questions or want to chat more about it, I’m here to help! Sources: examples/voice_agents/llamaindex-rag/retrieval.py | Voice pipeline nodes | LiveKit Docs | BeforeLLMCallback | LiveKit Agents
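As one concrete, hypothetical use of the step-3 postprocessing slot, and of the "modify LLM output before TTS" pattern cited in the sources above, the sketch below buffers the reply and strips `<think>...</think>` spans before anything is spoken. It assumes the 1.x node API and that a custom `llm_node` may yield plain strings; the class name, instructions, and tag convention are illustrative.

```python
import re
from collections.abc import AsyncIterable

from livekit.agents import Agent, ModelSettings, llm
from livekit.agents.llm import FunctionTool


class NoThinkingOutLoudAgent(Agent):
    """Removes <think>...</think> spans from LLM output before it reaches TTS."""

    def __init__(self) -> None:
        super().__init__(instructions="Reason inside <think> tags, then answer plainly.")

    async def llm_node(
        self,
        chat_ctx: llm.ChatContext,
        tools: list[FunctionTool],
        model_settings: ModelSettings,
    ) -> AsyncIterable[str]:
        # Buffer the full reply (simplest approach; a streaming filter is also possible),
        # then yield only the text that should be spoken.
        parts: list[str] = []
        async for chunk in Agent.default.llm_node(self, chat_ctx, tools, model_settings):
            if isinstance(chunk, llm.ChatChunk) and chunk.delta and chunk.delta.content:
                parts.append(chunk.delta.content)
        yield re.sub(r"<think>.*?</think>", "", "".join(parts), flags=re.DOTALL)
```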