# ask-ai
w
Is there an easy way to have a TTS on an agent that is only triggered when I call session.say, and is NOT triggered when the LLM gives an output?
t
Thanks for reaching out, Drew. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
To configure text-to-speech (TTS) on an agent so that it triggers only with `session.say` and not with LLM output, you can follow these steps:

1. Use the `session.say()` method directly: explicitly call `session.say("message")` within your agent's logic to trigger TTS for specific messages. This gives you full control over when TTS is activated.
2. Disable LLM output to TTS: avoid methods like `generate_reply()` that automatically send LLM output to TTS. Instead, process the LLM output yourself and use `session.say()` only for the parts you want spoken.
3. Custom LLM output processing: override the `llm_node` method in your agent to intercept and modify the LLM output before it reaches the TTS engine. This lets you control what text is sent to TTS and what is not.
4. Implement a TTS callback: use a callback such as `before_tts_cb` to process and filter the text before it is sent to the TTS engine, so TTS is triggered only for specific text or under certain conditions.

These steps ensure that TTS is triggered only through explicit `session.say()` calls and not by the LLM's automatic responses; a short sketch of the pattern follows below.

Sources: Modifying LLM output before TTS | LiveKit Docs | chain-of-thought-tts/README.mdoc | Agent speech and audio | LiveKit Docs
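Putting steps 1-3 together, here is a minimal sketch, assuming the livekit-agents 1.x Python API; the `SayOnlyAgent` class name and the instructions string are illustrative, not from the docs. The override drains the default LLM stream without yielding anything, so nothing is auto-spoken and the TTS engine only runs when `session.say()` is called:

```python
from collections.abc import AsyncIterable

from livekit.agents import Agent, ModelSettings, llm
from livekit.agents.llm import FunctionTool


class SayOnlyAgent(Agent):
    """Agent whose LLM replies are never forwarded to TTS (illustrative name)."""

    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful assistant.")

    async def llm_node(
        self,
        chat_ctx: llm.ChatContext,
        tools: list[FunctionTool],
        model_settings: ModelSettings,
    ) -> AsyncIterable[llm.ChatChunk]:
        # Drain the default LLM stream so the turn still completes,
        # but yield nothing, so no LLM text ever reaches the TTS engine.
        async for _ in Agent.default.llm_node(self, chat_ctx, tools, model_settings):
            pass
        return
        yield  # unreachable; only marks this method as an async generator
```

With an agent like this running on an `AgentSession`, the only audio produced comes from explicit calls such as `await session.say("One moment while I look that up.")`.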
w
Show me where I pass in the LLM node to the agent
t
Hey there! If you want to pass an LLM node to an agent in LiveKit, here's a simple way to do it:

1. Create an LLM instance:
```python
from livekit.plugins import openai  # livekit-agents OpenAI plugin

llm_node = openai.LLM(
    model="gpt-3.5-turbo",
    api_key="your-api-key",
    temperature=0.7,
    max_tokens=2000
)
```
2. Initialize the Agent with the LLM:
```python
from livekit.agents import Agent
from livekit.plugins import deepgram, openai, silero

agent = Agent(
    instructions="Your instructions here",
    llm=llm_node,
    stt=deepgram.STT(),
    tts=openai.TTS(),
    vad=silero.VAD.load()
)
```
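For context, here is a rough sketch of where that agent typically ends up: started on an `AgentSession` inside a worker entrypoint. This assumes the livekit-agents 1.x `AgentSession`/`cli.run_app` pattern; the greeting text is illustrative.

```python
from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import deepgram, openai, silero


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    # Same pieces as steps 1-2 above, wired into a session.
    agent = Agent(
        instructions="Your instructions here",
        llm=openai.LLM(model="gpt-3.5-turbo"),
    )
    session = AgentSession(
        stt=deepgram.STT(),
        tts=openai.TTS(),
        vad=silero.VAD.load(),
    )

    await session.start(agent=agent, room=ctx.room)

    # Explicit speech via say(), per the first question in this thread
    await session.say("Hi! I'm ready when you are.")


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```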
3. Optional: Customize the LLM Node:
```python
from collections.abc import AsyncIterable

from livekit.agents import Agent, ModelSettings, llm
from livekit.agents.llm import FunctionTool


class MyAgent(Agent):  # your Agent subclass
    async def llm_node(
        self,
        chat_ctx: llm.ChatContext,
        tools: list[FunctionTool],
        model_settings: ModelSettings
    ) -> AsyncIterable[llm.ChatChunk]:
        # Preprocessing logic here (e.g. edit chat_ctx before the LLM call)
        async for chunk in Agent.default.llm_node(self, chat_ctx, tools, model_settings):
            # Postprocessing logic here (runs on each streamed chunk)
            yield chunk
```
And that’s it! Following these steps will help you smoothly integrate and tweak the LLM node for your agent. If you have any questions or want to chat more about it, I’m here to help! Sources: examples/voice_agents/llamaindex-rag/retrieval.py | Voice pipeline nodes | LiveKit Docs | BeforeLLMCallback | LiveKit Agents
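As one concrete, hypothetical use of the step-3 postprocessing slot, and of the "modify LLM output before TTS" pattern cited in the sources above, the sketch below buffers the reply and strips `<think>...</think>` spans before anything is spoken. It assumes the 1.x node API and that a custom `llm_node` may yield plain strings; the class name, instructions, and tag convention are illustrative.

```python
import re
from collections.abc import AsyncIterable

from livekit.agents import Agent, ModelSettings, llm
from livekit.agents.llm import FunctionTool


class NoThinkingOutLoudAgent(Agent):
    """Removes <think>...</think> spans from LLM output before it reaches TTS."""

    def __init__(self) -> None:
        super().__init__(instructions="Reason inside <think> tags, then answer plainly.")

    async def llm_node(
        self,
        chat_ctx: llm.ChatContext,
        tools: list[FunctionTool],
        model_settings: ModelSettings,
    ) -> AsyncIterable[str]:
        # Buffer the full reply (simplest approach; a streaming filter is also possible),
        # then yield only the text that should be spoken.
        parts: list[str] = []
        async for chunk in Agent.default.llm_node(self, chat_ctx, tools, model_settings):
            if isinstance(chunk, llm.ChatChunk) and chunk.delta and chunk.delta.content:
                parts.append(chunk.delta.content)
        yield re.sub(r"<think>.*?</think>", "", "".join(parts), flags=re.DOTALL)
```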