prehistoric-printer-73777
08/05/2025, 6:56 PM
tall-airline-16756
08/05/2025, 6:56 PM
Use the llm_node hook to intercept and modify the text before it reaches the TTS stage. This allows for customization such as replacing specific text, cleaning up responses, or altering the content as needed.
You can override the llm_node method in your custom agent to process the LLM output before it is passed to the TTS engine. For example, this can be used to remove unwanted tags or adjust the text for better pronunciation.
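To make the "remove unwanted tags or adjust pronunciation" idea concrete, here is a minimal sketch of a text-cleanup helper you could call from inside llm_node. It is plain Python and does not touch the LiveKit API; the helper name clean_for_tts, the <think> tag, and the pronunciation table are all assumptions made up for illustration.

```python
import re

# Hypothetical pronunciation fixes; these entries are illustrative only.
PRONUNCIATION_FIXES = {
    "LLM": "L L M",
    "TTS": "text to speech",
}

# Matches an opening or closing <think> tag (an assumed example of an unwanted tag).
TAG_PATTERN = re.compile(r"</?think>")


def clean_for_tts(text: str) -> str:
    """Strip <think> tags, then apply whole-word pronunciation substitutions."""
    text = TAG_PATTERN.sub("", text)
    for word, spoken in PRONUNCIATION_FIXES.items():
        # \b keeps the substitution from firing inside longer words.
        text = re.sub(rf"\b{re.escape(word)}\b", spoken, text)
    return text


cleaned = clean_for_tts("<think>The LLM feeds TTS.</think>")
```

One caveat: the LLM output arrives as a stream of chunks, so a tag or word can be split across two chunks. A per-chunk helper like this only catches patterns that land entirely inside one chunk.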
Sources: Modifying LLM output before TTS (LiveKit Docs); pipeline-llm/replacing_llm_output.py; Voice pipeline nodes (LiveKit Docs); Pipeline nodes and hooks (LiveKit Docs)
prehistoric-printer-73777
08/05/2025, 7:08 PM
tall-airline-16756
08/05/2025, 7:08 PM
To replace the response, override llm_node in your agent. You actually don’t need the tts_node here since we’re making the change before the TTS step. Here’s a straightforward example to help you out:
from livekit.agents import Agent
from livekit.plugins import openai, deepgram, silero


class MyAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a helpful assistant.",
            stt=deepgram.STT(),
            llm=openai.LLM(),
            tts=openai.TTS(),
            vad=silero.VAD.load(),
        )

    async def llm_node(self, chat_ctx, tools, model_settings=None):
        async def process_stream():
            # The LLM streams its reply in many chunks, so emit the replacement only once
            replaced = False
            async with self.llm.chat(chat_ctx=chat_ctx, tools=tools, tool_choice=None) as stream:
                async for chunk in stream:
                    if chunk is None:
                        continue
                    # Grab the text content from the chunk, if it has any
                    content = getattr(chunk.delta, 'content', None) if hasattr(chunk, 'delta') else str(chunk)
                    if content is None:
                        # No text in this chunk, so pass it through unchanged
                        yield chunk
                        continue
                    if replaced:
                        # The replacement was already sent; drop the rest of the original text
                        continue
                    replaced = True
                    # Swap out the original text with your predefined message
                    processed_content = "This is a predefined response."
                    if hasattr(chunk, 'delta') and hasattr(chunk.delta, 'content'):
                        chunk.delta.content = processed_content
                    else:
                        chunk = processed_content
                    yield chunk

        return process_stream()

    async def on_enter(self):
        # Start the conversation as soon as the agent joins the session
        await self.session.generate_reply()
So, what’s happening here?
• The llm_node catches the LLM’s output.
• The process_stream function swaps the original content with your predefined response.
• Then, this updated text gets sent to the TTS engine.
This way, you’re sure the original LLM response gets replaced before it ever reaches the TTS. Hope this helps!
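One thing to keep in mind: the LLM streams its reply in many chunks, so the predefined text should be emitted once rather than once per chunk, or the TTS will repeat it. Here is a minimal sketch of that idea that you can run without LiveKit. It uses plain strings as stand-in chunks, and fake_llm_stream and replace_stream are hypothetical names made up for this illustration, not part of the LiveKit API.

```python
import asyncio
from typing import AsyncIterator

PREDEFINED = "This is a predefined response."


async def fake_llm_stream() -> AsyncIterator[str]:
    # Stand-in for a real LLM stream: the reply arrives in several pieces.
    for piece in ["Hello", ", ", "world", "!"]:
        yield piece


async def replace_stream(stream: AsyncIterator[str], replacement: str) -> AsyncIterator[str]:
    sent = False
    async for _ in stream:
        if not sent:
            # Emit the replacement in place of the first chunk only.
            sent = True
            yield replacement
        # Later chunks are consumed and dropped.


async def collect() -> list[str]:
    return [chunk async for chunk in replace_stream(fake_llm_stream(), PREDEFINED)]


result = asyncio.run(collect())
```

Here the four original chunks collapse into a single replacement chunk, which is what you would want the TTS stage to receive.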