clever-doctor-15083
03/28/2025, 9:18 AM
tall-airline-16756
03/28/2025, 9:18 AM
tall-airline-16756
03/28/2025, 9:19 AM
before_tts_cb callback to modify the LLM output before it is sent to the TTS engine. This allows you to remove or modify specific tokens like <think> and </think> that represent the agent's chain-of-thought reasoning.
Here is an example implementation:
from typing import AsyncIterable

from livekit.agents.pipeline import VoicePipelineAgent


async def _before_tts_cb(agent: VoicePipelineAgent, text: str | AsyncIterable[str]):
    if isinstance(text, str):
        # Handle non-streaming text: strip both markers
        result = text.replace("<think>", "").replace("</think>", "")
        return result
    else:
        # Handle streaming text: strip the opening marker and speak a short
        # filler phrase in place of the closing one
        async def process_stream():
            async for chunk in text:
                processed = chunk.replace("<think>", "")\
                    .replace("</think>", "Okay, I'm ready to respond.")
                yield processed
        return process_stream()
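To see what a callback of this shape actually hands to the TTS engine, it can be exercised locally against a fake stream. The sketch below is only an illustration: it repeats the same replacement logic, types `agent` as a plain object so it runs without LiveKit installed, and `fake_llm_stream` and `run_checks` are hypothetical helpers, not part of any library.

```python
import asyncio
from typing import AsyncIterable, Union


async def before_tts_cb(agent: object, text: Union[str, AsyncIterable[str]]):
    # Same replacement logic as the example callback; `agent` is unused here.
    if isinstance(text, str):
        return text.replace("<think>", "").replace("</think>", "")

    async def process_stream():
        async for chunk in text:
            yield chunk.replace("<think>", "").replace(
                "</think>", "Okay, I'm ready to respond."
            )

    return process_stream()


async def fake_llm_stream():
    # Hypothetical model output, chunked so each marker arrives whole.
    for chunk in ["<think>", "The user wants the time. ", "</think>", "It is noon."]:
        yield chunk


async def run_checks():
    # Non-streaming input comes back as a plain string.
    plain = await before_tts_cb(None, "<think>plan</think>Hi")
    # Streaming input comes back as an async iterator of processed chunks.
    stream = await before_tts_cb(None, fake_llm_stream())
    spoken = "".join([chunk async for chunk in stream])
    return plain, spoken


if __name__ == "__main__":
    print(asyncio.run(run_checks()))
```

In this run the markers disappear, but the text that sat between them ("plan", "The user wants the time. ") is still present in what would be spoken.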
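The _before_tts_cb callback intercepts the text generated by the LLM before it is sent to the TTS engine and strips the <think> and </think> markers so they are not vocalized. As written, it removes only the markers: any reasoning text between them still reaches the TTS engine, the streaming branch replaces </think> with the phrase "Okay, I'm ready to respond." instead of removing it, and a marker split across two stream chunks is not matched. Because only the TTS input is altered, the agent's reasoning capabilities are preserved.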
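If the goal is to keep the reasoning itself out of the spoken output, the text between the markers has to be dropped as well, and a streaming filter has to cope with a marker that arrives split across chunks. The following is a minimal sketch of one way to do that in plain Python. The names `strip_think_blocks`, `_held_back`, and `collect` are hypothetical helpers introduced here for illustration, and the sketch assumes the model emits literal `<think>` and `</think>` markers without nesting.

```python
import asyncio
from typing import AsyncIterable, AsyncIterator, Sequence

OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def _held_back(buffer: str, tag: str) -> int:
    """Length of the longest buffer suffix that could be the start of `tag`."""
    for size in range(min(len(buffer), len(tag) - 1), 0, -1):
        if tag.startswith(buffer[-size:]):
            return size
    return 0


async def strip_think_blocks(chunks: AsyncIterable[str]) -> AsyncIterator[str]:
    """Yield only the text outside <think>...</think>, even if a tag spans chunks."""
    buffer = ""
    inside = False  # True while between <think> and </think>
    async for chunk in chunks:
        buffer += chunk
        while True:
            tag = CLOSE_TAG if inside else OPEN_TAG
            index = buffer.find(tag)
            if index == -1:
                break
            if not inside and index > 0:
                yield buffer[:index]  # text before an opening tag is spoken
            buffer = buffer[index + len(tag):]
            inside = not inside
        # No complete tag left: hold back only a possible partial tag.
        keep = _held_back(buffer, CLOSE_TAG if inside else OPEN_TAG)
        emit = buffer[:len(buffer) - keep]
        buffer = buffer[len(buffer) - keep:]
        if emit and not inside:
            yield emit
    if buffer and not inside:
        yield buffer  # trailing text that only looked like the start of a tag


async def collect(pieces: Sequence[str]) -> str:
    """Run the filter over a fake chunk stream and join the result."""
    async def fake_stream():
        for piece in pieces:
            yield piece
    return "".join([part async for part in strip_think_blocks(fake_stream())])


if __name__ == "__main__":
    demo = ["Hel", "lo.<th", "ink>secret", " stuff</thi", "nk> Bye", "!"]
    print(asyncio.run(collect(demo)))
```

Under these assumptions, the streaming branch of a callback like the one above could return `strip_think_blocks(text)` instead of doing per-chunk replacements. An unterminated `<think>` block is treated as reasoning and suppressed until the stream ends.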
Sources: Modifying LLM output before TTS | LiveKit Docs
clever-doctor-15083
03/28/2025, 9:19 AM
tall-airline-16756
03/28/2025, 9:20 AM
async def _before_tts_cb(agent: VoicePipelineAgent, text: str | AsyncIterable[str]):
    if isinstance(text, str):
        # Handle non-streaming text
        result = text.replace("<think>", "").replace("</think>", "")
        return result
    else:
        # Handle streaming text
        async def process_stream():
            async for chunk in text:
                processed = chunk.replace("<think>", "")\
                    .replace("</think>", "Okay, I'm ready to respond.")
                yield processed
        return process_stream()
This Python code defines a callback function, _before_tts_cb, that intercepts the text generated by the LLM before it is sent to the TTS engine. It strips the <think> and </think> markers so they are not spoken. The text between the markers is left in place, so the thinking process itself is skipped in the spoken output only if that text is filtered out separately.
clever-doctor-15083
03/28/2025, 9:24 AM
tall-airline-16756
03/28/2025, 9:24 AM
participant_attributes_changed can directly transition to "speaking" from "listening" in a VoicePipeline agent. This might require further investigation or clarification from a specialist.
If you have additional details or context about your use case, feel free to share, and I can assist further or escalate the issue to a human specialist if needed.