rhythmic-jordan-68259
08/08/2025, 12:40 PM
tall-airline-16756
08/08/2025, 12:40 PM
rhythmic-jordan-68259
08/08/2025, 2:18 PM
tall-airline-16756
08/08/2025, 2:18 PM
transcription_node method in your custom agent class. This way, you can process and tidy up the text in real time. Here’s a simple example to get you started:
from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings


class CustomAgent(Agent):
    async def transcription_node(self, text: AsyncIterable[str], model_settings: ModelSettings) -> AsyncIterable[str]:
        """Clean up the transcribed text by replacing certain words."""
        replacements = {
            "example_word": "replacement",
            # Feel free to add more replacements here
        }

        async def process_text():
            async for chunk in text:
                modified_chunk = chunk
                # Apply each replacement to the current chunk.
                # Note: a word split across two chunks is not matched by this per-chunk replace.
                for original, replacement in replacements.items():
                    modified_chunk = modified_chunk.replace(original, replacement)
                yield modified_chunk

        return process_text()
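One thing to watch for with the per-chunk replace above: if a word arrives split across two chunks, it will slip through. Here is a minimal sketch of one way around that, in plain Python with no LiveKit imports. The names replace_across_chunks, stream_from, and collect are hypothetical helpers made up for this illustration. It holds back a short tail of the buffer so a split word can still be matched, and it assumes that a replacement value never forms a new match together with the text that follows it.

```python
import asyncio
from typing import AsyncIterable, AsyncIterator, Dict, List


async def replace_across_chunks(
    chunks: AsyncIterable[str], replacements: Dict[str, str]
) -> AsyncIterator[str]:
    """Apply replacements to a text stream, matching words split across chunks."""
    if not replacements:
        async for chunk in chunks:
            yield chunk
        return

    # A partially received key has at most (longest key - 1) characters,
    # so holding back that many characters is enough to catch it later.
    hold = max(len(key) for key in replacements) - 1
    buffer = ""
    async for chunk in chunks:
        buffer += chunk
        for original, replacement in replacements.items():
            buffer = buffer.replace(original, replacement)
        if len(buffer) > hold:
            cut = len(buffer) - hold
            yield buffer[:cut]
            buffer = buffer[cut:]
    if buffer:
        # Flush whatever is still held back once the stream ends.
        yield buffer


async def stream_from(parts: List[str]) -> AsyncIterator[str]:
    """Hypothetical helper: turn a list of strings into an async stream."""
    for part in parts:
        yield part


async def collect(stream: AsyncIterable[str]) -> str:
    """Hypothetical helper: join every chunk of a stream into one string."""
    return "".join([piece async for piece in stream])


if __name__ == "__main__":
    parts = ["This is an exam", "ple_wo", "rd in a stream"]
    cleaned = replace_across_chunks(stream_from(parts), {"example_word": "replacement"})
    print(asyncio.run(collect(cleaned)))
```

The trade-off is a small delay: the last few characters of the stream are only released once enough new text has arrived or the stream ends.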
Also, if you want to easily strip out markdown formatting, you can use the handy built-in filter_markdown utility like this:
from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings
from livekit.agents.voice.transcription.filters import filter_markdown


class CustomAgent(Agent):
    async def transcription_node(self, text: AsyncIterable[str], model_settings: ModelSettings) -> AsyncIterable[str]:
        """Clean up the transcribed text by removing markdown formatting."""
        return filter_markdown(text)
If you have any questions or want me to walk you through more examples, just let me know—I’m here to help!
Sources: livekit-agents/livekit/agents/voice/agent_activity.py
rhythmic-jordan-68259
08/08/2025, 2:20 PM
tall-airline-16756
08/08/2025, 2:20 PM
LLMAdapter with LangGraph, cleaning text during streaming can be done in a few ways. Here’s how you can do it:
1. Using `transcription_node`: You can override transcription_node in your agent to tidy up the text as it streams. Here’s a quick example:
from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings


class MyAgent(Agent):
    async def transcription_node(self, text: AsyncIterable[str], model_settings: ModelSettings) -> AsyncIterable[str]:
        async for delta in text:
            cleaned_delta = delta.replace("😘", "")  # Just an example: removing certain characters
            yield cleaned_delta
2. Using `before_tts_cb`: Another way is to use a callback that cleans the text right before it goes to the TTS engine. Note that before_tts_cb is an option of VoicePipelineAgent, which belongs to the older 0.x API. Check this out:
from typing import AsyncIterable

from livekit.agents.pipeline import VoicePipelineAgent


async def _before_tts_callback(agent: VoicePipelineAgent, text: str | AsyncIterable[str]) -> str | AsyncIterable[str]:
    if isinstance(text, str):
        # Non-streaming case: clean the whole string at once
        return text.replace("start", "").replace("end", "")
    else:
        # Streaming case: clean each chunk as it arrives
        async def process_stream():
            async for chunk in text:
                yield chunk.replace("start", "").replace("end", "")

        return process_stream()
3. Using LangGraph Workflow: You can also tweak your LangGraph workflow to add text cleaning steps. Here’s a simple example:
from langgraph.graph import StateGraph
from livekit.agents import AgentSession
from livekit.plugins import langchain


def create_workflow():
    workflow = StateGraph(...)
    # Add nodes to clean text (like removing special characters)
    return workflow.compile()


session = AgentSession(
    llm=langchain.LLMAdapter(
        graph=create_workflow()
    ),
    # ... other configurations ...
)
If you want to dive deeper or have any questions about these methods, just let me know—I’m here to help!
Sources: Pipeline nodes and hooks | LiveKit Docs | chain-of-thought-tts/README.mdoc
rhythmic-jordan-68259
08/08/2025, 2:23 PM
tall-airline-16756
08/08/2025, 2:23 PM
before_tts_cb in LiveKit is pretty flexible: it can handle either the entire text at once or smaller chunks. Here’s how it works:
• If you give it a string, it processes the whole text in one go.
• If you pass an AsyncIterable[str], it works through the text bit by bit, which is great for making real-time tweaks during the text-to-speech process.
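To make the two cases above concrete, here is a small standalone sketch in plain Python with no LiveKit imports. The names clean_text and apply_cleaner are hypothetical and only for illustration: one helper accepts either a full string or a stream of chunks and returns the same shape it was given.

```python
import asyncio
from typing import AsyncIterable, AsyncIterator, Callable, Union


def clean_text(text: str) -> str:
    """Hypothetical cleaner: drop asterisks left over from markdown emphasis."""
    return text.replace("*", "")


def apply_cleaner(
    text: Union[str, AsyncIterable[str]],
    cleaner: Callable[[str], str] = clean_text,
) -> Union[str, AsyncIterator[str]]:
    """Clean a whole string at once, or wrap a stream so each chunk is cleaned."""
    if isinstance(text, str):
        return cleaner(text)

    async def _stream() -> AsyncIterator[str]:
        async for chunk in text:
            yield cleaner(chunk)

    return _stream()


async def demo_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for text arriving in chunks."""
    for part in ["**Hello**, ", "this is *streamed* text"]:
        yield part


async def collect(stream: AsyncIterable[str]) -> str:
    """Join every chunk of a stream into one string."""
    return "".join([piece async for piece in stream])


if __name__ == "__main__":
    print(apply_cleaner("**Hello** world"))
    print(asyncio.run(collect(apply_cleaner(demo_stream()))))
```

Keeping the cleaning logic in one function like clean_text means the streaming and non-streaming paths cannot drift apart.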
This way, it’s compatible with both non-streaming and streaming text scenarios. If you want me to explain anything further, just let me know. I’m here to help!
rhythmic-jordan-68259
08/08/2025, 2:27 PM
- Aug 15 Monday 10AM- 11AM
What is the best way to remove - from the start but not from 10-11
tall-airline-16756
08/08/2025, 2:28 PM
To remove the dash (-) just from the start of your text but keep the ones in places like 10AM-11AM, you can do that with a little string manipulation. Here’s a quick example in Python:
text = "- Aug 15 Monday 10AM-11AM"
if text.startswith("-"):
    text = text[1:].lstrip()  # This takes off the dash at the beginning and any extra spaces
print(text)  # Prints: Aug 15 Monday 10AM-11AM
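If your text has several lines that each start with a dash, the same idea can be sketched with a regular expression. This is only an illustration: strip_leading_dash is a hypothetical helper name, and it assumes a leading dash counts as a bullet only when it is followed by a space or tab, so something like "-5 degrees" is left alone.

```python
import re

# Match a dash at the start of a line (after optional indentation),
# followed by at least one space or tab. Dashes elsewhere are untouched.
_LEADING_DASH = re.compile(r"^[ \t]*-[ \t]+", flags=re.MULTILINE)


def strip_leading_dash(text: str) -> str:
    """Remove a bullet-style dash from the start of every line."""
    return _LEADING_DASH.sub("", text)


if __name__ == "__main__":
    sample = "- Aug 15 Monday 10AM- 11AM\n- Aug 16 Tuesday 2PM-3PM"
    print(strip_leading_dash(sample))
```

Dashes inside a line, such as the one in 10AM- 11AM, never sit at the start of a line, so the pattern does not touch them.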
This way, only the dash at the very start gets removed and the rest stay just as they are. If you want me to help with anything else, just let me know. I’m here for you!
rhythmic-jordan-68259
08/08/2025, 2:36 PM
tall-airline-16756
08/08/2025, 2:37 PM
rhythmic-jordan-68259
08/08/2025, 2:38 PM
tall-airline-16756
08/08/2025, 2:38 PM
rhythmic-jordan-68259
08/08/2025, 2:41 PM
tall-airline-16756
08/08/2025, 2:42 PM