# ask-ai
r
When I am streaming, where can I clean the text? Probably a paragraph
t
Thanks for reaching out, Umair. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
r
Please
t
Hey there! If you want to clean up text during streaming using the transcription node in LiveKit, you can override the `transcription_node` method in your custom agent class. This way, you can process and tidy up the text in real time. Here's a simple example to get you started:
```python
from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings


class CustomAgent(Agent):
    async def transcription_node(
        self, text: AsyncIterable[str], model_settings: ModelSettings
    ) -> AsyncIterable[str]:
        """Clean up the transcribed text by applying word replacements."""
        replacements = {
            "example_word": "replacement",
            # Feel free to add more replacements here
        }

        async def process_text() -> AsyncIterable[str]:
            async for chunk in text:
                modified_chunk = chunk
                # Apply each replacement in turn
                for original, replacement in replacements.items():
                    modified_chunk = modified_chunk.replace(original, replacement)
                yield modified_chunk

        return process_text()
```
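The replacement loop itself is plain Python, so it can be sanity-checked outside LiveKit. A minimal sketch, where `fake_stream` is a made-up stand-in for the LLM output stream the agent would actually receive:

```python
import asyncio
from typing import AsyncIterable

REPLACEMENTS = {"example_word": "replacement"}


async def fake_stream() -> AsyncIterable[str]:
    # Stand-in for the streamed LLM output.
    for chunk in ["this is an ", "example_word in a stream"]:
        yield chunk


async def clean(text: AsyncIterable[str]) -> list[str]:
    cleaned = []
    async for chunk in text:
        # Same per-chunk replacement logic as in the agent above.
        for original, replacement in REPLACEMENTS.items():
            chunk = chunk.replace(original, replacement)
        cleaned.append(chunk)
    return cleaned


result = asyncio.run(clean(fake_stream()))
# result == ["this is an ", "replacement in a stream"]
```

One caveat with per-chunk replacement: a target word split across two chunks won't be matched, so if that matters you would need to buffer text across chunk boundaries before replacing.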
Also, if you want to strip out markdown formatting, you can use the built-in `filter_markdown` utility like this:
```python
from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings
from livekit.agents.voice.transcription.filters import filter_markdown


class CustomAgent(Agent):
    async def transcription_node(
        self, text: AsyncIterable[str], model_settings: ModelSettings
    ) -> AsyncIterable[str]:
        """Clean up the transcribed text by removing markdown formatting."""
        return filter_markdown(text)
```
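To illustrate what markdown stripping does to text before it is spoken or displayed, here is a small regex-based sketch. This is not LiveKit's `filter_markdown` implementation, just an illustrative stand-in handling a few common markdown constructs:

```python
import re


def strip_markdown(text: str) -> str:
    """Illustrative markdown stripper (NOT LiveKit's filter_markdown)."""
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)            # bold
    text = re.sub(r"\*(.+?)\*", r"\1", text)                 # italics
    text = re.sub(r"`([^`]+)`", r"\1", text)                 # inline code
    text = re.sub(r"^#+\s*", "", text, flags=re.MULTILINE)   # heading markers
    return text


cleaned = strip_markdown("# Title\n**bold** and `code`")
# cleaned == "Title\nbold and code"
```

The real utility also handles streamed input (`AsyncIterable[str]`), which is why it is convenient inside `transcription_node`.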
If you have any questions or want me to walk you through more examples, just let me know—I’m here to help! Sources: livekit-agents/livekit/agents/voice/agent_activity.py
r
If I am using LLMAdapter with LangGraph, then what can I do?
t
Hey there! When you're using `LLMAdapter` with LangGraph, you can clean text during streaming in a few ways:
1. Using `transcription_node`: You can override the `transcription_node` in your agent to tidy up the text as it streams. Here's a quick example:
```python
from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings


class MyAgent(Agent):
    async def transcription_node(
        self, text: AsyncIterable[str], model_settings: ModelSettings
    ) -> AsyncIterable[str]:
        async for delta in text:
            cleaned_delta = delta.replace("😘", "")  # Example: removing certain characters
            yield cleaned_delta
```
2. Using `before_tts_cb`: Another option is a callback that cleans the text right before it goes to the TTS engine. Check this out:
```python
from typing import AsyncIterable

from livekit.agents import VoicePipelineAgent


async def _before_tts_callback(
    agent: VoicePipelineAgent, text: str | AsyncIterable[str]
) -> str | AsyncIterable[str]:
    if isinstance(text, str):
        return text.replace("start", "").replace("end", "")
    else:
        async def process_stream() -> AsyncIterable[str]:
            async for chunk in text:
                yield chunk.replace("start", "").replace("end", "")
        return process_stream()
```
3. Using a LangGraph workflow: You can also add text-cleaning steps to your LangGraph workflow itself. Here's a simple example:
```python
from langgraph.graph import StateGraph
from livekit.agents import AgentSession
from livekit.plugins import langchain


def create_workflow():
    workflow = StateGraph(...)
    # Add nodes that clean text (e.g. removing special characters)
    return workflow.compile()


session = AgentSession(
    llm=langchain.LLMAdapter(
        graph=create_workflow()
    ),
    # ... other configuration ...
)
```
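A LangGraph node is just a function over the graph state, so the cleaning step itself needs no LiveKit code. A hypothetical sketch (the `messages` state key and the `clean_node` name are assumptions, following a common LangGraph state shape):

```python
def clean_node(state: dict) -> dict:
    """Hypothetical LangGraph node: cleans the most recent message in state."""
    messages = state["messages"]
    # Strip an unwanted character and surrounding whitespace from the last message.
    cleaned = messages[-1].replace("😘", "").strip()
    return {"messages": messages[:-1] + [cleaned]}


state = {"messages": ["hello 😘 "]}
new_state = clean_node(state)
# new_state == {"messages": ["hello"]}
```

You would then register it on the graph with something like `workflow.add_node("clean", clean_node)` and wire an edge from your LLM node into it, so cleaned text is what the `LLMAdapter` streams out.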
If you want to dive deeper or have any questions about these methods, just let me know! Sources: Pipeline nodes and hooks | LiveKit Docs | chain-of-thought-tts/README.mdoc
r
So in `before_tts_cb`, are we getting the whole text or chunks?
t
Hey there! The `before_tts_cb` in LiveKit is flexible: it can handle either the entire text at once or smaller chunks. Here's how it works:
• If you give it a string, it processes the whole text in one go.
• If you pass an `AsyncIterable[str]`, it works through the text bit by bit, which is great for making real-time tweaks during the text-to-speech process.
This way, it's compatible with both non-streaming and streaming text scenarios. If you want me to explain anything further, just let me know!
r
If i have text like
- Aug 15 Monday 10AM- 11AM
What is the best way to remove - from the start but not from 10-11
t
Hey there! If you want to remove the dash (`-`) just from the start of your text but keep the ones in places like `10AM-11AM`, you can do that with a little string manipulation. Here's a quick example in Python:
```python
text = "- Aug 15 Monday 10AM-11AM"
if text.startswith("-"):
    text = text[1:].lstrip()  # Drop the leading dash and any extra spaces
print(text)  # "Aug 15 Monday 10AM-11AM"
```
This way, only the dash at the very start gets removed and the rest stay just as they are. If you want me to help with anything else, just let me know—I’m here for you!
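If the leading dash is a bullet marker that can appear at the start of every line in a multi-line string, a regex anchored at line starts handles all of them at once without touching ranges like `10AM-11AM`:

```python
import re


def strip_leading_dashes(text: str) -> str:
    # Remove a dash (and surrounding spaces) only at the start of each line.
    return re.sub(r"^\s*-\s*", "", text, flags=re.MULTILINE)


result = strip_leading_dashes("- Aug 15 Monday 10AM-11AM\n- Aug 16 Tuesday 2PM-3PM")
# result == "Aug 15 Monday 10AM-11AM\nAug 16 Tuesday 2PM-3PM"
```

The `^` anchor with `re.MULTILINE` is what keeps the in-word dashes safe: only a dash immediately after a line start can match.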
r
.
t
Feel free to reach out if you have any more questions or need a hand with anything. I'm here for you! Sources: Getting Help | livekit/livekit_composite | DeepWiki | Slack Etiquette | Community & Support | livekit/livekit_composite | DeepWiki
r
What is the difference between the TTS node and the transcription node?
t
Hey there! Just wanted to help you understand the difference between the TTS (Text-to-Speech) node and the transcription node in LiveKit; they each play their own role in the pipeline:
• TTS Node:
  - Takes your text and turns it into audio using the TTS engine you've chosen.
  - It's all about speech synthesis: it transforms written words into spoken ones.
  - You can tweak it with things like text chunking, custom TTS engines, or audio processing.
  - It works after the LLM generates the text, creating the final audio you hear.
• Transcription Node:
  - Focuses on polishing up the text before it goes anywhere.
  - Helps clean up the transcription: fixing formatting, removing unwanted characters, that sort of thing.
  - Its job is to make sure the text looks good and is ready for display or any next steps.
  - It works on the text before it heads over to the TTS node or other outputs.
So, to sum it up: the TTS node is your go-to for turning text into speech, while the transcription node is there to tidy up and perfect the text itself. If you want me to explain anything more or have any questions, just let me know!
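The ordering described above (LLM output, then text cleanup, then synthesis) can be sketched as chained async stages. This is a conceptual stand-in with no LiveKit dependency, not the actual node implementations:

```python
import asyncio
from typing import AsyncIterable


async def llm_output() -> AsyncIterable[str]:
    # Stand-in for text streamed out of the LLM.
    for chunk in ["**Hello** ", "world"]:
        yield chunk


async def transcription_stage(text: AsyncIterable[str]) -> AsyncIterable[str]:
    # Plays the transcription node's role: clean the text before it moves on.
    async for chunk in text:
        yield chunk.replace("**", "")


async def tts_stage(text: AsyncIterable[str]) -> list[str]:
    # Plays the TTS node's role: consume cleaned text; a real TTS node
    # would synthesize audio frames here instead of collecting strings.
    spoken = []
    async for chunk in text:
        spoken.append(chunk)
    return spoken


spoken = asyncio.run(tts_stage(transcription_stage(llm_output())))
# spoken == ["Hello ", "world"]
```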
r
So does the transcription node yield deltas to the tts_node?
t
Absolutely! When you enable the TTS-aligned transcript feature in LiveKit, the transcription node can send deltas—those little incremental text updates—to the TTS node. This helps keep the audio perfectly in sync with the transcript. If you'd like, I’m happy to walk you through how to set this up or chat about any specific TTS settings you’re curious about! Sources: Text and transcriptions | LiveKit Docs | examples/voice_agents/timed_agent_transcript.py | Pipeline nodes and hooks | LiveKit Docs