# ask-ai
r
When I am streaming, where can I clean the text? Probably a paragraph
t
Thanks for reaching out, Umair. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
r
Please
t
Hey there! If you want to clean up text during streaming using the transcription node in LiveKit, you can override the `transcription_node` method in your custom agent class. This way, you can process and tidy up the text in real time. Here's a simple example to get you started:
```python
from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings


class CustomAgent(Agent):
    async def transcription_node(
        self, text: AsyncIterable[str], model_settings: ModelSettings
    ) -> AsyncIterable[str]:
        """Clean up the transcribed text by applying word replacements."""
        replacements = {
            "example_word": "replacement",
            # Feel free to add more replacements here
        }

        async def process_text() -> AsyncIterable[str]:
            async for chunk in text:
                modified_chunk = chunk
                # Apply each replacement in turn
                for original, replacement in replacements.items():
                    modified_chunk = modified_chunk.replace(original, replacement)
                yield modified_chunk

        return process_text()
```
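The replacement loop itself is plain Python, so it can be sanity-checked outside LiveKit. A minimal sketch, where `fake_stream` is a made-up stand-in for the LLM output stream the agent would actually receive:

```python
import asyncio
from typing import AsyncIterable

REPLACEMENTS = {"example_word": "replacement"}


async def fake_stream() -> AsyncIterable[str]:
    # Stand-in for the streamed LLM output.
    for chunk in ["this is an ", "example_word in a stream"]:
        yield chunk


async def clean(text: AsyncIterable[str]) -> list[str]:
    cleaned = []
    async for chunk in text:
        # Same per-chunk replacement logic as in the agent above.
        for original, replacement in REPLACEMENTS.items():
            chunk = chunk.replace(original, replacement)
        cleaned.append(chunk)
    return cleaned


result = asyncio.run(clean(fake_stream()))
# result == ["this is an ", "replacement in a stream"]
```

One caveat with per-chunk replacement: a target word split across two chunks won't be matched, so if that matters you would need to buffer text across chunk boundaries before replacing.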
Also, if you want to strip out markdown formatting, you can use the built-in `filter_markdown` utility like this:
```python
from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings
from livekit.agents.voice.transcription.filters import filter_markdown


class CustomAgent(Agent):
    async def transcription_node(
        self, text: AsyncIterable[str], model_settings: ModelSettings
    ) -> AsyncIterable[str]:
        """Clean up the transcribed text by removing markdown formatting."""
        return filter_markdown(text)
```
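To illustrate what markdown stripping does to text before it is spoken or displayed, here is a small regex-based sketch. This is not LiveKit's `filter_markdown` implementation, just an illustrative stand-in handling a few common markdown constructs:

```python
import re


def strip_markdown(text: str) -> str:
    """Illustrative markdown stripper (NOT LiveKit's filter_markdown)."""
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)            # bold
    text = re.sub(r"\*(.+?)\*", r"\1", text)                 # italics
    text = re.sub(r"`([^`]+)`", r"\1", text)                 # inline code
    text = re.sub(r"^#+\s*", "", text, flags=re.MULTILINE)   # heading markers
    return text


cleaned = strip_markdown("# Title\n**bold** and `code`")
# cleaned == "Title\nbold and code"
```

The real utility also handles streamed input (`AsyncIterable[str]`), which is why it is convenient inside `transcription_node`.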
If you have any questions or want me to walk you through more examples, just let me know—I’m here to help! Sources: livekit-agents/livekit/agents/voice/agent_activity.py
r
If I am using LLMAdapter with LangGraph, then what can I do?
t
Hey there! When you're using `LLMAdapter` with LangGraph, you can clean text during streaming in a few ways:
1. Using `transcription_node`: You can override the `transcription_node` in your agent to tidy up the text as it streams. Here's a quick example:
```python
from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings


class MyAgent(Agent):
    async def transcription_node(
        self, text: AsyncIterable[str], model_settings: ModelSettings
    ) -> AsyncIterable[str]:
        async for delta in text:
            cleaned_delta = delta.replace("😘", "")  # Example: removing certain characters
            yield cleaned_delta
```
2. Using `before_tts_cb`: Another option is a callback that cleans the text right before it goes to the TTS engine. Check this out:
```python
from typing import AsyncIterable

from livekit.agents import VoicePipelineAgent


async def _before_tts_callback(
    agent: VoicePipelineAgent, text: str | AsyncIterable[str]
) -> str | AsyncIterable[str]:
    if isinstance(text, str):
        return text.replace("start", "").replace("end", "")
    else:
        async def process_stream() -> AsyncIterable[str]:
            async for chunk in text:
                yield chunk.replace("start", "").replace("end", "")
        return process_stream()
```
3. Using a LangGraph workflow: You can also add text-cleaning steps to your LangGraph workflow itself. Here's a simple example:
```python
from langgraph.graph import StateGraph
from livekit.agents import AgentSession
from livekit.plugins import langchain


def create_workflow():
    workflow = StateGraph(...)
    # Add nodes that clean text (e.g. removing special characters)
    return workflow.compile()


session = AgentSession(
    llm=langchain.LLMAdapter(
        graph=create_workflow()
    ),
    # ... other configuration ...
)
```
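A LangGraph node is just a function over the graph state, so the cleaning step itself needs no LiveKit code. A hypothetical sketch (the `messages` state key and the `clean_node` name are assumptions, following a common LangGraph state shape):

```python
def clean_node(state: dict) -> dict:
    """Hypothetical LangGraph node: cleans the most recent message in state."""
    messages = state["messages"]
    # Strip an unwanted character and surrounding whitespace from the last message.
    cleaned = messages[-1].replace("😘", "").strip()
    return {"messages": messages[:-1] + [cleaned]}


state = {"messages": ["hello 😘 "]}
new_state = clean_node(state)
# new_state == {"messages": ["hello"]}
```

You would then register it on the graph with something like `workflow.add_node("clean", clean_node)` and wire an edge from your LLM node into it, so cleaned text is what the `LLMAdapter` streams out.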
If you want to dive deeper or have any questions about these methods, just let me know! Sources: Pipeline nodes and hooks | LiveKit Docs | chain-of-thought-tts/README.mdoc
r
So in `before_tts_cb`, are we getting the whole text or chunks?
t
Hey there! The `before_tts_cb` in LiveKit is flexible: it can handle either the entire text at once or smaller chunks. Here's how it works:
• If you give it a string, it processes the whole text in one go.
• If you pass an `AsyncIterable[str]`, it works through the text bit by bit, which is great for making real-time tweaks during the text-to-speech process.
This way, it's compatible with both non-streaming and streaming text scenarios. If you want me to explain anything further, just let me know!
r
If i have text like
- Aug 15 Monday 10AM- 11AM
What is the best way to remove - from the start but not from 10-11
t
Hey there! If you want to remove the dash (`-`) just from the start of your text but keep the ones in places like `10AM-11AM`, you can do that with a little string manipulation. Here's a quick example in Python:
```python
text = "- Aug 15 Monday 10AM-11AM"
if text.startswith("-"):
    text = text[1:].lstrip()  # Drop the leading dash and any extra spaces
print(text)  # "Aug 15 Monday 10AM-11AM"
```
This way, only the dash at the very start gets removed and the rest stay just as they are. If you want me to help with anything else, just let me know—I’m here for you!
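If the leading dash is a bullet marker that can appear at the start of every line in a multi-line string, a regex anchored at line starts handles all of them at once without touching ranges like `10AM-11AM`:

```python
import re


def strip_leading_dashes(text: str) -> str:
    # Remove a dash (and surrounding spaces) only at the start of each line.
    return re.sub(r"^\s*-\s*", "", text, flags=re.MULTILINE)


result = strip_leading_dashes("- Aug 15 Monday 10AM-11AM\n- Aug 16 Tuesday 2PM-3PM")
# result == "Aug 15 Monday 10AM-11AM\nAug 16 Tuesday 2PM-3PM"
```

The `^` anchor with `re.MULTILINE` is what keeps the in-word dashes safe: only a dash immediately after a line start can match.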
r
.
t
Feel free to reach out if you have any more questions or need a hand with anything. I'm here for you! Sources: Getting Help | livekit/livekit_composite | DeepWiki | Slack Etiquette | Community & Support | livekit/livekit_composite | DeepWiki
r
What is the difference between the TTS node and the transcription node?
t
Hey there! Just wanted to help you understand the difference between the TTS (Text-to-Speech) node and the transcription node in LiveKit; they each play their own role in the pipeline:
• TTS Node:
  - Takes your text and turns it into audio using the TTS engine you've chosen.
  - It's all about speech synthesis: it transforms written words into spoken ones.
  - You can tweak it with things like text chunking, custom TTS engines, or audio processing.
  - It works after the LLM generates the text, creating the final audio you hear.
• Transcription Node:
  - Focuses on polishing up the text before it goes anywhere.
  - Helps clean up the transcription: fixing formatting, removing unwanted characters, that sort of thing.
  - Its job is to make sure the text looks good and is ready for display or any next steps.
  - It works on the text before it heads over to the TTS node or other outputs.
So, to sum it up: the TTS node is your go-to for turning text into speech, while the transcription node is there to tidy up and perfect the text itself. If you want me to explain anything more or have any questions, just let me know!
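The ordering described above (LLM output, then text cleanup, then synthesis) can be sketched as chained async stages. This is a conceptual stand-in with no LiveKit dependency, not the actual node implementations:

```python
import asyncio
from typing import AsyncIterable


async def llm_output() -> AsyncIterable[str]:
    # Stand-in for text streamed out of the LLM.
    for chunk in ["**Hello** ", "world"]:
        yield chunk


async def transcription_stage(text: AsyncIterable[str]) -> AsyncIterable[str]:
    # Plays the transcription node's role: clean the text before it moves on.
    async for chunk in text:
        yield chunk.replace("**", "")


async def tts_stage(text: AsyncIterable[str]) -> list[str]:
    # Plays the TTS node's role: consume cleaned text; a real TTS node
    # would synthesize audio frames here instead of collecting strings.
    spoken = []
    async for chunk in text:
        spoken.append(chunk)
    return spoken


spoken = asyncio.run(tts_stage(transcription_stage(llm_output())))
# spoken == ["Hello ", "world"]
```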
r
So does the transcription node yield deltas to the tts_node?
t
Absolutely! When you enable the TTS-aligned transcript feature in LiveKit, the transcription node can send deltas—those little incremental text updates—to the TTS node. This helps keep the audio perfectly in sync with the transcript. If you'd like, I’m happy to walk you through how to set this up or chat about any specific TTS settings you’re curious about! Sources: Text and transcriptions | LiveKit Docs | examples/voice_agents/timed_agent_transcript.py | Pipeline nodes and hooks | LiveKit Docs