boundless-summer-54220
02/16/2025, 6:15 PM

gray-monkey-83525
02/16/2025, 6:15 PM
You can use the before_tts_cb callback to process the text before it is sent to the TTS module. According to the VoicePipelineAgent documentation, the before_tts_cb callback lets you modify the text before it is sent to the TTS model, which is useful for customizing pronunciation or altering the text content.
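For instance, a pronunciation tweak of the kind such a callback might perform can be sketched as plain string substitution (the substitution table below is a made-up example, not part of any LiveKit API):

```python
# Minimal sketch of a text transformation a before_tts_cb-style hook might apply.
# The substitution table is hypothetical, purely for illustration.
PRONUNCIATIONS = {
    "LiveKit": "Live Kit",  # help the TTS voice split the brand name
    "API": "A P I",         # spell out the acronym
}

def adjust_pronunciation(text: str) -> str:
    # Apply each substitution in turn before the text reaches the TTS model
    for written, spoken in PRONUNCIATIONS.items():
        text = text.replace(written, spoken)
    return text

print(adjust_pronunciation("The LiveKit API is easy to use."))
# → The Live Kit A P I is easy to use.
```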
To send the text as a single chunk to the LiveKit frontend, you might consider using the data channel to send the entire message. As per the LiveKit data messages documentation, you can publish arbitrary data messages to other participants in the same room using LocalParticipant.publishData, or RoomService.SendData in the server API. This lets you send the entire text as a single message, which can then be formatted as Markdown on the frontend.
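Because data messages are delivered as raw bytes, a common pattern (an assumption here, not something LiveKit requires) is to wrap the text in a small JSON envelope so the frontend can distinguish message types. The envelope fields below are illustrative:

```python
import json

def encode_text_message(text: str) -> bytes:
    # Build the byte payload you would hand to publish_data
    return json.dumps({"type": "chat", "text": text}).encode("utf-8")

def decode_text_message(payload: bytes) -> str:
    # Mirror of the frontend's TextDecoder + JSON.parse step
    return json.loads(payload.decode("utf-8"))["text"]

payload = encode_text_message("**Hello** from the agent")
print(decode_text_message(payload))
# → **Hello** from the agent
```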
Would you like more detailed guidance on implementing this approach?
Sources:
- VoicePipelineAgent documentation
- LiveKit data messages documentation

boundless-summer-54220
02/16/2025, 6:20 PM

gray-monkey-83525
02/16/2025, 6:20 PM

boundless-summer-54220
02/16/2025, 6:21 PM

gray-monkey-83525
02/16/2025, 6:21 PM
Building on the minimal_assistant example, you can follow these steps:
1. Modify the before_tts_cb callback: In the minimal_assistant.py example, modify before_tts_cb to filter out specific parts of the response (e.g., Markdown) and prepare the text to be sent as a single chunk.
2. Use the data channel: Use LiveKit's data channel to send the filtered text as a single message to the frontend. This can be done with the publish_data method.
Here's a code snippet to illustrate these steps:
import json
from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli, llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero

async def before_tts_cb(assistant, text):
    # Filter out Markdown or other unwanted parts
    if isinstance(text, str):
        filtered_text = text.replace("**", "")  # Example: remove Markdown bold
    else:
        # text is an async stream of chunks; collect it into one string
        async def process_stream():
            full_text = ""
            async for chunk in text:
                full_text += chunk.replace("**", "")
            return full_text
        filtered_text = await process_stream()
    # Send the filtered text as a single chunk to the frontend
    await assistant.ctx.room.local_participant.publish_data(
        payload=json.dumps({"text": filtered_text}),
        reliable=True,
    )
    return filtered_text

async def entrypoint(ctx: JobContext):
    initial_ctx = llm.ChatContext().append(role="system", text="You are a voice assistant.")
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
    participant = await ctx.wait_for_participant()
    agent = VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=deepgram.STT(model="nova-3-general"),
        llm=openai.LLM(),
        tts=openai.TTS(),
        chat_ctx=initial_ctx,
        before_tts_cb=before_tts_cb,
    )
    agent.start(ctx.room, participant)

# Run the agent
if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
Frontend Handling: On the frontend, using the agents-playground, you can listen for data messages and format them as needed. Here's a basic example in JavaScript:
room.on(RoomEvent.DataReceived, (payload, participant, kind) => {
  const data = JSON.parse(new TextDecoder().decode(payload));
  if (data.text) {
    // Format and display the text as Markdown
    displayMarkdown(data.text);
  }
});

function displayMarkdown(text) {
  // Implement your Markdown rendering logic here
  console.log("Received text:", text);
}
gray-monkey-83525
02/16/2025, 6:22 PM

boundless-summer-54220
02/16/2025, 6:29 PM
assistant
AttributeError: 'VoicePipelineAgent' object has no attribute 'ctx'

gray-monkey-83525
02/16/2025, 6:29 PM
To resolve the AttributeError: 'VoicePipelineAgent' object has no attribute 'ctx', you should use the ctx from the JobContext directly instead of trying to access it from the VoicePipelineAgent. Here's how you can modify the code:
1. Access the ctx from the JobContext: Use the ctx object directly to publish data to the room.
Here's the corrected code snippet:
import json
from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli, llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero

async def before_tts_cb(assistant, text, ctx):
    # Filter out Markdown or other unwanted parts
    if isinstance(text, str):
        filtered_text = text.replace("**", "")  # Example: remove Markdown bold
    else:
        # text is an async stream of chunks; collect it into one string
        async def process_stream():
            full_text = ""
            async for chunk in text:
                full_text += chunk.replace("**", "")
            return full_text
        filtered_text = await process_stream()
    # Send the filtered text as a single chunk to the frontend,
    # reaching the room through the JobContext rather than the agent
    await ctx.room.local_participant.publish_data(
        payload=json.dumps({"text": filtered_text}),
        reliable=True,
    )
    return filtered_text

async def entrypoint(ctx: JobContext):
    initial_ctx = llm.ChatContext().append(role="system", text="You are a voice assistant.")
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
    participant = await ctx.wait_for_participant()
    agent = VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=deepgram.STT(model="nova-3-general"),
        llm=openai.LLM(),
        tts=openai.TTS(),
        chat_ctx=initial_ctx,
        # Bind the JobContext into the callback via a closure
        before_tts_cb=lambda assistant, text: before_tts_cb(assistant, text, ctx),
    )
    agent.start(ctx.room, participant)

# Run the agent
if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
In this corrected version, the before_tts_cb function accepts a ctx parameter, which is then used to publish data to the room. A lambda is used to pass ctx to the before_tts_cb callback.
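The binding trick itself is ordinary Python: the lambda closes over ctx and forwards it as an extra argument while keeping the two-argument signature the agent expects. A self-contained sketch, with a dummy object standing in for the real JobContext:

```python
import asyncio

class DummyCtx:
    """Stand-in for JobContext, just to demonstrate the binding."""
    room_name = "demo-room"

async def callback_with_ctx(assistant, text, ctx):
    # The real callback would publish via ctx.room; here we just tag the text
    return f"[{ctx.room_name}] {text}"

ctx = DummyCtx()
# Two-argument callable with ctx bound via closure, as in the snippet above
bound_cb = lambda assistant, text: callback_with_ctx(assistant, text, ctx)

print(asyncio.run(bound_cb(None, "hello")))
# → [demo-room] hello
```

Because callback_with_ctx is async, the lambda returns a coroutine; the framework (or asyncio.run here) awaits it.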
Would you like further assistance or clarification on this implementation?
Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1727139659.431669
- https://docs.livekit.io/python/livekit/agents/voice_assistant/index.html#args

boundless-summer-54220
02/16/2025, 6:39 PM

gray-monkey-83525
02/16/2025, 6:39 PM
VoicePipelineAgent
in the callback. If this doesn't help, please reach out to the LiveKit support team for further assistance.