# ask-ai
**User:**
The LiveKit agent playground example (from here: https://github.com/livekit/agents-playground) returns the LLM responses as chunks. I am using the `before_tts` callback to filter out specific parts of the response (e.g. Markdown) which I don't want the agent to speak. I need the text that I cut out from the message before passing it to the TTS module to be sent as a single chunk to the LiveKit frontend. How can I control the chunking of the text? Can I send a message as a whole? This should make it easier for me to format the Markdown in the frontend.
**RunLLM:**
It seems likely that you can control the chunking of text by using the `before_tts_cb` callback to process the text before it is sent to the TTS module. According to the VoicePipelineAgent documentation, the `before_tts_cb` callback allows you to modify the text before it is sent to the TTS model, which can be useful for customizing pronunciation or altering the text content.

To send the text as a single chunk to the LiveKit frontend, you might consider using the data channel to send the entire message. As per the LiveKit data messages documentation, you can publish arbitrary data messages to other participants in the same room using `LocalParticipant.publishData`, or `RoomService.SendData` in the server API. This allows you to send the entire text as a single message, which can then be formatted as Markdown on the frontend.

Would you like more detailed guidance on implementing this approach?

Sources:
- VoicePipelineAgent documentation
- LiveKit data messages documentation
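As a minimal illustration of the data-channel idea above, the publishing side might look like this sketch (it assumes a recent `livekit` Python SDK in which `publish_data` accepts `reliable` and `topic` keyword arguments; the `send_full_message` helper and the `"agent-transcript"` topic are made up for the example):

```python
import json

from livekit import rtc


async def send_full_message(room: rtc.Room, text: str) -> None:
    # Publish the complete, unchunked text as one reliable data message,
    # so the frontend receives it in a single DataReceived event.
    await room.local_participant.publish_data(
        payload=json.dumps({"text": text}),
        reliable=True,
        topic="agent-transcript",  # hypothetical topic, used for routing on the frontend
    )
```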
**User:**
@millions-hair-24956 show me an example considering the frontend is using the agents-playground and the backend is using the minimal-assistant example: https://github.com/livekit/agents/blob/main/examples/voice-pipeline-agent/minimal_assistant.py
**RunLLM:**
I'm here to help with questions related to LiveKit. If you have any specific questions about LiveKit or need assistance with its features, please feel free to ask!
**User:**
@millions-hair-24956 Yes, I would like more detailed guidance and an example of implementing this approach. Use LiveKit's minimal_assistant example as a base.
**RunLLM:**
To implement the approach of sending a single chunk of text to the LiveKit frontend while using the `minimal_assistant` example, you can follow these steps:

1. Modify the `before_tts_cb` callback: In the `minimal_assistant.py` example, you can modify `before_tts_cb` to filter out specific parts of the response (e.g., Markdown) and prepare the text to be sent as a single chunk.
2. Use the data channel: Utilize LiveKit's data channel to send the filtered text as a single message to the frontend. This can be done using the `publishData` method.

Here's a code snippet to illustrate these steps:
```python
import json

from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli, llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero


async def before_tts_cb(assistant, text):
    # Filter out Markdown or other unwanted parts
    if isinstance(text, str):
        filtered_text = text.replace("**", "")  # Example: remove Markdown bold
    else:
        # `text` is an async stream of chunks; collect it into one string
        async def process_stream():
            full_text = ""
            async for chunk in text:
                full_text += chunk.replace("**", "")
            return full_text

        filtered_text = await process_stream()

    # Send the filtered text as a single chunk to the frontend
    await assistant.ctx.room.local_participant.publish_data(
        payload=json.dumps({"text": filtered_text}),
        reliable=True,
    )

    return filtered_text


async def entrypoint(ctx: JobContext):
    initial_ctx = llm.ChatContext().append(role="system", text="You are a voice assistant.")
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
    participant = await ctx.wait_for_participant()

    agent = VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=deepgram.STT(model="nova-3-general"),
        llm=openai.LLM(),
        tts=openai.TTS(),
        chat_ctx=initial_ctx,
        before_tts_cb=before_tts_cb,
    )

    agent.start(ctx.room, participant)


# Run the agent
if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```
3. Frontend handling: On the frontend, using the agents-playground, you can listen for data messages and format them as needed. Here's a basic example in JavaScript:
```javascript
room.on(RoomEvent.DataReceived, (payload, participant, kind) => {
    // Decode the binary payload back into the JSON sent by the agent
    const data = JSON.parse(new TextDecoder().decode(payload));
    if (data.text) {
        // Format and display the text as Markdown
        displayMarkdown(data.text);
    }
});

function displayMarkdown(text) {
    // Implement your Markdown rendering logic here
    console.log("Received text:", text);
}
```
This setup ensures that the filtered text is sent as a single chunk to the frontend, where it can be formatted and displayed as Markdown.

Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1727139659.431669
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1733426789.731689
**User:**
@millions-hair-24956 this is not correct, the `assistant` raises `AttributeError: 'VoicePipelineAgent' object has no attribute 'ctx'`
**RunLLM:**
To address the `AttributeError: 'VoicePipelineAgent' object has no attribute 'ctx'`, you should use the `ctx` from the `JobContext` directly instead of trying to access it from the `VoicePipelineAgent`. Here's how you can modify the code:

1. Access the `ctx` from the `JobContext`: Use the `ctx` object directly to publish data to the room.

Here's the corrected code snippet:
```python
import json

from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli, llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero


async def before_tts_cb(assistant, text, ctx):
    # Filter out Markdown or other unwanted parts
    if isinstance(text, str):
        filtered_text = text.replace("**", "")  # Example: remove Markdown bold
    else:
        # `text` is an async stream of chunks; collect it into one string
        async def process_stream():
            full_text = ""
            async for chunk in text:
                full_text += chunk.replace("**", "")
            return full_text

        filtered_text = await process_stream()

    # Send the filtered text as a single chunk to the frontend,
    # using the JobContext's room rather than the agent
    await ctx.room.local_participant.publish_data(
        payload=json.dumps({"text": filtered_text}),
        reliable=True,
    )

    return filtered_text


async def entrypoint(ctx: JobContext):
    initial_ctx = llm.ChatContext().append(role="system", text="You are a voice assistant.")
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
    participant = await ctx.wait_for_participant()

    agent = VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=deepgram.STT(model="nova-3-general"),
        llm=openai.LLM(),
        tts=openai.TTS(),
        chat_ctx=initial_ctx,
        # Bind the JobContext into the callback via a closure
        before_tts_cb=lambda assistant, text: before_tts_cb(assistant, text, ctx),
    )

    agent.start(ctx.room, participant)


# Run the agent
if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```
In this corrected version, the `before_tts_cb` function is modified to accept the `ctx` parameter, which is then used to publish data to the room. The lambda function is used to pass the `ctx` to the `before_tts_cb` callback.

Would you like further assistance or clarification on this implementation?

Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1727139659.431669
- https://docs.livekit.io/python/livekit/agents/voice_assistant/index.html#args
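As a side note, if you prefer not to use a lambda, `functools.partial` can pre-bind the extra `ctx` argument. This sketch is equivalent to the lambda above and reuses the names from that snippet:

```python
from functools import partial

agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-3-general"),
    llm=openai.LLM(),
    tts=openai.TTS(),
    chat_ctx=initial_ctx,
    # partial pre-binds ctx as a keyword argument, leaving the
    # (assistant, text) signature that the agent invokes the callback with.
    before_tts_cb=partial(before_tts_cb, ctx=ctx),
)
```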
**User:**
@millions-hair-24956 explain why we can't use the `VoicePipelineAgent` that is passed to the callback function?
**RunLLM:**
I'm not able to find any data sources to answer your question. You might want to specify the context or functionality you are trying to achieve with the `VoicePipelineAgent` in the callback. If this doesn't help, please reach out to the LiveKit support team for further assistance.