boundless-summer-54220
02/17/2025, 8:19 AM

gray-monkey-83525
02/17/2025, 8:19 AM
You can use the before_llm_cb callback in the VoicePipelineAgent. This callback allows you to modify the chat context before it is sent to the LLM, which can be useful for filtering or altering the text. You can define a function that processes the chat context and pass it to the before_llm_cb parameter when initializing the VoicePipelineAgent.
Here's a basic example of how you might set this up:
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, deepgram, openai, silero

def modify_chat_context(assistant, chat_ctx):
    # Example: remove certain words or phrases from each message
    for message in chat_ctx.messages:
        if isinstance(message.content, str):
            message.content = message.content.replace("unwanted_phrase", "")
    return chat_ctx

agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(),
    llm=openai.LLM(),
    tts=cartesia.TTS(),
    before_llm_cb=modify_chat_context,
    before_tts_cb=replace_words,  # assuming you have a function to modify TTS text (sketched below)
)
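As a side note (not from the thread), a possible replace_words helper for before_tts_cb could look like the sketch below. It assumes the 0.x callback signature, where the callback receives the agent and either a plain string or an async iterable of text chunks; treat that signature as an assumption rather than confirmed API.

from typing import AsyncIterable

def replace_words(agent, source):
    # Hypothetical before_tts_cb: filter the text before it reaches TTS.
    # The (agent, str | AsyncIterable[str]) signature is assumed, not confirmed.
    if isinstance(source, str):
        return source.replace("unwanted_phrase", "")

    async def _filter(chunks: AsyncIterable[str]):
        async for chunk in chunks:
            yield chunk.replace("unwanted_phrase", "")

    return _filter(source)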
Would you like more details on implementing this?
Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1727139659.431669
- https://docs.livekit.io/agents/voice-agent/

boundless-summer-54220
02/17/2025, 8:35 AM

gray-monkey-83525
02/17/2025, 8:35 AM
The before_llm_cb callback is used to modify the chat context before it is sent to the LLM, not the response from the LLM. In other words, it lets you alter the input to the LLM, but not the output that is sent back to the frontend. To modify the LLM's response before it reaches the frontend, you need to intercept it after the LLM generates it, typically in the code that handles the output of the llm.chat() call.
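As a rough illustration (not from the thread), here is a minimal sketch of collecting the llm.chat() stream and filtering the text before using it. The helper names and the standalone use of the openai.LLM plugin are assumptions for illustration only:

from livekit.agents import llm
from livekit.plugins import openai

def modify_llm_response(text: str) -> str:
    # Illustrative filter: strip anything you do not want the frontend to see
    return text.replace("unwanted_phrase", "")

async def chat_and_filter(chat_ctx: llm.ChatContext) -> str:
    # llm.chat() returns a stream of chunks; collect them into one string
    stream = openai.LLM().chat(chat_ctx=chat_ctx)
    parts = []
    async for chunk in stream:
        for choice in chunk.choices:
            if choice.delta.content:
                parts.append(choice.delta.content)
    return modify_llm_response("".join(parts))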
Would you like more details on how to implement this?
Sources:
- https://docs.livekit.io/agents/voice-agent/

boundless-summer-54220
02/17/2025, 9:23 AM

gray-monkey-83525
02/17/2025, 9:23 AM
In the minimal_assistant.py example, you can intercept the response after the LLM generates it and before it is sent to the frontend. Here's a step-by-step guide on how to implement this:
1. Modify the LLM Response: You can create a custom function to process the LLM's response. This function can be called after the LLM generates its response but before it is sent to the frontend.
2. Integrate the Custom Function: Use the before_llm_cb to modify the input to the LLM and then process the output after the LLM call.
Here's an example implementation:
import asyncio
import logging

from livekit.agents import AutoSubscribe, JobContext, llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, deepgram, openai, silero

# Load environment variables and set up logging
logger = logging.getLogger("voice-assistant")

# Function to modify the LLM's response
def modify_llm_response(response_text):
    # Example: remove certain words or phrases from the response
    return response_text.replace("unwanted_phrase", "")

# Function to modify the chat context before sending it to the LLM
def modify_chat_context(assistant, chat_ctx):
    # Example: modify the chat context if needed
    return chat_ctx

async def entrypoint(ctx: JobContext):
    initial_ctx = llm.ChatContext().append(
        role="system",
        text="You are a voice assistant created by LiveKit. Your interface with users will be voice.",
    )

    # Connect to the room and wait for a participant
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
    participant = await ctx.wait_for_participant()

    # Initialize the VoicePipelineAgent
    agent = VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=deepgram.STT(model="nova-3-general"),
        llm=openai.LLM(),
        tts=cartesia.TTS(),
        chat_ctx=initial_ctx,
        before_llm_cb=modify_chat_context,
    )

    # Start the agent
    agent.start(ctx.room, participant)

    # Process the LLM's response
    async def process_llm_response():
        while True:
            # Collect the streamed LLM response into a single string
            stream = agent.llm.chat(chat_ctx=agent.chat_ctx)
            parts = []
            async for chunk in stream:
                for choice in chunk.choices:
                    if choice.delta.content:
                        parts.append(choice.delta.content)
            modified_response = modify_llm_response("".join(parts))
            # Send the modified response to the frontend
            await agent.say(modified_response)

    # Run the response processing in the background
    asyncio.create_task(process_llm_response())
In this example, modify_llm_response is a function that processes the LLM's response to remove or alter specific text. The process_llm_response function handles the LLM's output, modifies it, and then sends it to the frontend using agent.say.
Would you like further clarification on any part of this implementation?
Sources:
- VoicePipelineAgent Documentation

boundless-summer-54220
02/17/2025, 9:26 AM

gray-monkey-83525
02/17/2025, 9:26 AM
The process_llm_response() function is designed to run in a loop and continuously process responses from the LLM: it asynchronously handles the LLM's output, modifies it, and then sends it to the frontend. However, in the provided example it is not correctly integrated into the VoicePipelineAgent workflow. process_llm_response() is started as a background task with asyncio.create_task(), so it runs concurrently with other tasks, but the example lacks a proper mechanism to trigger it when a new LLM response arrives. Typically, you would hook into the agent's event loop or response handling mechanism to ensure each response is processed as it is generated.
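As a rough sketch of that idea (not from the thread), one option is to react to an agent event instead of a free-running loop. This fragment reuses agent, logger, and modify_llm_response from the example above, and the "agent_speech_committed" event name and ChatMessage payload are assumptions about the 0.x VoicePipelineAgent event API:

# Hypothetical event hook: run the filter whenever the agent commits a reply.
# "agent_speech_committed" and its ChatMessage payload are assumptions here.
def on_agent_speech_committed(msg: llm.ChatMessage):
    # msg.content holds the text the agent just produced; log or post-process it
    logger.info("agent replied: %s", modify_llm_response(msg.content))

agent.on("agent_speech_committed", on_agent_speech_committed)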
Would you like more detailed guidance on how to properly integrate this function into the agent's workflow?
Sources:
- https://docs.livekit.io/python/livekit/agents/pipeline/pipeline_agent.html#args

boundless-summer-54220
02/17/2025, 9:28 AM

gray-monkey-83525
02/17/2025, 9:28 AM
You can use the before_llm_cb to modify the input and then handle the output directly after the LLM call. Here's how you can achieve this using the minimal_assistant.py example as a base:
1. Modify the LLM Response: Create a function to process the LLM's response.
2. Integrate the Response Handling: Use the llm.chat() method to get the response, modify it, and then send the modified response to the frontend.
Here's a working example:
import asyncio
import logging

from livekit.agents import AutoSubscribe, JobContext, llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, deepgram, openai, silero

# Load environment variables and set up logging
logger = logging.getLogger("voice-assistant")

# Function to modify the LLM's response
def modify_llm_response(response_text):
    # Example: remove certain words or phrases from the response
    return response_text.replace("unwanted_phrase", "")

# Function to modify the chat context before sending it to the LLM
def modify_chat_context(assistant, chat_ctx):
    # Example: modify the chat context if needed
    return chat_ctx

async def entrypoint(ctx: JobContext):
    initial_ctx = llm.ChatContext().append(
        role="system",
        text="You are a voice assistant created by LiveKit. Your interface with users will be voice.",
    )

    # Connect to the room and wait for a participant
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
    participant = await ctx.wait_for_participant()

    # Initialize the VoicePipelineAgent
    agent = VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=deepgram.STT(model="nova-3-general"),
        llm=openai.LLM(),
        tts=cartesia.TTS(),
        chat_ctx=initial_ctx,
        before_llm_cb=modify_chat_context,
    )

    # Start the agent
    agent.start(ctx.room, participant)

    # Function to handle and modify the LLM response
    async def handle_llm_response():
        while True:
            # Wait for a new response from the LLM
            response_stream = agent.llm.chat(chat_ctx=agent.chat_ctx)

            async def filtered_text():
                # Yield each text chunk after modifying it
                async for chunk in response_stream:
                    for choice in chunk.choices:
                        if choice.delta.content:
                            yield modify_llm_response(choice.delta.content)

            # Send the modified response to the frontend as it streams
            await agent.say(filtered_text())

    # Run the response handling in the background
    asyncio.create_task(handle_llm_response())
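Not shown in the original snippet, but to actually run either entrypoint you would typically start a worker with the standard livekit-agents CLI helper. A minimal sketch, assuming the 0.x cli/WorkerOptions API:

from livekit.agents import WorkerOptions, cli

if __name__ == "__main__":
    # Start a worker process that runs entrypoint() for each assigned job
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
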
gray-monkey-83525
02/17/2025, 9:29 AM
- Response Handling: The handle_llm_response function waits for new LLM output, modifies it with modify_llm_response, and sends the modified response to the frontend using agent.say.
- Integration: The handle_llm_response function is started as a background task using asyncio.create_task(), ensuring it runs concurrently with other tasks.
This setup ensures that only the modified response is sent to the frontend, not the original LLM response.
Would you like further clarification on any part of this implementation?
Sources:
- VoicePipelineAgent Documentation