# ask-ai
m
@millions-hair-24956 Using the `VoicePipelineAgent`, I'm using `before_llm_cb` to modify text before sending it to the LLM. I want the modification to persist in the context forever, too. But right now, the modifications I make disappear later, when more text comes in and it's generating a new response. How can I do this?
g
It seems likely that the `before_llm_cb` callback is used to modify the chat context before it is sent to the LLM, but these modifications are not automatically persisted in the chat context for future interactions. According to the LiveKit documentation, `before_llm_cb` allows you to modify the `ChatContext` before it is sent to the LLM model, which is useful for adding extra context or adjusting the context based on the conversation. However, to ensure that modifications persist, you may need to manually update the `ChatContext` object within the callback and ensure these changes are saved back to the context used by the `VoicePipelineAgent`. Would you like more detailed guidance on how to implement this?

Sources:
- LiveKit documentation
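For orientation, here is a minimal sketch of the callback shape this thread is discussing, assuming the v0.x `livekit-agents` API; the exact signature and accepted return values should be verified against your installed version:

```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent

def before_llm_cb(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # chat_ctx is the context about to be sent to the LLM for this turn;
    # mutations here affect the upcoming generation. Returning False
    # cancels the reply entirely (this matters later in the thread).
    chat_ctx.append(role="system", text="extra per-turn instructions")
```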
m
Yes, show me how to implement this @millions-hair-24956
g
To ensure that modifications made in `before_llm_cb` persist in the `ChatContext`, you can manually update the `ChatContext` within the callback and ensure these changes are saved back to the context used by the `VoicePipelineAgent`. Here's a step-by-step guide on how to implement this:

1. Define the `before_llm_cb` callback: this function will modify the `ChatContext` and ensure the changes are saved.
2. Update the `ChatContext`: within the callback, modify the `ChatContext` as needed and ensure these changes are appended to the context.
3. Persist changes: after modifying the context, ensure that the updated context is used for future interactions.

Here's an example implementation:
```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, deepgram, openai, silero

# Initial chat context
initial_ctx = llm.ChatContext().append(role="system", text="<your prompt>")

# Define the before_llm_cb callback
def before_llm_cb(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # Modify the chat context
    modified_text = "Modified: " + chat_ctx.messages[-1].content
    chat_ctx.messages[-1].content = modified_text
    
    # Persist the changes by appending the modified message
    assistant.chat_ctx.messages.append(chat_ctx.messages[-1])

    return chat_ctx

# Create the VoicePipelineAgent
agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=openai.LLM(),
    tts=cartesia.TTS(),
    chat_ctx=initial_ctx,
    before_llm_cb=before_llm_cb,
    allow_interruptions=True,
    interrupt_speech_duration=0.5,
    interrupt_min_words=0,
    min_endpointing_delay=0.5,
)

# Start the agent; room and participant come from the job's entrypoint context
agent.start(room, participant)
```

In this example, the `before_llm_cb` modifies the last message in the `ChatContext` and appends it back to the context to ensure the changes persist. This way, the modifications will be part of the context for future interactions.

Sources:
- LiveKit documentation
m
But won't the system also append the message itself as well? So we would have the message duplicated, one with "Modified:" and one without? Or no? @millions-hair-24956
g
It seems likely that if you manually append the modified message to the `ChatContext`, and the system also appends the original message, you might end up with duplicate entries. To avoid this, you should modify the existing message in place rather than appending a new one. This way, the `ChatContext` retains only the modified version of the message. Here's how you can adjust the implementation to modify the message in place:
```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, deepgram, openai, silero

# Initial chat context
initial_ctx = llm.ChatContext().append(role="system", text="<your prompt>")

# Define the before_llm_cb callback
def before_llm_cb(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # Modify the last message in place
    last_message = chat_ctx.messages[-1]
    last_message.content = "Modified: " + last_message.content

    # No need to append, as we're modifying in place
    return chat_ctx

# Create the VoicePipelineAgent
agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=openai.LLM(),
    tts=cartesia.TTS(),
    chat_ctx=initial_ctx,
    before_llm_cb=before_llm_cb,
    allow_interruptions=True,
    interrupt_speech_duration=0.5,
    interrupt_min_words=0,
    min_endpointing_delay=0.5,
)

# Start the agent
agent.start(room, participant)
```

By modifying the message in place, you ensure that the `ChatContext` only contains the modified version, preventing duplication.

Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1730811222.681569
m
But then since we're only modifying ChatContext in place, in the future the changes are not persisted. This is the issue I'm talking about @millions-hair-24956
g
It seems likely that to ensure changes persist across future interactions, you need to update the `ChatContext` in a way that the modifications are retained for subsequent messages. This involves ensuring that the `ChatContext` used by the `VoicePipelineAgent` is consistently updated with the modified content. Here's a refined approach to ensure persistence:

1. Modify the message: update the message content in place within the `before_llm_cb` callback.
2. Ensure context consistency: after modifying the message, ensure that the `VoicePipelineAgent` uses this updated `ChatContext` for future interactions.

Here's how you can implement this:
```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, deepgram, openai, silero

# Initial chat context
initial_ctx = llm.ChatContext().append(role="system", text="<your prompt>")

# Define the before_llm_cb callback
def before_llm_cb(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # Modify the last message in place
    last_message = chat_ctx.messages[-1]
    last_message.content = "Modified: " + last_message.content

    # Update the agent's chat context to ensure persistence
    assistant.chat_ctx = chat_ctx

    return chat_ctx

# Create the VoicePipelineAgent
agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=openai.LLM(),
    tts=cartesia.TTS(),
    chat_ctx=initial_ctx,
    before_llm_cb=before_llm_cb,
    allow_interruptions=True,
    interrupt_speech_duration=0.5,
    interrupt_min_words=0,
    min_endpointing_delay=0.5,
)

# Start the agent
agent.start(room, participant)
```

By updating `agent.chat_ctx` with the modified context, you ensure that the changes are retained for future interactions.

Sources:
- LiveKit documentation
m
@millions-hair-24956 I tried that but it seems like LiveKit internally appends the message AFTER the `before_llm_cb`, so we end up duplicating the message. Like this:
```
modified: message1
modified: message2
message2
```
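That output is consistent with the pipeline copying the context before invoking the callback and committing the raw user message afterward. A self-contained toy model (illustrative only, not actual LiveKit source) that reproduces the observed duplication:

```python
# Toy model of the suspected ordering behind the duplication: the
# callback runs on a copy, assistant.chat_ctx is replaced with that
# copy, and the raw user message is committed afterward.
history = ["modified: message1"]           # persisted context after turn 1
user_msg = "message2"

copied = history + [user_msg]              # pipeline copies ctx + new msg
copied[-1] = "modified: " + copied[-1]     # before_llm_cb edits the copy
history = copied                           # assistant.chat_ctx = chat_ctx
history.append(user_msg)                   # pipeline commits raw message

print("\n".join(history))
# modified: message1
# modified: message2
# message2
```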
g
It seems likely that the `before_llm_cb` is indeed modifying the message, but the system appends the original message after the callback, leading to duplication. To address this, you can modify the message in place and ensure that the system does not append the original message again. This can be done by managing the `ChatContext` carefully to prevent duplication. Here's a refined approach:

1. Modify the message: update the message content in place within the `before_llm_cb` callback.
2. Prevent duplication: ensure that the system does not append the original message again by managing the `ChatContext` correctly.

Here's how you can implement this:
```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, deepgram, openai, silero

# Initial chat context
initial_ctx = llm.ChatContext().append(role="system", text="<your prompt>")

# Define the before_llm_cb callback
def before_llm_cb(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # Modify the last message in place
    last_message = chat_ctx.messages[-1]
    last_message.content = "Modified: " + last_message.content

    # Return False to prevent the system from appending the original message
    return False

# Create the VoicePipelineAgent
agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=openai.LLM(),
    tts=cartesia.TTS(),
    chat_ctx=initial_ctx,
    before_llm_cb=before_llm_cb,
    allow_interruptions=True,
    interrupt_speech_duration=0.5,
    interrupt_min_words=0,
    min_endpointing_delay=0.5,
)

# Start the agent
agent.start(room, participant)
```

By returning `False` from the `before_llm_cb`, you can prevent the system from appending the original message again, thus avoiding duplication.

Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1730811222.681569
m
But returning False makes the agent not respond @millions-hair-24956
g
Returning `False` from the `before_llm_cb` indeed cancels the synthesis of the reply, which is not the desired outcome if you want the agent to respond. To handle this, you need to modify the `ChatContext` in a way that ensures the changes persist without preventing the agent from responding. Here's a revised approach:

1. Modify the message: update the message content in place within the `before_llm_cb` callback.
2. Ensure context consistency: modify the `ChatContext` directly and ensure that the system uses this updated context for generating responses.

Here's how you can implement this:
```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, deepgram, openai, silero

# Initial chat context
initial_ctx = llm.ChatContext().append(role="system", text="<your prompt>")

# Define the before_llm_cb callback
def before_llm_cb(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # Modify the last message in place
    last_message = chat_ctx.messages[-1]
    last_message.content = "Modified: " + last_message.content

    # Return the modified chat context
    return chat_ctx

# Create the VoicePipelineAgent
agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=openai.LLM(),
    tts=cartesia.TTS(),
    chat_ctx=initial_ctx,
    before_llm_cb=before_llm_cb,
    allow_interruptions=True,
    interrupt_speech_duration=0.5,
    interrupt_min_words=0,
    min_endpointing_delay=0.5,
)

# Start the agent
agent.start(room, participant)
```

By modifying the `ChatContext` in place and returning it, you ensure that the changes are used for generating responses without preventing the agent from responding.

Sources:
- https://docs.livekit.io/python/livekit/agents/pipeline/pipeline_agent.html#args
- https://docs.livekit.io/python/livekit/agents/pipeline/index.html#args
m
Tough
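For completeness, one workaround consistent with the behavior described above: keep `before_llm_cb` for the current generation, and persist the change by mutating the message at the moment the agent commits it. This is a minimal sketch, assuming the v0.x `livekit-agents` API in which `VoicePipelineAgent` emits a `user_speech_committed` event carrying the `ChatMessage` it just appended to `agent.chat_ctx`; the event name and the in-place mutation behavior are assumptions to verify against your installed version:

```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, deepgram, openai, silero

initial_ctx = llm.ChatContext().append(role="system", text="<your prompt>")

def before_llm_cb(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # Affects only the copy used for this one generation; the raw user
    # message is committed to the agent's context after this callback.
    last_message = chat_ctx.messages[-1]
    last_message.content = "Modified: " + last_message.content

agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=openai.LLM(),
    tts=cartesia.TTS(),
    chat_ctx=initial_ctx,
    before_llm_cb=before_llm_cb,
)

# Assumption to verify: the event fires with the same ChatMessage object
# that was appended to agent.chat_ctx, so editing it in place persists
# the modification for all future turns without duplicating messages.
@agent.on("user_speech_committed")
def on_user_speech_committed(msg: llm.ChatMessage):
    msg.content = "Modified: " + msg.content

# room and participant come from the job's entrypoint context
agent.start(room, participant)
```

With this split, the copy the LLM sees and the persisted history stay in sync, and nothing is appended twice.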