# ask-ai
@millions-hair-24956 I have an agent answering phone calls using VoicePipelineAgent, but LLM calls take time. How can I add another agent to the same room that provides a filler message based on the last thing the user said? How can I achieve this?
It seems likely that you can achieve this by using the `VoicePipelineAgent` to handle the main conversation and adding a secondary agent to provide filler messages. You can use the `before_llm_cb` callback to modify the chat context or insert filler messages before the LLM processes the input, which lets you give the user an immediate response while the LLM handles the main query. According to the LiveKit documentation, `before_llm_cb` is the hook for altering the chat context or injecting additional messages in a `VoicePipelineAgent`, which is useful for adding filler content while waiting for the LLM to respond. Would you like more detailed guidance on setting up the callback or managing multiple agents in a room?

Sources:
- LiveKit documentation
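For reference, here is a minimal sketch of the callback contract (assuming the livekit-agents v0.x signature, where the callback receives the agent and the pending chat context and may be sync or async):

```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent

async def before_llm_cb(agent: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # Inspect or mutate chat_ctx here, before inference starts.
    # Returning None tells the agent to proceed with default LLM processing.
    return None
```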
@millions-hair-24956 Can you provide a code snippet for this?
Certainly! Below is a code snippet that demonstrates how to use the `before_llm_cb` callback in the `VoicePipelineAgent` to insert a filler message before the LLM processes the input. This helps manage the conversation flow by giving the user immediate feedback.

```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero

# Define a callback function to insert a filler message
async def before_llm_cb(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # Insert a filler message based on the last user message
    last_message = chat_ctx.messages[-1]
    if last_message.role == "user":
        filler_message = "Please hold on while I process your request."
        # say() queues the filler for TTS playback; allow_interruptions lets
        # the caller talk over it
        await assistant.say(filler_message, allow_interruptions=True)
    return None  # Continue with the default LLM processing

# Initialize the VoicePipelineAgent
agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=openai.LLM(),
    tts=openai.TTS(),
    chat_ctx=llm.ChatContext().append(role="system", text="You are a helpful assistant."),
    before_llm_cb=before_llm_cb
)

# Start the agent in a room with a participant (room and participant
# typically come from the JobContext in your entrypoint)
agent.start(room, participant)
```
This code sets up a `VoicePipelineAgent` with a `before_llm_cb` callback that provides a filler message to the user. The `say` method delivers the filler, letting the agent continue processing the user's request in the background. Would you like more details on any specific part of this setup?

Sources:
- LiveKit documentation
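If a single fixed phrase feels too generic but an extra LLM call is too slow, a lightweight middle ground is to pick a canned filler by keyword-matching the last user message. A minimal sketch (the keyword table and `pick_filler` helper are illustrative, not part of the LiveKit API):

```python
import random

# Hypothetical keyword-to-filler table; tune the phrases for your domain
FILLERS = {
    "price": ["Let me pull up the pricing details for you."],
    "booking": ["One moment while I check the schedule."],
}
DEFAULT_FILLERS = ["Give me just a second.", "Let me look into that."]

def pick_filler(last_message: str) -> str:
    # Return a filler matched to the user's topic, or a generic one
    text = last_message.lower()
    for keyword, phrases in FILLERS.items():
        if keyword in text:
            return random.choice(phrases)
    return random.choice(DEFAULT_FILLERS)
```

Inside `before_llm_cb` you would then call `await assistant.say(pick_filler(last_message.content), allow_interruptions=True)`.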
@millions-hair-24956 How can I use an agent that uses an LLM to generate filler messages based on what the user said in their last message?
To generate filler messages based on the user's last message, you can create a custom LLM wrapper that processes the user's input and produces a contextually relevant filler. This can be integrated into the `VoicePipelineAgent` using the `before_llm_cb` callback. Here's a code snippet to illustrate this approach:

```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero

# Define a custom LLM wrapper that can also generate filler messages
class CustomLLM(llm.LLM):
    def __init__(self):
        super().__init__()
        self._client = openai.LLM(model="gpt-3.5-turbo")

    def chat(self, **kwargs):
        # Delegate normal inference to the wrapped plugin LLM so this class
        # can still serve as the pipeline's main LLM
        return self._client.chat(**kwargs)

    async def generate_filler(self, last_message: str) -> str:
        # Use the LLM to generate a filler message based on the last user
        # message, collecting the streamed chunks into a single string
        ctx = llm.ChatContext().append(role="user", text=last_message)
        stream = self._client.chat(chat_ctx=ctx, temperature=0.5)
        content = ""
        async for chunk in stream:
            # chunk layout assumes the livekit-agents v0.x ChatChunk shape
            if chunk.choices and chunk.choices[0].delta.content:
                content += chunk.choices[0].delta.content
        return f"While I process your request, here's something to consider: {content}"

# Define a callback function to insert a filler message before inference
async def before_llm_cb(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # Get the last user message (ChatMessage stores its text in .content)
    last_message = chat_ctx.messages[-1].content if chat_ctx.messages else ""
    if last_message:
        # Generate a filler message using the custom LLM and speak it
        filler_message = await assistant.llm.generate_filler(last_message)
        await assistant.say(filler_message, allow_interruptions=True)
    return None  # Continue with the default LLM processing

# Initialize the VoicePipelineAgent with the custom LLM
agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=CustomLLM(),
    tts=openai.TTS(),
    chat_ctx=llm.ChatContext().append(role="system", text="You are a helpful assistant."),
    before_llm_cb=before_llm_cb
)

# Start the agent in a room with a participant (from the JobContext in your entrypoint)
agent.start(room, participant)
```
This snippet uses a custom LLM wrapper to generate filler messages based on the user's last message. The `before_llm_cb` callback inserts these messages into the conversation flow, providing immediate feedback while the main LLM processes the user's request. Would you like more details on any specific part of this setup?

Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1730355369.957779
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1732128458.518759
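One caveat: the filler itself costs an extra LLM round trip. A sketch of bounding that cost with a timeout and a canned fallback (plain asyncio, nothing LiveKit-specific; `filler_with_timeout` is an illustrative helper, not part of the API):

```python
import asyncio

FALLBACK_FILLER = "One moment while I look into that."

async def filler_with_timeout(assistant, last_message: str,
                              timeout: float = 1.0) -> str:
    # Fall back to a canned phrase if filler generation is too slow
    try:
        return await asyncio.wait_for(
            assistant.llm.generate_filler(last_message), timeout=timeout
        )
    except asyncio.TimeoutError:
        return FALLBACK_FILLER
```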
@millions-hair-24956 The filler message is getting played after the response from the LLM, not before the response while the LLM is processing.
To make the filler message play before the LLM response, generate and speak it as soon as the user's message is received, and only then let the LLM processing continue. The `before_llm_cb` callback runs before inference starts, so awaiting `say` there queues the filler first. Here's how you can adjust the code:

```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero

# Define a custom LLM wrapper that can also generate filler messages
class CustomLLM(llm.LLM):
    def __init__(self):
        super().__init__()
        self._client = openai.LLM(model="gpt-3.5-turbo")

    def chat(self, **kwargs):
        # Delegate normal inference to the wrapped plugin LLM so this class
        # can still serve as the pipeline's main LLM
        return self._client.chat(**kwargs)

    async def generate_filler(self, last_message: str) -> str:
        # Use the LLM to generate a filler message based on the last user
        # message, collecting the streamed chunks into a single string
        ctx = llm.ChatContext().append(role="user", text=last_message)
        stream = self._client.chat(chat_ctx=ctx, temperature=0.5)
        content = ""
        async for chunk in stream:
            # chunk layout assumes the livekit-agents v0.x ChatChunk shape
            if chunk.choices and chunk.choices[0].delta.content:
                content += chunk.choices[0].delta.content
        return f"While I process your request, here's something to consider: {content}"

# Define a callback function to insert a filler message before inference
async def before_llm_cb(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # Get the last user message (ChatMessage stores its text in .content)
    last_message = chat_ctx.messages[-1].content if chat_ctx.messages else ""
    if last_message:
        # Generate a filler message using the custom LLM and speak it
        filler_message = await assistant.llm.generate_filler(last_message)
        await assistant.say(filler_message, allow_interruptions=True)
    return None  # Proceed with the default LLM processing

# Initialize the VoicePipelineAgent with the custom LLM
agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=CustomLLM(),
    tts=openai.TTS(),
    chat_ctx=llm.ChatContext().append(role="system", text="You are a helpful assistant."),
    before_llm_cb=before_llm_cb
)

# Start the agent in a room with a participant (from the JobContext in your entrypoint)
agent.start(room, participant)
```
In this setup, the `before_llm_cb` callback generates and plays the filler message when the user's message is received, and `return None` lets the default LLM processing continue after the filler has been queued. Would you like more details on any specific part of this setup?

Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1730738095.494029
- https://docs.livekit.io/python/livekit/agents/pipeline/pipeline_agent.html#args
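Since the underlying complaint is LLM latency, it can also help to shrink the prompt the main LLM sees on each turn. A sketch of trimming inside `before_llm_cb` (the in-place truncation is an assumption; adapt it to however your ChatContext is built):

```python
MAX_MESSAGES = 10

async def trim_ctx_cb(assistant, chat_ctx):
    # Keep the system prompt plus only the most recent turns to reduce
    # prompt size and, with it, time to first token
    if len(chat_ctx.messages) > MAX_MESSAGES:
        system_msgs = [m for m in chat_ctx.messages if m.role == "system"]
        recent = [m for m in chat_ctx.messages[-MAX_MESSAGES:] if m.role != "system"]
        chat_ctx.messages[:] = system_msgs + recent
    return None
```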
@millions-hair-24956 Using the code you provided, the filler message is still played after the response from the LLM, not before it while the LLM is processing.
It seems likely that the issue arises because the filler message is being queued behind the LLM response due to the asynchronous nature of the operations; in particular, generating the filler with a second LLM call adds its own delay before anything is spoken. A revised approach:

1. Generate and speak the filler message as soon as the user's message is received, keeping it out of the chat context.
2. Only then return from the callback, so the main LLM processing proceeds.

Here's how you can implement this:

```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero

# Define a custom LLM wrapper that can also generate filler messages
class CustomLLM(llm.LLM):
    def __init__(self):
        super().__init__()
        self._client = openai.LLM(model="gpt-3.5-turbo")

    def chat(self, **kwargs):
        # Delegate normal inference to the wrapped plugin LLM so this class
        # can still serve as the pipeline's main LLM
        return self._client.chat(**kwargs)

    async def generate_filler(self, last_message: str) -> str:
        # Use the LLM to generate a filler message based on the last user
        # message, collecting the streamed chunks into a single string
        ctx = llm.ChatContext().append(role="user", text=last_message)
        stream = self._client.chat(chat_ctx=ctx, temperature=0.5)
        content = ""
        async for chunk in stream:
            # chunk layout assumes the livekit-agents v0.x ChatChunk shape
            if chunk.choices and chunk.choices[0].delta.content:
                content += chunk.choices[0].delta.content
        return f"While I process your request, here's something to consider: {content}"

# Define a callback function to insert a filler message before inference
async def before_llm_cb(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # Get the last user message (ChatMessage stores its text in .content)
    last_message = chat_ctx.messages[-1].content if chat_ctx.messages else ""
    if last_message:
        # Generate the filler, then queue it for playback before returning,
        # so it is spoken ahead of the main LLM response; keep it out of the
        # chat context so it doesn't pollute later turns
        filler_message = await assistant.llm.generate_filler(last_message)
        await assistant.say(
            filler_message,
            allow_interruptions=True,
            add_to_chat_ctx=False,
        )
    return None  # Proceed with the default LLM processing

# Initialize the VoicePipelineAgent with the custom LLM
agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=CustomLLM(),
    tts=openai.TTS(),
    chat_ctx=llm.ChatContext().append(role="system", text="You are a helpful assistant."),
    before_llm_cb=before_llm_cb
)

# Start the agent in a room with a participant (from the JobContext in your entrypoint)
agent.start(room, participant)
```
In this setup, the `before_llm_cb` callback queues the filler message before the main LLM starts processing the user's message, so the user hears immediate feedback. If the filler still arrives late, the time spent generating it is the most likely cause. Would you like more details on any specific part of this setup?

Sources:
- LiveKit documentation
- LiveKit Slack discussion
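If generating the filler is itself too slow to sit on the critical path, another option is to fire the whole filler path as a background task so the main inference is never delayed. This is only a sketch; it assumes the agent plays queued speech in the order `say()` is called:

```python
import asyncio

async def before_llm_cb(assistant, chat_ctx):
    last_message = chat_ctx.messages[-1].content if chat_ctx.messages else ""

    async def speak_filler():
        # Runs concurrently with the main inference, so filler-generation
        # latency no longer delays the start of the main LLM call
        filler = await assistant.llm.generate_filler(last_message)
        await assistant.say(filler, allow_interruptions=True, add_to_chat_ctx=False)

    if last_message:
        asyncio.create_task(speak_filler())
    return None  # Main inference starts immediately
```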
@millions-hair-24956 Create two VoicePipelineAgents in a room for a call, where one agent is responsible for answering the user's queries and another agent is responsible for providing the filler messages while the main agent is processing.
It seems likely that you can achieve this by creating two separate `VoicePipelineAgent` instances in the same room: one handles the main conversation, and the other provides filler messages. You can coordinate the two with event handlers so that the filler agent speaks while the main agent processes the user's queries. Conceptually:

1. Main agent: handles the user's queries and processes them with the LLM.
2. Filler agent: provides filler messages while the main agent is processing.

Here's a code snippet to illustrate this setup:

```python
import asyncio

from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero

# Simple helper that produces filler messages (a plain class; it is only
# called directly, so it does not need to implement the llm.LLM interface)
class FillerLLM:
    async def generate_filler(self) -> str:
        # Generate a simple filler message
        return "Please hold on while I process your request."

filler_llm = FillerLLM()

# Initialize the main agent
main_agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=openai.LLM(),
    tts=openai.TTS(),
    chat_ctx=llm.ChatContext().append(role="system", text="You are a helpful assistant.")
)

# Keep the filler agent from replying to user speech on its own (assuming
# v0.x semantics where returning False from before_llm_cb cancels the reply)
def suppress_reply(agent: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    return False

# Initialize the filler agent; it speaks only when we call say() on it
filler_agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=openai.LLM(),  # required by the constructor, not used for replies
    tts=openai.TTS(),
    chat_ctx=llm.ChatContext().append(role="system", text="I provide filler messages."),
    before_llm_cb=suppress_reply
)

# Define a function to handle the main agent's processing
async def handle_main_agent_processing():
    # Simulate processing delay
    await asyncio.sleep(2)  # Simulate processing time
    # Have the filler agent provide a message
    filler_message = await filler_llm.generate_filler()
    await filler_agent.say(filler_message, allow_interruptions=True)

# Start both agents in the same room (room and participant typically come
# from the JobContext; the top-level await below assumes this runs inside
# an async entrypoint)
main_agent.start(room, participant)
filler_agent.start(room, participant)

# Example usage: when the main agent starts processing, trigger the filler agent
await handle_main_agent_processing()
```
In this setup, the `FillerLLM` helper generates simple filler messages, and the `handle_main_agent_processing` function simulates the main agent's processing and triggers the filler agent during that window. Would you like more details on any specific part of this setup?

Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1730764131.357979
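Rather than simulating the processing window with `asyncio.sleep`, you can trigger the filler off the main agent's lifecycle events. A sketch, assuming the v0.x `VoicePipelineAgent` emits a `user_speech_committed` event once the user's turn is finalized (i.e. right before the main agent starts inference):

```python
import asyncio

# Speak a filler as soon as the user's turn is committed, while the main
# agent is still generating its answer
@main_agent.on("user_speech_committed")
def on_user_turn(msg: llm.ChatMessage):
    asyncio.create_task(
        filler_agent.say("One moment, let me check that.", allow_interruptions=True)
    )
```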
@millions-hair-24956 Provide complete code to handle this in the entrypoint.
To set up two `VoicePipelineAgent` instances in a room, where one agent handles user queries and the other provides filler messages, you can manage everything from the `entrypoint` function. Below is a complete example:

```python
import asyncio
from livekit.agents import llm, JobContext, WorkerOptions, cli
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero

# Simple helper that produces filler messages (a plain class; it is only
# called directly, so it does not need to implement the llm.LLM interface)
class FillerLLM:
    async def generate_filler(self) -> str:
        # Generate a simple filler message
        return "Please hold on while I process your request."

async def entrypoint(ctx: JobContext):
    await ctx.connect()

    # Wait for the first participant to connect
    participant = await ctx.wait_for_participant()

    # Initialize the main agent
    main_agent = VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=deepgram.STT(model="nova-2-general"),
        llm=openai.LLM(),
        tts=openai.TTS(),
        chat_ctx=llm.ChatContext().append(role="system", text="You are a helpful assistant.")
    )

    # Keep the filler agent from replying to user speech on its own
    # (assuming v0.x semantics where returning False from before_llm_cb
    # cancels the pending reply)
    def suppress_reply(agent: VoicePipelineAgent, chat_ctx: llm.ChatContext):
        return False

    # Initialize the filler agent; it speaks only when we call say() on it
    filler_agent = VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=deepgram.STT(model="nova-2-general"),
        llm=openai.LLM(),  # required by the constructor, not used for replies
        tts=openai.TTS(),
        chat_ctx=llm.ChatContext().append(role="system", text="I provide filler messages."),
        before_llm_cb=suppress_reply
    )

    # Start both agents in the same room
    main_agent.start(ctx.room, participant)
    filler_agent.start(ctx.room, participant)

    # Define a function to handle the main agent's processing
    filler_llm = FillerLLM()

    async def handle_main_agent_processing():
        # Simulate processing delay
        await asyncio.sleep(2)  # Simulate processing time
        # Have the filler agent provide a message
        filler_message = await filler_llm.generate_filler()
        await filler_agent.say(filler_message, allow_interruptions=True)

    # Example usage: When the main agent starts processing, trigger the filler agent
    await handle_main_agent_processing()

if __name__ == "__main__":
    cli.run_app(
        WorkerOptions(
            entrypoint_fnc=entrypoint
        )
    )
```
Explanation:
- Main agent: handles the user's queries using `openai.LLM`.
- Filler agent: uses the `FillerLLM` helper to generate filler messages and is prevented from auto-replying by its `before_llm_cb`.
- Entrypoint function: connects to the room, waits for a participant, and starts both agents.
- Processing simulation: `handle_main_agent_processing` simulates a delay and triggers the filler agent to provide a message.

This setup lets the filler agent speak while the main agent processes the user's queries. Adjust the `asyncio.sleep` duration to simulate different processing times; in production, trigger the filler from the main agent's events instead (see the event sketch above). Would you like more details on any specific part of this setup?

Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1733148578.269969
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1729785405.894609