Which is triggered first, the `on_user_turn_completed` node or the `llm_node`?
# ask-ai
a
Which is triggered first, the `on_user_turn_completed` node or the `llm_node`? Can I do RAG in both?
t
Thanks for reaching out, Anthony. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Hi there! To clarify: the `on_user_turn_completed` hook is triggered before the `llm_node`. That gives you a chance to modify the chat context or the user message before the LLM starts processing it. You can use Retrieval-Augmented Generation (RAG) in both of these nodes, depending on what fits your needs best:

- `on_user_turn_completed`: ideal for injecting context before the LLM runs. For example, you can pull in relevant information (e.g., through a vector search) and add it to the chat context early on, which is efficient and avoids delays from calling external tools mid-generation.
- `llm_node`: the place to centralize LLM-specific logic, such as working with custom LLM providers or reshaping the context in more advanced ways. You can also do RAG here by adjusting the chat context right before the LLM generates its response.

For concrete examples and more detailed guidance, check out the LiveKit RAG examples.

Sources: Pipeline nodes and hooks | LiveKit Docs | External data and RAG | LiveKit Docs | Agents v0.x migration guide - Node.js | LiveKit Docs | agents/src/voice/agent.ts | rag/rag_handler.py
d
I want you to apply these two functions in this code and tell me where I should put them:

```python
import logging
from dataclasses import dataclass
from typing import Optional

from dotenv import load_dotenv

from livekit import api
from livekit.agents import (
    Agent,
    AgentSession,
    ChatContext,
    JobContext,
    JobProcess,
    RoomInputOptions,
    RoomOutputOptions,
    RunContext,
    WorkerOptions,
    cli,
    metrics,
)
from livekit.agents.job import get_job_context
from livekit.agents.llm import function_tool
from livekit.agents.voice import MetricsCollectedEvent
from livekit.plugins import deepgram, openai, silero

# uncomment to enable Krisp BVC noise cancellation, currently supported on Linux and MacOS
# from livekit.plugins import noise_cancellation

## The storyteller agent is a multi-agent that can hand off the session to another agent.
## This example demonstrates more complex workflows with multiple agents.
## Each agent could have its own instructions, as well as different STT, LLM, TTS,
## or realtime models.

logger = logging.getLogger("multi-agent")

load_dotenv()

common_instructions = (
    "Your name is Echo. You are a story teller that interacts with the user via voice."
    "You are curious and friendly, with a sense of humor."
)


@dataclass
class StoryData:
    # Shared data that's used by the storyteller agent.
    # This structure is passed as a parameter to function calls.
    name: Optional[str] = None
    location: Optional[str] = None


class IntroAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions=f"{common_instructions} Your goal is to gather a few pieces of "
            "information from the user to make the story personalized and engaging."
            "You should ask the user for their name and where they are from."
            "Start the conversation with a short introduction.",
        )

    async def on_enter(self):
        # when the agent is added to the session, it'll generate a reply
        # according to its instructions
        self.session.generate_reply()

    @function_tool
    async def information_gathered(
        self,
        context: RunContext[StoryData],
        name: str,
        location: str,
    ):
        """Called when the user has provided the information needed to make the story
        personalized and engaging.

        Args:
            name: The name of the user
            location: The location of the user
        """
        context.userdata.name = name
        context.userdata.location = location

        story_agent = StoryAgent(name, location)
        # by default, StoryAgent will start with a new context; to carry through the current
        # chat history, pass in the chat_ctx
        # story_agent = StoryAgent(name, location, chat_ctx=context.chat_ctx)

        logger.info(
            "switching to the story agent with the provided user data: %s", context.userdata
        )
        return story_agent, "Let's start the story!"


class StoryAgent(Agent):
    def __init__(self, name: str, location: str, *, chat_ctx: Optional[ChatContext] = None) -> None:
        super().__init__(
            instructions=f"{common_instructions}. You should use the user's information in "
            "order to make the story personalized."
            "create the entire story, weaving in elements of their information, and make it "
            "interactive, occasionally interacting with the user."
            "do not end on a statement, where the user is not expected to respond."
            "when interrupted, ask if the user would like to continue or end."
            f"The user's name is {name}, from {location}.",
            # each agent could override any of the model services, including mixing
            # realtime and non-realtime models
            llm=openai.realtime.RealtimeModel(voice="echo"),
            tts=None,
            chat_ctx=chat_ctx,
        )

    async def on_enter(self):
        # when the agent is added to the session, we'll initiate the conversation by
        # using the LLM to generate a reply
        self.session.generate_reply()

    @function_tool
    async def story_finished(self, context: RunContext[StoryData]):
        """When you are finished telling the story (and the user confirms they don't
        want any more), call this function to end the conversation."""
        # interrupt any existing generation
        self.session.interrupt()

        # generate a goodbye message and hang up
        # awaiting it will ensure the message is played out before returning
        await self.session.generate_reply(
            instructions=f"say goodbye to {context.userdata.name}", allow_interruptions=False
        )

        job_ctx = get_job_context()
        await job_ctx.api.room.delete_room(api.DeleteRoomRequest(room=job_ctx.room.name))


def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()


async def entrypoint(ctx: JobContext):
    session = AgentSession[StoryData](
        vad=ctx.proc.userdata["vad"],
        # any combination of STT, LLM, TTS, or realtime API can be used
        llm=openai.LLM(model="gpt-4o-mini"),
        stt=deepgram.STT(model="nova-3"),
        tts=openai.TTS(voice="echo"),
        userdata=StoryData(),
    )

    # log metrics as they are emitted, and total usage after session is over
    usage_collector = metrics.UsageCollector()

    @session.on("metrics_collected")
    def _on_metrics_collected(ev: MetricsCollectedEvent):
        metrics.log_metrics(ev.metrics)
        usage_collector.collect(ev.metrics)

    async def log_usage():
        summary = usage_collector.get_summary()
        logger.info(f"Usage: {summary}")

    ctx.add_shutdown_callback(log_usage)

    await session.start(
        agent=IntroAgent(),
        room=ctx.room,
        room_input_options=RoomInputOptions(
            # uncomment to enable Krisp BVC noise cancellation
            # noise_cancellation=noise_cancellation.BVC(),
        ),
        room_output_options=RoomOutputOptions(transcription_enabled=True),
    )


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint, prewarm_fnc=prewarm))
```
t
To integrate the `on_user_turn_completed` and `llm_node` functions into your code, here's where to place them:

- `on_user_turn_completed`: add it as an async method on either the `IntroAgent` or `StoryAgent` class, depending on where you want to tweak the chat context or user message before the LLM processes it. Defining it on `IntroAgent`, for instance, is a good way to handle pre-LLM context injection.
- `llm_node`: this fits well in the `StoryAgent` class, especially if you want to centralize all LLM-related logic, like custom LLM providers or advanced context restructuring. It works as an async method that adjusts the chat context before the LLM generates its response.

A placement sketch follows below; I'm happy to tailor it further if you'd like.
d
Can I define the two functions outside of the two classes and then call them when the user asks a question?
t
Absolutely, you can define the `on_user_turn_completed` and `llm_node` logic outside the classes and call it when needed. A couple of things to keep in mind:

- `on_user_turn_completed`: this hook is normally tied to the lifecycle of an agent (like `IntroAgent` or `StoryAgent`). If you define the logic externally, make sure you pass the necessary context (such as `ChatContext`, `RunContext`, or `new_message`) explicitly when you call it.
- `llm_node`: this one is designed to work within the LLM processing flow. If it's defined outside, ensure it has access to the required parameters like `chat_ctx`, `tools`, and `model_settings`.

A practical pattern is to define these as standalone async functions and invoke them from the corresponding agent methods, as in the sketch below. Want me to walk through it in more detail?
d
I want the best approach.
t
That's a great question! The best approach depends on what you're aiming for, but in general, defining the `on_user_turn_completed` and `llm_node` functions as methods on the relevant agent classes (like `IntroAgent` or `StoryAgent`) works well. Here's why:

- Encapsulation: inside the class, these functions have direct access to the agent's context, state, and lifecycle, so you don't have to pass extra parameters around.
- Readability: the logic stays close to the agent's behavior, which keeps the code clearer and easier to maintain.
- Flexibility: you can override or customize these hooks for specific agents without impacting others.

That said, if you need to reuse the same logic across several agents, defining it externally as utility functions and calling them from within your agent methods keeps things reusable and maintainable (see the snippet below). I'm happy to walk through detailed examples of both approaches.
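For instance, reusing the `add_rag_context` helper from the earlier sketch, both agents could delegate to the same function (still an illustrative assumption, not the only way to structure it):

```python
class IntroAgent(Agent):
    async def on_user_turn_completed(self, turn_ctx: ChatContext, new_message: ChatMessage) -> None:
        await add_rag_context(turn_ctx, new_message)  # shared helper


class StoryAgent(Agent):
    async def on_user_turn_completed(self, turn_ctx: ChatContext, new_message: ChatMessage) -> None:
        await add_rag_context(turn_ctx, new_message)  # same helper, reused across agents
```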
d
So the best way is to define these functions inside each agent class, is that what you mean?
t
Yes, that's right. The recommended approach is to define the `on_user_turn_completed` and `llm_node` functions inside the relevant agent classes, like `IntroAgent` or `StoryAgent`. That gives you a few benefits:

- Direct access: the hooks can use the agent's context, state, and lifecycle without extra parameters.
- Encapsulation: keeping the logic within the specific agent keeps your code modular and easier to maintain over time.
- Customization: you can tailor how these hooks behave for each agent to fit your needs.

If you'd like, I can walk through a full example of defining and using these functions inside the agent classes. Just let me know!