# ask-ai
describe the `Agent` class and how it interacts with `AgentSession`
Thanks for reaching out, Jeff. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
The `Agent` class and `AgentSession` work together to create a complete voice agent system in LiveKit, with distinct roles:

**Agent Class:**

• The `Agent` class defines the behavior and capabilities of an agent. It is initialized with instructions and can optionally specify its own AI components. For example:
```python
class MyAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="Your name is Kelly. You would interact with users via voice. Keep your responses concise and to the point."
        )
```
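The answer notes that an agent can optionally specify its own AI components. As a hedged sketch of what that could look like: the `llm`/`tts` keyword overrides below are an assumption based on recent livekit-agents releases, not something stated above.

```python
# Sketch, not from the answer above: an Agent that carries its own LLM and TTS,
# overriding whatever the surrounding AgentSession was configured with.
class SupportAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You handle billing questions. Keep answers short.",
            llm=openai.LLM(model="gpt-4o-mini"),  # assumed per-agent override
            tts=openai.TTS(voice="ash"),          # assumed per-agent override
        )
```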
• It includes lifecycle hooks like `on_enter()`, which are triggered when the agent becomes active:
```python
async def on_enter(self):
    self.session.generate_reply()
```
• Agents can define function tools using the `@function_tool` decorator, allowing the LLM to call specific functions:
```python
@function_tool
async def lookup_weather(self, context: RunContext, location: str, latitude: str, longitude: str):
    """Called when the user asks for weather-related information."""
    return "Sunny with a temperature of 70 degrees."
```
**AgentSession Class:**

• The `AgentSession` orchestrates the entire voice agent system, linking audio, video, and text I/O with components like STT, VAD, TTS, and LLM. It handles turn detection, endpointing, interruptions, and multi-step tool calls.
• It is initialized with various AI components and configuration options:
```python
session = AgentSession(
    vad=ctx.proc.userdata["vad"],
    llm=openai.LLM(model="gpt-4o-mini"),
    stt=deepgram.STT(model="nova-3", language="multi"),
    tts=openai.TTS(voice="ash"),
    turn_detection=MultilingualModel(),
)
```
• Key parameters include `turn_detection`, `allow_interruptions`, and `max_tool_steps`.
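For illustration only, those parameters would sit alongside the AI components in the constructor; the values below are placeholder assumptions, not recommendations:

```python
# Sketch (assumption): the same session with the tuning parameters made explicit.
session = AgentSession(
    vad=ctx.proc.userdata["vad"],
    llm=openai.LLM(model="gpt-4o-mini"),
    stt=deepgram.STT(model="nova-3", language="multi"),
    tts=openai.TTS(voice="ash"),
    turn_detection=MultilingualModel(),
    allow_interruptions=True,  # users may barge in while the agent is speaking
    max_tool_steps=3,          # cap chained tool calls in a single LLM turn
)
```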
**Interaction Between Agent and AgentSession:**

• The session is started with an agent instance:
```python
await session.start(
    agent=MyAgent(),
    room=ctx.room,
)
```
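For context (not part of the answer above), `ctx` here is typically the job context passed to the worker's entrypoint. A rough end-to-end sketch follows; the import paths and worker wiring are assumed from common livekit-agents examples, so verify them against your installed version:

```python
# Sketch under assumptions: import paths and worker wiring are not taken from
# the answer above. MyAgent is the class defined earlier in this answer.
from livekit.agents import AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import deepgram, openai, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel


async def entrypoint(ctx: JobContext):
    # Build the session from its AI components (simplified: VAD loaded inline
    # rather than prewarmed into ctx.proc.userdata as in the snippet above).
    session = AgentSession(
        vad=silero.VAD.load(),
        stt=deepgram.STT(model="nova-3", language="multi"),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=openai.TTS(voice="ash"),
        turn_detection=MultilingualModel(),
    )
    await session.start(agent=MyAgent(), room=ctx.room)
    await ctx.connect()  # join the LiveKit room so audio starts flowing


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```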
• The agent accesses the session through `self.session`, enabling it to control session behavior:
```python
async def on_enter(self):
    self.session.generate_reply()
```
• Agents can hand off control to other agents by returning new instances from function tools:
```python
@function_tool
async def talk_to_echo(self, ctx: RunContext):
    """Called when the user wants to speak with Echo"""
    return EchoAgent()
```
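`EchoAgent` is not defined in the answer; a minimal sketch of what such a hand-off target might look like, assuming it is simply another `Agent` subclass:

```python
# Sketch (assumption): a minimal agent to hand off to; the instructions text is illustrative.
class EchoAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="Your name is Echo. Greet the user and continue the conversation."
        )

    async def on_enter(self):
        # Speak first when this agent takes over the session.
        self.session.generate_reply()
```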
This architecture separates concerns effectively: `AgentSession` handles real-time media processing and AI component orchestration, while `Agent` classes focus on defining conversational behavior, instructions, and tools.

Sources: tests/test_evals.py