describe the `Agent` class and how it interacts with `AgentS LiveKit Community #ask-ai

refined-van-59846

07/26/2025, 3:04 PM

Describe the `Agent` class and how it interacts with `AgentSession`.

tall-airline-16756

07/26/2025, 3:04 PM

Thanks for reaching out, Jeff. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.

tall-airline-16756

07/26/2025, 3:05 PM

The `Agent` class and `AgentSession` work together to create a complete voice agent system in LiveKit, with distinct roles:

Agent Class:
• The `Agent` class defines the behavior and capabilities of an agent. It is initialized with instructions and can optionally specify its own AI components. For example:

from livekit.agents import Agent

class MyAgent(Agent):
    def __init__(self):
        # The instructions act as the agent's system prompt.
        super().__init__(
            instructions="Your name is Kelly. You would interact with users via voice. Keep your responses concise and to the point."
        )

• It includes lifecycle hooks like `on_enter()`, which are triggered when the agent becomes active:

async def on_enter(self):
    self.session.generate_reply()

• Agents can define function tools using the `@function_tool` decorator, allowing the LLM to call specific functions:

@function_tool
async def lookup_weather(self, context: RunContext, location: str, latitude: str, longitude: str):
    """Called when the user asks for weather-related information."""
    return "Sunny with a temperature of 70 degrees."
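
To make the function-tool idea above more concrete, here is a minimal sketch of how a decorator could mark methods so that their names, docstrings, and parameters can be described to an LLM. It is not LiveKit's implementation: `toy_function_tool`, `describe_tools`, and `WeatherAgent` are hypothetical names made up for illustration, and the real `@function_tool` may work quite differently.

```python
import inspect


def toy_function_tool(fn):
    """Mark a coroutine method as LLM-callable (toy stand-in, not LiveKit's decorator)."""
    fn._is_toy_tool = True
    return fn


def describe_tools(agent):
    """Collect the docstring and LLM-visible parameters of each marked method."""
    tools = {}
    for name, method in inspect.getmembers(agent, predicate=inspect.iscoroutinefunction):
        if not getattr(method, "_is_toy_tool", False):
            continue  # undecorated methods are not exposed
        params = [
            p for p in inspect.signature(method).parameters
            if p != "context"  # assumed to be injected at call time, not supplied by the LLM
        ]
        tools[name] = {"description": inspect.getdoc(method), "parameters": params}
    return tools


class WeatherAgent:
    @toy_function_tool
    async def lookup_weather(self, context, location: str):
        """Called when the user asks for weather-related information."""
        return "Sunny with a temperature of 70 degrees."

    async def helper(self):
        """Not decorated, so it is not exposed as a tool."""
        return None
```

Calling `describe_tools(WeatherAgent())` returns a single entry for `lookup_weather`, whose description is the method's docstring. This suggests why the docstring in the example above is phrased as guidance on when the tool should be called.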

AgentSession Class:
• The `AgentSession` orchestrates the entire voice agent system, linking audio, video, and text I/O with components like STT, VAD, TTS, and LLM. It handles turn detection, endpointing, interruptions, and multi-step tool calls.
• It is initialized with various AI components and configuration options:

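# Import paths as used in LiveKit's Python agent examples.
from livekit.agents import AgentSession
from livekit.plugins import deepgram, openai
from livekit.plugins.turn_detector.multilingual import MultilingualModel

session = AgentSession(
    vad=ctx.proc.userdata["vad"],  # VAD instance stored on the job process beforehand
    llm=openai.LLM(model="gpt-4o-mini"),
    stt=deepgram.STT(model="nova-3", language="multi"),
    tts=openai.TTS(voice="ash"),
    turn_detection=MultilingualModel(),
)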

• Key parameters include `turn_detection`, `allow_interruptions`, and `max_tool_steps`.

Interaction Between Agent and AgentSession:
• The session is started with an agent instance:

await session.start(
    agent=MyAgent(),
    room=ctx.room,
)

• The agent accesses the session through `self.session`, enabling it to control session behavior:

async def on_enter(self):
    self.session.generate_reply()

• Agents can hand off control to other agents by returning new instances from function tools:

@function_tool
async def talk_to_echo(self, ctx: RunContext):
    """Called when the user wants to speak with Echo"""
    return EchoAgent()
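
The interaction described above (a session activating an agent, the agent reaching back through `self.session`, and a tool returning a new agent to trigger a handoff) can be sketched with plain-Python stand-ins. This is a toy model under stated assumptions, not LiveKit code: `ToyAgent`, `ToySession`, `GreeterAgent`, `call_tool`, and the `log` list are all hypothetical, and the real `AgentSession` does far more (media I/O, STT, LLM, TTS).

```python
import asyncio


class ToyAgent:
    """Stand-in for an agent: holds instructions and a back-reference to its session."""

    def __init__(self, instructions):
        self.instructions = instructions
        self.session = None  # assigned by the session when the agent becomes active

    async def on_enter(self):
        # Lifecycle hook: record that this agent is now the active one.
        self.session.log.append(f"enter:{type(self).__name__}")


class EchoAgent(ToyAgent):
    def __init__(self):
        super().__init__("Repeat what the user says.")


class GreeterAgent(ToyAgent):
    def __init__(self):
        super().__init__("Greet the user.")

    async def talk_to_echo(self):
        """Tool: returning an agent instance signals a handoff."""
        return EchoAgent()


class ToySession:
    """Stand-in for a session: tracks the active agent and performs handoffs."""

    def __init__(self):
        self.agent = None
        self.log = []

    async def _activate(self, agent):
        agent.session = self
        self.agent = agent
        await agent.on_enter()

    async def start(self, agent):
        await self._activate(agent)

    async def call_tool(self, name):
        result = await getattr(self.agent, name)()
        if isinstance(result, ToyAgent):
            # A returned agent replaces the currently active one.
            await self._activate(result)
        return result


async def demo():
    session = ToySession()
    await session.start(GreeterAgent())
    await session.call_tool("talk_to_echo")
    return session


if __name__ == "__main__":
    print(asyncio.run(demo()).log)
```

In this sketch the session owns the swap: the tool only returns the next agent, and the session decides when to attach it and fire `on_enter()`. After `demo()` runs, the log records `GreeterAgent` entering first and `EchoAgent` second.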

This architecture separates concerns effectively: `AgentSession` handles real-time media processing and AI component orchestration, while `Agent` classes focus on defining conversational behavior, instructions, and tools.

Sources: tests/test_evals.py
