rhythmic-printer-24192
08/19/2025, 5:30 PM
tall-airline-16756
08/19/2025, 5:31 PM
You can use the before_llm_cb callback in the VoicePipelineAgent. This lets you tweak the ChatContext just before it gets sent to the LLM.
For instance, if you want to keep things simple, you can truncate the context to only include the last 15 messages like this:
async def truncate_context(assistant: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    if len(chat_ctx.messages) > 15:
        chat_ctx.messages = chat_ctx.messages[-15:]

agent = VoicePipelineAgent(
    ...
    before_llm_cb=truncate_context,
)
There are other cool ways to manage this too, like cutting off based on token count, filtering by content type, or keeping a sliding window of the most recent conversation turns. If you want a hand setting any of these up, just give me a shout—I’m here to help!
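The token-count idea can be sketched without the SDK. In this stand-in, the plain-dict message shape and the rough 4-characters-per-token estimate are illustrative assumptions, not the LiveKit API:

```python
def truncate_by_tokens(messages, max_tokens=2000, chars_per_token=4):
    """Keep the newest messages whose rough token total fits max_tokens."""
    kept, total = [], 0
    for msg in reversed(messages):  # walk from newest to oldest
        estimate = len(msg["content"]) // chars_per_token + 1  # crude size guess
        if total + estimate > max_tokens:
            break
        total += estimate
        kept.append(msg)
    return list(reversed(kept))  # restore chronological order
```

A function like this could be called from before_llm_cb right before the context is handed to the LLM; for production use, a real tokenizer (e.g. tiktoken) would give much tighter estimates.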
Sources: Vision Agent Quickstart | LiveKit Docs | VoicePipelineAgent | LiveKit Docs
rhythmic-printer-24192
08/19/2025, 5:34 PM
tall-airline-16756
08/19/2025, 5:35 PM
With the Agent class, you've got some great options to manage the context window using its built-in context management features. Here are a few friendly approaches you can try:
1. Set Initial Context: You can pass a limited ChatContext to your agent when you initialize it, which helps control the size of the initial context.
2. Dynamic Context Management: Feel free to use the chat_ctx property to access and update the context on the fly while your agent is running.
3. Override the LLM Node: If you want to get a bit more hands-on, you can customize how the context window works by overriding the llm_node method. For example, applying a sliding window to keep only the last N messages can be really handy:
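As a minimal sketch of that sliding window, with a plain Python list standing in for the chat context (the fuller Agent subclass appears later in this thread):

```python
def sliding_window(messages, max_messages=15):
    """Keep only the most recent max_messages entries."""
    if len(messages) <= max_messages:
        return messages
    return messages[-max_messages:]
```
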
These options give you some flexible ways to manage your context window effectively. If you want to dive deeper or need a hand with any of this, just let me know — I’m here to help!
Sources: Agents 0.x migration guide | LiveKit Docs | Workflows | LiveKit Docs | livekit-agents/livekit/agents/voice/agent.py
rhythmic-printer-24192
08/19/2025, 5:36 PM
tall-airline-16756
08/19/2025, 5:36 PM
1. Set Initial Context: Pass a limited ChatContext when initializing to keep the initial context size in check.
2. Dynamic Context Management: You can access and update the context on the fly during runtime.
3. Override the LLM Node: Another option is to customize the context window logic by overriding the llm_node method.
Best Practices:
• Try using a sliding window approach to keep the conversation flowing smoothly while managing token usage.
• If your LLM has strict token limits, consider truncating context based on token count.
• Filter out unnecessary content types like images or metadata to keep the context lean.
• Test your setup with real-world examples to make sure your assistant stays helpful and responsive.
If you want me to explain anything further or need more examples, just let me know—I'm here to help!
rhythmic-printer-24192
08/19/2025, 5:38 PM
tall-airline-16756
08/19/2025, 5:38 PM
1. Set Initial Context: Start with a limited ChatContext during initialization to keep the initial context size in check.
2. Dynamic Context Management: You can also access and update the context dynamically as your agent runs.
3. Override the LLM Node: For even more control, try overriding the llm_node method to implement your own context window logic.
Best Practices:
• Using a sliding window approach helps keep the conversation flowing smoothly while managing token usage.
• If your LLM has strict token limits, try truncating context based on token count.
• Filtering out unnecessary content types, like images or metadata, can really help keep the context size optimized.
• Don't forget to test your setup with real-world scenarios to make sure your assistant stays effective and responsive.
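The content-type filter from the best practices above can be sketched the same way; the item dicts and the "image"/"audio" type tags here are illustrative stand-ins, not the actual ChatItem shape:

```python
def drop_heavy_items(items, excluded_types=("image", "audio")):
    """Remove bulky non-text items so the context stays lean."""
    return [item for item in items if item.get("type") not in excluded_types]
```
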
If you want me to walk you through any part of this or need more examples, just let me know—I'm happy to help!
rhythmic-printer-24192
08/19/2025, 5:39 PM
tall-airline-16756
08/19/2025, 5:39 PM
from livekit.agents import Agent, ChatContext
class SlidingWindowAgent(Agent):
    def __init__(self, max_messages=10):
        self.max_messages = max_messages
        super().__init__(
            instructions="You are a helpful assistant.",
            # other parameters...
        )

    def llm_node(self, chat_ctx, tools, model_settings):
        # Apply sliding window logic to limit context size
        chat_ctx.messages = chat_ctx.messages[-self.max_messages:]  # Keep only the last N messages
        return Agent.default.llm_node(self, chat_ctx, tools, model_settings)

    def _apply_sliding_window(self, chat_ctx):
        """Helper method to apply sliding window logic."""
        if len(chat_ctx.messages) > self.max_messages:
            chat_ctx.messages = chat_ctx.messages[-self.max_messages:]
        return chat_ctx
How It Works:
• The max_messages parameter sets how many recent messages you want to keep in the context.
• The llm_node method applies the sliding window logic right before sending the context to the LLM.
• The _apply_sliding_window helper method makes sure the context stays trimmed to the right size.
This setup helps keep your conversations smooth and on point while making the best use of tokens. If you want me to walk you through it or help with anything else, just let me know!
rhythmic-printer-24192
08/19/2025, 5:47 PM
tall-airline-16756
08/19/2025, 5:48 PM
The ChatContext object doesn't have a messages attribute. Instead, it uses an items property, which is a list of ChatItem objects. These can be ChatMessage, FunctionCall, or FunctionCallOutput types.
If you want to use a sliding window approach, you can take advantage of the truncate method that ChatContext provides. Here's an updated example that might help:
from livekit.agents import Agent, ChatContext

class SlidingWindowAgent(Agent):
    def __init__(self, max_items=10):
        self.max_items = max_items
        super().__init__(
            instructions="You are a helpful assistant.",
            # other parameters...
        )

    def llm_node(self, chat_ctx, tools, model_settings):
        # Apply sliding window logic to limit context size
        chat_ctx.truncate(max_items=self.max_items)  # Keep only the last N items
        return Agent.default.llm_node(self, chat_ctx, tools, model_settings)
Just a couple of things to keep in mind:
• The truncate method is made exactly for limiting how many items stay in the context.
• So instead of using messages, go with items or just use truncate to manage your context smoothly.
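Conceptually, truncation just keeps the tail of the items list, whatever the item types are. This stand-in shows the idea on plain dicts; the real ChatContext.truncate is the one to use in practice:

```python
# Illustrative stand-ins for ChatItem entries: messages plus a function call/output pair.
history = [
    {"type": "message", "role": "user", "content": "What's the weather?"},
    {"type": "function_call", "name": "get_weather"},
    {"type": "function_call_output", "output": "sunny"},
    {"type": "message", "role": "assistant", "content": "It's sunny."},
    {"type": "message", "role": "user", "content": "Thanks!"},
]

def truncate_items(items, max_items):
    """Keep the last max_items entries, whatever their type."""
    return items[-max_items:]
```
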
If anything’s unclear or you want me to walk you through more examples, just let me know—I’m here to help!
Sources: Voice pipeline nodes | LiveKit Docs | agent/main.py | ChatContext | LiveKit Agents | livekit-agents/livekit/agents/llm/chat_context.py | agents/src/llm/chat_context.ts
rhythmic-printer-24192
08/19/2025, 5:55 PM
tall-airline-16756
08/19/2025, 5:56 PM
If you want to customize the Agent class, a great place to start is by checking out the LiveKit Agents source code. Here are some important methods you might want to customize:
• Lifecycle Methods:
◦ `on_enter()`: This one’s called whenever the task starts.
◦ `on_exit()`: Gets triggered when the task finishes.
◦ `on_user_turn_completed(turn_ctx, new_message)`: Happens right after the user finishes speaking and before the LLM responds.
• Pipeline Node Methods:
◦ `stt_node(audio, model_settings)`: Turns audio frames into speech events.
◦ `llm_node(chat_ctx, tools, model_settings)`: Manages text generation using an LLM.
◦ `transcription_node(text, model_settings)`: Wraps up transcription from text segments.
◦ `tts_node(text, model_settings)`: Converts text into audio.
◦ `realtime_audio_output_node(audio, model_settings)`: Handles audio from the realtime LLM session before it plays back.
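To show the override pattern itself, here's a tiny sketch with a stub standing in for the real Agent base class (whose hooks are async and tied to a running session):

```python
class StubAgent:
    """Minimal stand-in for livekit.agents.Agent, just to show hook shapes."""
    def on_enter(self):
        pass

    def on_exit(self):
        pass

class LoggingAgent(StubAgent):
    def __init__(self):
        self.events = []

    def on_enter(self):
        # Called when the task starts: greet the user or prime state here.
        self.events.append("enter")

    def on_exit(self):
        # Called when the task finishes: clean up or hand off here.
        self.events.append("exit")

agent = LoggingAgent()
agent.on_enter()
agent.on_exit()
```
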
If you want to dive deeper or see some examples, the LiveKit Agents source code is super helpful and you can find it here:
https://github.com/livekit/agents
Feel free to reach out if you want a hand with any specific method — I’m here to help!
Sources: livekit-agents/livekit/agents/__init__.py