calm-book-72252
07/12/2025, 5:44 AM

tall-airline-16756
07/12/2025, 5:44 AM

tall-airline-16756
07/12/2025, 5:46 AM

calm-book-72252
07/12/2025, 5:50 AM

tall-airline-16756
07/12/2025, 5:51 AM
You can add the image to the agent's initial ChatContext using add_message, and then get the session rolling with this context:
from livekit.agents import Agent, AgentSession, JobContext
from livekit.agents.llm import ChatContext, ImageContent


async def entrypoint(ctx: JobContext):
    # Seed the conversation with a user message carrying both text and an image
    initial_ctx = ChatContext()
    initial_ctx.add_message(
        role="user",
        content=[
            "Here is a picture of me",
            ImageContent(image="https://example.com/image.jpg"),
        ],
    )

    session = AgentSession()  # configure with your STT/LLM/TTS plugins as usual
    await session.start(
        room=ctx.room,
        agent=Agent(instructions="You are a helpful assistant.", chat_ctx=initial_ctx),
    )
Pro tip: If you've got a local file, you can totally convert it to a base64 data URL instead of using a web link.
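For example, here's a minimal sketch of that local-file route. The helper name local_image_to_data_url and the file path are just placeholders of mine, not part of the LiveKit API:

import base64

from livekit.agents.llm import ChatContext, ImageContent


def local_image_to_data_url(path: str, mime_type: str = "image/jpeg") -> str:
    # Read the local file and wrap it in a base64 data URL the LLM can consume
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


initial_ctx = ChatContext()
initial_ctx.add_message(
    role="user",
    content=[
        "Here is a picture of me",
        ImageContent(image=local_image_to_data_url("/path/to/me.jpg")),
    ],
)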
• Provider compatibility: Just a friendly reminder to double-check that your LLM provider is cool with the type of file you're sending (like images).
Oh, and if you're expecting files to pop up after the agent joins, no stress! You can also set up a byte stream handler to process files as they come into the room.
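Here's a rough sketch of what that handler could look like, based on the data streams API. The "files" topic name is just an example I picked, and it's worth double-checking the handler and reader signatures against the current docs for your SDK version:

import asyncio

from livekit.agents import JobContext


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    async def read_file(reader, participant_identity: str):
        # Accumulate the streamed chunks, then hand the bytes off to your own processing
        file_bytes = b""
        async for chunk in reader:
            file_bytes += chunk
        print(f"received {len(file_bytes)} bytes from {participant_identity}")

    def on_byte_stream(reader, participant_identity: str):
        # The handler should return quickly, so do the actual reading in a task
        asyncio.create_task(read_file(reader, participant_identity))

    ctx.room.register_byte_stream_handler("files", on_byte_stream)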
Sources: Vision | LiveKit Docs | examples/voice_agents/multi_agent.py | examples/data-streams/index.ts

calm-book-72252
07/12/2025, 5:52 AM

tall-airline-16756
07/12/2025, 5:53 AM
You can use the sendFile method right in the frontend to upload your PDF when the session starts. On the agent side, you'll just need to set up a byte stream handler to receive and process the PDF data.
• PDF processing: Once you've got the PDF, you've got a couple of cool options:
◦ Convert to text (which works great for most LLMs): Just use a PDF parsing library like PDF.js to extract the text, and then add that text to the LLM's context.
◦ Convert to images (perfect for vision-capable LLMs): Simply convert each PDF page to an image (like PNG), and then add those images as ImageContent to the LLM's context. A quick sketch of both options follows right after this list.
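Here's a rough Python sketch of both options on the agent side. pypdf and pdf2image are libraries I'm assuming for the parsing and rendering steps (the thread itself only names PDF.js, which is a frontend library), and the helper names are my own:

import base64
import io

from pdf2image import convert_from_bytes  # assumed; needs poppler installed
from pypdf import PdfReader  # assumed PDF parsing library

from livekit.agents.llm import ChatContext, ImageContent


def pdf_to_text(pdf_bytes: bytes) -> str:
    # Option 1: extract plain text for standard (text-only) LLMs
    reader = PdfReader(io.BytesIO(pdf_bytes))
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def pdf_to_image_content(pdf_bytes: bytes) -> list[ImageContent]:
    # Option 2: render each page to a PNG data URL for vision-capable LLMs
    pages = []
    for page in convert_from_bytes(pdf_bytes):
        buf = io.BytesIO()
        page.save(buf, format="PNG")
        encoded = base64.b64encode(buf.getvalue()).decode("utf-8")
        pages.append(ImageContent(image=f"data:image/png;base64,{encoded}"))
    return pages


def add_pdf_to_context(chat_ctx: ChatContext, pdf_bytes: bytes, use_vision: bool) -> None:
    # Pick whichever representation your LLM provider supports
    if use_vision:
        chat_ctx.add_message(role="user", content=["Here is the PDF:", *pdf_to_image_content(pdf_bytes)])
    else:
        chat_ctx.add_message(role="user", content=[pdf_to_text(pdf_bytes)])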
Pro tip: Text conversion usually works best for standard LLMs, while vision LLMs are awesome at processing images. Just keep in mind there's a 500MB file size limit for uploads, and it's always smart to double-check your LLM provider's documentation for their specific input requirements.
Need a specific code example? Just let me know what workflow or LLM provider you're working with, and I'll be happy to help!
Sources: Sending files & bytes | LiveKit Docs | Vision | LiveKit Docs | tools/server/webui/src/components/useChatExtraContext.tsx | livekit-agents/livekit/agents/llm/chat_context.py