# ask-ai
c
How do workers work with self-hosted agents? Does one worker/instance only handle one call at a time?
t
Thanks for reaching out, Anton. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Here is how workers operate in LiveKit when you self-host your agents. Workers register with the LiveKit server and wait for job assignments. When a room is created, the server offers the job to an available worker, which can accept or decline it. On acceptance, the worker joins the agent to the room as a participant, where it can handle tasks like subscribing to tracks.

A single worker instance is not limited to one call at a time. It can run multiple agent instances concurrently, with each agent in its own separate process. This makes the most of your machine's resources and allows capacity to scale with it, and because each job is isolated in its own process, a crash in one agent does not affect the others on the same worker.

The LiveKit server assigns each job to exactly one worker, distributing jobs round-robin across available workers. Workers report their availability and current load to the server, and can pause accepting new jobs when at capacity to avoid performance degradation.

Sources: agents/src/worker.ts | Anatomy of an Agent | LiveKit Docs | Job lifecycle | LiveKit Docs | Deploying to custom environments | LiveKit Docs
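The dispatch model described above can be sketched in plain Python. This is an illustrative model, not the LiveKit API: the `Worker` and `Server` classes and their method names are invented here to show round-robin assignment with per-worker capacity limits.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Worker:
    """Models a worker that reports its load and pauses intake when full."""
    name: str
    max_jobs: int
    active_jobs: list = field(default_factory=list)

    @property
    def accepting(self) -> bool:
        # A worker at capacity stops accepting new jobs.
        return len(self.active_jobs) < self.max_jobs

    def accept(self, job: str) -> None:
        self.active_jobs.append(job)

class Server:
    """Round-robin job distribution: each job is assigned to exactly one worker."""
    def __init__(self, workers):
        self.workers = deque(workers)

    def dispatch(self, job: str):
        # Rotate through workers, skipping any that have paused intake.
        for _ in range(len(self.workers)):
            worker = self.workers[0]
            self.workers.rotate(-1)
            if worker.accepting:
                worker.accept(job)
                return worker
        return None  # every worker is at capacity

workers = [Worker("w1", max_jobs=2), Worker("w2", max_jobs=2)]
server = Server(workers)
assignments = [server.dispatch(f"room-{i}") for i in range(5)]
print([w.name if w else None for w in assignments])
# → ['w1', 'w2', 'w1', 'w2', None]
```

The fifth job returns `None` because both workers are at capacity, which mirrors how a loaded worker pauses job acceptance rather than oversubscribing itself.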
c
What about cloud-hosted agents?
t
Cloud-hosted agents follow the same worker model. Workers register with the LiveKit server and wait for jobs; when a room is created, the server offers the job to an available worker, which can accept or decline it. On acceptance, the worker spawns the agent as a participant that can interact with tracks and other participants.

As with self-hosted deployments, a single worker is not limited to one call at a time. It can manage multiple agent instances concurrently, each in its own subprocess, so a failure in one instance does not affect the others.

LiveKit Cloud adds geographic affinity: jobs are routed to the workers closest to the end user to keep latency low. The platform also handles scaling, load balancing, and resource allocation automatically, so agent capacity grows and shrinks with demand. When a worker shuts down, it drains gracefully: it stops accepting new jobs but continues handling its ongoing sessions until they finish, so in-progress calls are not interrupted.

Sources: Deployment and scaling | LiveKit Docs | Deploy and scale agents on LiveKit Cloud | Anatomy of an Agent | LiveKit Docs | README.md | Worker lifecycle | LiveKit Docs
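The graceful-drain behavior can also be modeled in a few lines. Again this is an illustrative sketch, not the LiveKit worker API: the class and method names here are hypothetical, chosen only to show the state transitions (declining new jobs while letting active sessions run to completion).

```python
class DrainingWorker:
    """Illustrative model of graceful shutdown: stop taking new jobs,
    keep serving active sessions until they complete."""

    def __init__(self):
        self.draining = False
        self.active_sessions = set()

    def offer(self, session_id: str) -> bool:
        # A draining worker declines any newly offered job.
        if self.draining:
            return False
        self.active_sessions.add(session_id)
        return True

    def begin_shutdown(self) -> None:
        self.draining = True  # no new jobs from this point on

    def session_finished(self, session_id: str) -> None:
        self.active_sessions.discard(session_id)

    @property
    def can_exit(self) -> bool:
        # The process exits only once every ongoing session has ended.
        return self.draining and not self.active_sessions

w = DrainingWorker()
w.offer("call-1")
w.offer("call-2")
w.begin_shutdown()
print(w.offer("call-3"))  # → False: new job declined while draining
print(w.can_exit)         # → False: still serving call-1 and call-2
w.session_finished("call-1")
w.session_finished("call-2")
print(w.can_exit)         # → True: safe to exit now
```

The key design point is that shutdown is a two-phase state, not an immediate exit: the worker first becomes invisible to the dispatcher, then terminates only after its session count reaches zero.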