many-machine-74335 (02/15/2025, 1:54 PM)
gray-monkey-83525 (02/15/2025, 1:54 PM)
many-machine-74335 (02/15/2025, 1:59 PM)
gray-monkey-83525 (02/15/2025, 1:59 PM)
1. Define Worker Options: Use the WorkerOptions class to configure your worker. This includes setting the entrypoint_fnc, which is the function called when a job is assigned to the worker. You can also define other options such as request_fnc for handling job requests, prewarm_fnc for initialization, and load_fnc to report system load.
from livekit.agents import WorkerOptions, WorkerType, cli

opts = WorkerOptions(
    entrypoint_fnc=entrypoint_function,  # called when a job is assigned to this worker
    request_fnc=request_function,        # decide whether to accept an incoming job
    prewarm_fnc=prewarm_function,        # one-time initialization per worker process
    load_fnc=load_function,              # report current system load (0.0 to 1.0)
    load_threshold=0.75,                 # stop accepting new jobs above this load
    worker_type=WorkerType.ROOM,         # one agent instance per room
)
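The entrypoint_function, request_function, prewarm_function, and load_function above are user-defined callbacks. As a rough sketch, a prewarm_fnc typically loads heavyweight resources once per worker process; the silero VAD below is only an illustration and assumes the livekit-plugins-silero package is installed:
from livekit.agents import JobProcess
from livekit.plugins import silero  # assumption: livekit-plugins-silero is installed

def prewarm_function(proc: JobProcess):
    # Runs once when the worker process starts, before any job is assigned,
    # so per-job startup stays fast; store shared resources in proc.userdata.
    proc.userdata["vad"] = silero.VAD.load()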
2. Start the Worker: Use the cli.run_app function to start the worker with the defined options. This will register the worker with the LiveKit server and make it available to handle jobs.
if __name__ == "__main__":
    cli.run_app(opts)
3. Handle Job Assignments: The entrypoint_fnc is where you define the logic for what the agent should do when it joins a room. This function will be called with a JobContext object that provides access to the room and participant information.
from livekit.agents import JobContext

async def entrypoint_function(ctx: JobContext):
    # Connect to the room assigned by the job
    await ctx.connect()
    # Your logic here
4. Manage Worker Load: The load_fnc can be used to report the current system load, which helps in load balancing. The load_threshold determines when the worker should stop accepting new jobs.
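For example, a minimal load_fnc sketch that reports CPU utilization. psutil is an assumption here (any metric normalized to 0.0 to 1.0 works), and the exact callback signature may differ between livekit-agents versions, so treat this as a sketch rather than the canonical API:
import psutil  # assumption: psutil is installed

def load_function() -> float:
    # Return the worker's current load as a value between 0.0 and 1.0.
    # Once it exceeds load_threshold (0.75 above), the worker is marked
    # unavailable and stops receiving new job requests.
    return psutil.cpu_percent(interval=None) / 100.0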
5. Scaling: To handle more connections, deploy additional worker instances. LiveKit automatically balances the load across available workers.
For more details, refer to the Anatomy of an Agent and Deployment and Scaling documentation.
Sources:
- Anatomy of an Agent
- Deployment and Scaling
many-machine-74335 (02/15/2025, 2:10 PM)
gray-monkey-83525 (02/15/2025, 2:10 PM)
The load_fnc and load_threshold in the WorkerOptions configuration help manage this by indicating when a worker is too busy to accept new jobs.
When a worker reaches its capacity, it will stop accepting new jobs. This is managed by the load_threshold parameter, which defaults to 0.75 in production mode. If the worker's load exceeds this threshold, it will be marked as unavailable for new jobs, and LiveKit will route new job requests to other available workers. If no other workers are available, the job request will be queued until a worker becomes available.
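If you also want explicit control over which jobs a worker takes on, the request_fnc mentioned earlier is the hook for that. A minimal sketch, where have_capacity() is a hypothetical placeholder and the accept() keyword arguments may vary by SDK version:
from livekit.agents import JobRequest

async def request_function(req: JobRequest):
    # Explicitly accept or reject each incoming job offer.
    if have_capacity():  # hypothetical capacity check, not part of the SDK
        await req.accept(identity="agent")
    else:
        await req.reject()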
Would you like more detailed information on configuring worker load management?
Sources:
- Deployment and Scaling
many-machine-74335 (02/15/2025, 2:14 PM)
gray-monkey-83525 (02/15/2025, 2:14 PM)
many-machine-74335 (02/15/2025, 2:15 PM)
gray-monkey-83525 (02/15/2025, 2:15 PM)
1. Horizontal Pod Autoscaler (HPA): Configure an HPA for the worker Deployment so Kubernetes scales the number of replicas based on CPU utilization:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: livekit-worker-autoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: livekit-worker
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
2. Custom Metrics: If you need to scale based on custom metrics, such as the number of active jobs, you can use Prometheus and a custom metric adapter. This involves setting up Prometheus to collect metrics from your LiveKit deployment and configuring the HPA to use these metrics.
3. Deployment Configuration: Ensure your deployment is configured to allow scaling. This includes setting appropriate resource requests and limits for CPU and memory in your pod specifications.
4. Monitor and Adjust: Continuously monitor the performance and adjust the autoscaling parameters as needed to ensure optimal performance and cost efficiency.
For more detailed instructions, refer to the Autoscaling with Helm documentation.
Sources:
- Autoscaling with Helm