LiveKit Community #ask-ai

# ask-ai

important-judge-78504

07/10/2025, 3:40 PM

i can see that workers expose a health check, how would i utilize that in an ECS setup?

tall-airline-16756

07/10/2025, 3:40 PM

Thanks for reaching out, Ed. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.

tall-airline-16756

07/10/2025, 3:41 PM

LiveKit workers expose an HTTP health check endpoint, typically available on http://0.0.0.0:8081/. To use it in an AWS ECS setup, configure your ECS task definition to use this endpoint for container health monitoring.

• Add a HealthCheck block in your ECS container definition that uses a curl command to test the health check endpoint:

HealthCheck:
  Command:
    - CMD-SHELL
    - "curl -f http://localhost:8081/ || exit 1"
  Interval: 30
  Timeout: 5
  Retries: 3
  StartPeriod: 60

• Port 8081 does not need to be listed in PortMappings for this check, because the health check command runs inside the container. Map the port only if something outside the task, such as a load balancer, needs to reach it.
• The endpoint returns "OK" if the worker is healthy. If the check keeps failing, ECS marks the container unhealthy and, when the task belongs to a service, stops the task and starts a replacement.

This makes your ECS deployment more robust by letting ECS track container health and replace unhealthy workers automatically. You can adjust health check intervals and retries as needed for your use case. Let me know if you need a sample CloudFormation snippet or any further implementation detail.

Sources: livekit-agents/livekit/agents/worker.py | Deploying to production | LiveKit Docs
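The curl command above only works if curl is installed in the worker image. As a hedged alternative, here is a minimal Python probe that could serve the same purpose using only the standard library. The function names `worker_is_healthy` and `exit_code` are hypothetical, and the default URL simply reuses the port mentioned above.

```python
import urllib.error
import urllib.request

# Opener that ignores proxy environment variables, so a localhost probe
# is never routed through an HTTP proxy.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def worker_is_healthy(url: str = "http://localhost:8081/", timeout: float = 5.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with _OPENER.open(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def exit_code(url: str = "http://localhost:8081/") -> int:
    """Map health to a process exit code: 0 = healthy, 1 = unhealthy."""
    return 0 if worker_is_healthy(url) else 1
```

Saved as a hypothetical healthcheck.py ending with `raise SystemExit(exit_code())`, this could replace the curl line in the HealthCheck command with something like `python healthcheck.py`.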

important-judge-78504

07/10/2025, 3:51 PM

is it possible to make the livekit worker reject call requests by using custom metrics? like what if i wanted to include ECS task memory utilization as a metric that the worker would use to deny or accept calls

tall-airline-16756

07/10/2025, 3:52 PM

Hey there! I totally get what you're looking for, and the good news is LiveKit has a flexible way to help your workers manage their job load. Imagine being able to tell your workers, "Hey, only take on new tasks if you're not too swamped!" That's exactly what a custom `load_fnc` lets you do.

Here's the cool part: you can create a function that checks things like your ECS task's memory usage. The function returns a number between 0 and 1, and when it hits your set threshold (like 0.8), the worker declines new jobs. You could pull in data from AWS CloudWatch, mix in memory and CPU stats, or add your own logic to calculate load. It's a good fit for cloud environments where keeping an eye on resources is key. 😊

Sources: Worker options | LiveKit Docs | Deploying to production | LiveKit Docs | Self-hosting the Egress Service | LiveKit Docs
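To make the idea concrete, here is a rough sketch of a memory-based load function. It assumes the container runs on a Linux host with cgroup v2, where the container's memory usage and limit are readable from `/sys/fs/cgroup/memory.current` and `/sys/fs/cgroup/memory.max`. The names `memory_load` and `compute_load` are hypothetical, and the `WorkerOptions` wiring in the comment is an assumption that should be checked against the LiveKit Agents version in use.

```python
from pathlib import Path

# Assumption: cgroup v2 file locations inside the container.
# cgroup v1 hosts use different file names.
CGROUP_CURRENT = Path("/sys/fs/cgroup/memory.current")
CGROUP_MAX = Path("/sys/fs/cgroup/memory.max")


def memory_load(current_path: Path = CGROUP_CURRENT, max_path: Path = CGROUP_MAX) -> float:
    """Return container memory utilization as a value between 0 and 1."""
    try:
        used = int(current_path.read_text().strip())
        limit_raw = max_path.read_text().strip()
        if limit_raw == "max":
            return 0.0  # no memory limit configured, nothing to compare against
        limit = int(limit_raw)
    except (OSError, ValueError):
        # Metric unavailable: report idle rather than block every job.
        return 0.0
    if limit <= 0:
        return 0.0
    return min(1.0, max(0.0, used / limit))


def compute_load(worker=None) -> float:
    """Hypothetical load_fnc; the optional argument covers either call style."""
    return memory_load()


# Hypothetical wiring (not executed here, requires livekit-agents):
#   WorkerOptions(entrypoint_fnc=entrypoint, load_fnc=compute_load, load_threshold=0.8)
```

Reading the cgroup files is cheap enough to run on every load poll. A CloudWatch lookup would add network latency and metric delay, so it is probably better cached than called directly from the load function.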

important-judge-78504

07/10/2025, 3:53 PM

would we want to use request_fnc or load_fnc?

tall-airline-16756

07/10/2025, 3:54 PM

Hey there! Let me help you understand the best way to manage job acceptance. You'll want to use `load_fnc`, not `request_fnc`, when you need to control job acceptance based on external resource metrics like ECS memory usage.

• `load_fnc` is called periodically and returns a score between 0 and 1 that shows how busy your worker is. If that score goes over your set `load_threshold`, the worker automatically stops taking new jobs. It's well suited to tracking memory, CPU, or any custom resource in near real-time.
• `request_fnc`, on the other hand, is a job-by-job hook. It can look at specific job details, but it's not a good fit for keeping constant tabs on your system's overall health.

So, if you're tracking ECS memory or similar metrics, set up your check in `load_fnc`. This way, your worker can proactively manage its availability and work with LiveKit's job routing system.

Sources: Deployment and scaling | LiveKit Docs | Worker options | LiveKit Docs
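For contrast, here is a small sketch of what a per-job `request_fnc` might look like. It decides on the details of one request and never looks at resource usage. The attribute and method names (`req.room.name`, `req.accept()`, `req.reject()`) follow my understanding of LiveKit's JobRequest and should be treated as assumptions; the room-prefix rule is purely hypothetical.

```python
ALLOWED_ROOM_PREFIX = "support-"  # hypothetical per-job rule


async def request_fnc(req) -> None:
    """Per-job decision: inspect this request, then accept or reject it."""
    # Assumption: `req` exposes room.name plus async accept() and reject(),
    # as LiveKit's JobRequest is understood to do.
    if req.room.name.startswith(ALLOWED_ROOM_PREFIX):
        await req.accept()
    else:
        await req.reject()
```

Because this hook runs only when a job is offered, it is better suited to rules about the job itself (room name, metadata) than to continuous health tracking, which stays in `load_fnc`.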

important-judge-78504

07/11/2025, 6:21 PM

is there a recommended timeout or retry config for these health checks? they keep failing in an ecs service on deploy but i'm thinking that might be because the workers take a little while to start up

tall-airline-16756

07/11/2025, 6:22 PM

I don't have the answer you're looking for. You could also try asking your question: • in one of the other Slack channels or • to https://deepwiki.com/livekit/livekit_composite which is trained on all LiveKit source code If you find the answer, please post it here to help others!
