I'm having an issue with LiveKit Egress recording ...
# ask-ai
s
I'm having an issue with LiveKit Egress recording where the recording starts successfully but then gets aborted, and the TwirpError is preventing my interview from completing properly. Here's my setup and the problem:

Setup:
• LiveKit Server, Agent, and Egress running in Docker containers
• Agent calls start_room_composite_egress when user grants microphone permission
• Agent speaks introduction after starting recording
• Interview completes normally, but recording fails

Error sequence:
1. Recording starts successfully (no errors from start_room_composite_egress)
2. Agent speaks introduction successfully
3. User participates in interview normally
4. At the end, when trying to stop recording: TwirpError(code=failed_precondition, message=egress with status EGRESS_ABORTED cannot be stopped, status=412)
5. Egress logs show: "Start signal not received" and "Source closed"
6. The TwirpError prevents my interview completion flow from finishing - the function should return "Interview ended successfully" but the error blocks this

Questions:
1. What does "Start signal not received" mean exactly, and what conditions cause Egress to abort with this error?
2. How should I handle the TwirpError when trying to stop an already-aborted Egress? Should I catch this specific error and continue with my completion flow, or is there a way to check Egress status before attempting to stop it?
3. Is this a timing issue where Egress expects certain conditions to be met within a specific timeframe after start_room_composite_egress is called?

The main issue is that the TwirpError is breaking my interview completion logic, even though it's just trying to clean up an already-failed recording.
t
Thanks for reaching out, Jason. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Thank you for reaching out! Here are answers to your questions about the LiveKit Egress recording issue:

1. "Start signal not received": This message appears when the egress process doesn’t get the start signal within the expected time. This usually happens if you’ve enabled the `await_start_signal` option but haven’t sent the signal yet through a webhook or API call. Double-check that this option is set on purpose and that the start signal is sent programmatically when you’re ready.

2. Handling TwirpError: A TwirpError with code `FailedPrecondition` (shown as failed_precondition in your error) is returned when you try to stop an egress that has already been aborted. To handle it:
◦ Catch the error and check for the `FailedPrecondition` code.
◦ Optionally check the egress status with `ListEgress` or `GetEgress` before stopping it.
◦ In your app’s logic, treat this error as meaning the egress has already ended, so there is nothing left to stop and your completion flow can continue. Note that the recording itself still failed.

3. Timing Requirements: There’s no strict timing to worry about after calling `start_room_composite_egress`. The process runs asynchronously, so you can continue with other parts of your application right away. Use the `egress_id` you get back to monitor or manage the egress whenever you need.

If anything is unclear or you’d like more help, just ask.

Sources: pkg/service/egress.go | Docs::Home | cmd/lk/egress.go | lib/livekit/proto/livekit_egress_twirp.rb | packages/livekit-server-sdk/src/EgressClient.ts
s
what exactly is the TwirpError?
t
In LiveKit, a TwirpError is a standardized error type for problems that occur during Remote Procedure Call (RPC) communication over the Twirp protocol. It carries:
• Error Code: The category of the error, like `INVALID_ARGUMENT`, `NOT_FOUND`, or `FAILED_PRECONDITION`.
• Message: An explanation of what went wrong.
• Optional Metadata: Extra details about the error, such as HTTP status codes.

For instance, if you try to stop an Egress that’s already been aborted, you might see a TwirpError with the code `FAILED_PRECONDITION` and a message saying "egress with status EGRESS_ABORTED cannot be stopped." This helps you pinpoint and handle specific issues in your application.

If you’d like me to walk you through this further or have any questions, just let me know.

Sources: livekit-api/livekit/api/__init__.py | packages/livekit-server-sdk/src/TwirpRPC.ts | SIP troubleshooting guide | LiveKit Docs | livekit/livekit_ingress.twirp.go | livekit-api/src/services/mod.rs
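To make the three fields concrete, here is a small sketch that models them and checks the HTTP status against the Twirp specification's mapping for the codes named above. `TwirpErrorInfo` and `TWIRP_HTTP_STATUS` are illustrative names, not part of any LiveKit SDK, and the table deliberately covers only those three codes.

```python
from dataclasses import dataclass

# Subset of the Twirp spec's error-code to HTTP-status mapping
# (only the codes mentioned in this thread).
TWIRP_HTTP_STATUS = {
    "invalid_argument": 400,
    "not_found": 404,
    "failed_precondition": 412,
}


@dataclass
class TwirpErrorInfo:
    """Illustrative container for the fields a Twirp error carries."""

    code: str
    message: str
    status: int

    def status_matches_spec(self) -> bool:
        # True when the HTTP status is the one the Twirp spec assigns to this code.
        return TWIRP_HTTP_STATUS.get(self.code) == self.status


err = TwirpErrorInfo(
    code="failed_precondition",
    message="egress with status EGRESS_ABORTED cannot be stopped",
    status=412,
)
print(err.code, err.status, err.status_matches_spec())
```

This is why the error in the question shows status=412: it is the HTTP status that Twirp pairs with failed_precondition.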
s
actually we just looked at the egress logs and saw this:

Perfect! The Egress logs tell us a very clear story:

🎯 What the Egress Logs Show:
Timeline:
1. 04:08:52 - Egress starts up, connects to Redis successfully
2. 04:10:13 - Request received for EG_RenXbPK7DWqY
3. 04:10:13 - Request validated (shows correct room, file path, etc.)
4. 04:10:13 - Egress connects to Redis for this specific recording
5. 04:10:14 - Only 1 second later! egress_aborted
6. 04:10:14 - Chrome reports END_RECORDING

🔍 Key Insights:
1. The recording aborted after only ~1 second (04:10:13 → 04:10:14)
2. "Source closed" - This is the key detail!
3. Chrome immediately ended recording - suggests the web source disconnected

🎯 The Real Problem:
The issue isn't TwirpErrors or connectivity between LiveKit components. The issue is that the web source (Chrome browser) closed the connection almost immediately after the recording started. This suggests:
• Browser/WebRTC connection issue
• User's browser lost connection to the LiveKit room
• Frontend JavaScript problem causing immediate disconnection
• Network issue on the user's side

The "Start signal not received" means Egress was waiting for audio/video streams from the browser, but the browser disconnected before sending any content.

Question for LiveKit should focus on: What causes "Source closed" immediately after recording starts, and how to debug browser/WebRTC connection issues that cause immediate disconnections.

help!
t
Thanks for reaching out! The exact phrase "Source closed" isn’t directly documented, but it usually means that the input media source (like your browser or a WebRTC connection) got disconnected or became unavailable.

To help you better, could you please clarify a couple of things?
• Are you using room composite egress, track egress, or full room egress?
• Could you share a bit more detail or a snippet of the log that shows the full context of the "Source closed" message?

With this info, we’ll be in a much better position to figure out what’s going on and guide you through the next steps.

Sources: Diagnosing Connection Errors with Connection Test Utility | Chrome Developer Extension | livekit/livekit_composite | DeepWiki | Knowledge Management System | src/api/SignalClient.ts | pkg/pipeline/watch.go
s
Thanks for the quick response! Here are the details:

1. Egress Type: We're using room composite egress with audio_only: true and file_outputs to save as MP4.

2. Full Log Context: Here's the complete Egress log showing the "Source closed" message:

2025-09-11T04:10:13.712Z INFO egress server/server_rpc.go:59 request received {"nodeID": "NE_h3ARJRrwkRpi", "clusterID": "", "egressID": "EG_RenXbPK7DWqY"}
2025-09-11T04:10:13.712Z INFO egress server/server_rpc.go:69 request validated {"nodeID": "NE_h3ARJRrwkRpi", "clusterID": "", "egressID": "EG_RenXbPK7DWqY", "requestType": "room_composite", "sourceType": "EGRESS_SOURCE_TYPE_WEB", "outputType": "file", "room": "listenful_interview_45_lpvf1p14", "request": {"RoomComposite":{"room_name":"listenful_interview_45_lpvf1p14","audio_only":true,"Output":null,"Options":null,"file_outputs":[{"file_type":1,"filepath":"/uploads/interview_listenful_interview_45_lpvf1p14_20250910_211013.mp4","Output":null}]}}}
2025-09-11T04:10:14.245Z INFO egress info/io.go:230 egress_aborted {"nodeID": "NE_h3ARJRrwkRpi", "clusterID": "", "egressID": "EG_RenXbPK7DWqY", "requestType": "room_composite", "outputType": "file", "error": "Start signal not received", "code": 412, "details": "End reason: Source closed"}
2025-09-11T04:10:14.221Z INFO egress source/web.go:320 chrome: END_RECORDING {"nodeID": "NE_h3ARJRrwkRpi", "handlerID": "EGH_FPWHnWf68HXK", "clusterID": "", "egressID": "EG_RenXbPK7DWqY"}

Key observations:
• Recording request received and validated successfully at 04:10:13.712Z
• Aborted just ~0.5 seconds later at 04:10:14.245Z with "Start signal not received" + "Source closed"
• Chrome immediately reports END_RECORDING

3. Additional Context:
• The user granted microphone permissions and the interview proceeded normally from their perspective
• Our agent successfully spoke the introduction and the user participated in the full interview
• But the recording failed almost immediately after starting
• When we try to stop the recording at the end: TwirpError(code=failed_precondition, message=egress with status EGRESS_ABORTED cannot be stopped, status=412)

Question: What would cause the web source to close immediately after a successful room composite egress request, and how can we debug this browser/WebRTC connection issue?
t
Let’s troubleshoot the "Source closed" and "END_RECORDING" issues in your room composite egress setup step by step:

1. Enable Chrome Debug Logging: Try adding `enableChromeLogging: true` to your egress configuration. This generates a `chrome.log` file that captures the browser console output, including signals like `START_RECORDING` and `END_RECORDING`, which shows what the recording page did before it closed.

2. Verify START_RECORDING and END_RECORDING Signals: Make sure your template or frontend logs `console.log('START_RECORDING')` when recording begins and `console.log('END_RECORDING')` when it ends. If you’re using the default EgressHelper SDK, it should handle this for you automatically, but it’s worth double-checking.

3. Check Chrome Startup and Navigation: Confirm that Chrome starts properly with all the necessary dependencies, like Xvfb and PulseAudio. Look in the egress logs for errors such as "chrome failed to start" or "navigation timed out".

4. Handle Room Disconnection: Ensure that the egress participant stays connected to the room and that network issues aren’t causing it to disconnect too soon. The recording should keep going even if users leave the room, so confirm that this is what actually happens.

5. Review Egress Lifecycle: The egress process relies on Chrome rendering the room and logging the right signals. If Chrome closes unexpectedly, check the `chrome.log` file for JavaScript errors or exceptions.

These steps should help narrow down why the source is closing prematurely. If you need more help, just ask.

Sources: pkg/pipeline/controller.go | Custom recording templates | LiveKit Docs | pkg/pipeline/source/sdk.go | Docs::Home
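When scanning the egress logs as suggested in steps 3 and 5, a short script can pull the abort reason out of each line. The sketch below assumes the log layout shown earlier in this thread (free text followed by a JSON object); `parse_egress_event` and `find_aborts` are hypothetical helpers, not part of LiveKit.

```python
import json
from typing import List, Optional


def parse_egress_event(line: str) -> Optional[dict]:
    """Parse one egress log line into its JSON fields plus an 'event' name.

    Returns None for lines without a parseable JSON payload.
    """
    prefix, brace, payload = line.partition("{")
    tokens = prefix.split()
    if not brace or not tokens:
        return None
    try:
        fields = json.loads(brace + payload)
    except json.JSONDecodeError:
        return None
    # The event name is the last token before the JSON payload.
    fields["event"] = tokens[-1]
    return fields


def find_aborts(lines: List[str]) -> List[dict]:
    """Return the parsed 'egress_aborted' events found in the given log lines."""
    events = (parse_egress_event(line) for line in lines)
    return [e for e in events if e and e["event"] == "egress_aborted"]


# The abort line from the log shared above.
ABORTED_LINE = (
    '2025-09-11T04:10:14.245Z INFO egress info/io.go:230 egress_aborted '
    '{"nodeID": "NE_h3ARJRrwkRpi", "clusterID": "", "egressID": "EG_RenXbPK7DWqY", '
    '"requestType": "room_composite", "outputType": "file", '
    '"error": "Start signal not received", "code": 412, '
    '"details": "End reason: Source closed"}'
)

aborts = find_aborts([ABORTED_LINE, "a line without any payload"])
for event in aborts:
    print(event["egressID"], event["error"], event["details"])
```

Running this over a full log file, for example with `find_aborts(open(path).read().splitlines())`, lists every aborted egress with its reason, which is easier to compare against `chrome.log` than reading raw lines.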