bright-daybreak-66077
07/08/2025, 2:57 PM
tall-airline-16756
07/08/2025, 2:57 PM
1. Check the disconnect reason: Use the RoomDisconnectedEvent to identify why the disconnection occurred, since reasons could include network issues or specific server errors (see the sketch just after this list).
2. Handle exceptions gracefully: Some exceptions, like ThrottlingException or ModelTimeoutException, might require restarting the session in order to clear the stale state.
3. Review session and participant management settings: Verify configuration parameters such as close_on_disconnect that affect session closure, so there is no unexpected session persistence (a second sketch below shows where this is typically set).
4. Verify Redis and session tracking: Ensure that Redis (if used) is properly managing session states to avoid blocking new session creation after a disconnect.
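Here's a minimal sketch of what item 1 could look like in Python. It assumes the rtc.Room object emits a "disconnected" event and that recent SDK versions pass a reason to the callback (older ones may not); register_disconnect_logging is just an illustrative helper name, not an existing API:

from livekit import rtc
import logging

logger = logging.getLogger("disconnect-debug")

def register_disconnect_logging(room: rtc.Room) -> None:
    # Log why the room disconnected so stale-session issues can be traced
    # back to a concrete reason (network drop, server-side close, etc.).
    @room.on("disconnected")
    def _on_disconnected(reason=None):
        # `reason` is assumed to be a DisconnectReason on recent SDKs;
        # older versions may emit the event without arguments.
        logger.warning(f"Room {room.name} disconnected, reason: {reason}")

Calling this right after you connect means every disconnect shows up in your logs with its reason attached.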
Reviewing logs associated with the disconnect event and these configurations will help systematically diagnose and resolve the issue.
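For item 3, here's where close_on_disconnect is typically made explicit, assuming the livekit-agents RoomInputOptions API; treat this as a sketch rather than a drop-in:

from livekit.agents import AgentSession, JobContext, RoomInputOptions

async def start_with_explicit_close(session: AgentSession, ctx: JobContext, agent) -> None:
    # Make the close behaviour explicit so unexpected session persistence
    # can't come from an implicit default.
    await session.start(
        agent=agent,  # your agent instance
        room=ctx.room,
        room_input_options=RoomInputOptions(
            # True (the default) closes the session when the linked participant
            # disconnects; False keeps it alive, which can look like a stuck session.
            close_on_disconnect=True,
        ),
    )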
Sources: RoomDisconnectedEvent constructor - RoomDisconnectedEvent - livekit_client ... | room(_:didDisconnectWithError:) | Documentation | livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/real... | livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py | RoomEvent | Disconnected
bright-daybreak-66077
07/08/2025, 3:30 PM
async def on_enter(self):
    """When the agent enters, get pipeline ID from job context"""
    # Get pipeline_id from job context with retry logic
    metadata = await self._get_room_metadata_with_retry()
    pipeline_id = metadata.get("pipeline_id")
    if not pipeline_id:
        logger.error(f"Pipeline ID not found in room metadata: {metadata}")
        await self.session.say(
            "Sorry, there was a configuration error. Please try again."
        )
        return
    logger.info(f"Agent entered room with pipeline ID: {pipeline_id}")


def prewarm(proc: JobProcess):
    """Prewarm function to initialize models"""
    proc.userdata["vad"] = silero.VAD.load()


async def entrypoint(ctx: JobContext):
    """Main entry point for the agent"""
    # Configure logging once at application entry point
    log_level = os.getenv("LOG_LEVEL", "INFO").upper()
    logging.basicConfig(
        level=getattr(logging, log_level, logging.INFO),
        format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
        handlers=[logging.StreamHandler()],
    )
    # Enable debug logging for LiveKit components
    logging.getLogger("livekit").setLevel(logging.DEBUG)
    logging.getLogger("livekit.agents").setLevel(logging.DEBUG)
    logging.getLogger("livekit.plugins").setLevel(logging.DEBUG)
    # Reduce noise from third-party libraries
    logging.getLogger("httpcore").setLevel(logging.WARNING)
    logging.getLogger("httpx").setLevel(logging.WARNING)
    logging.getLogger("urllib3").setLevel(logging.WARNING)
    logging.getLogger("openai._base_client").setLevel(logging.WARNING)

    ctx.log_context_fields = {
        "room": ctx.room.name,
    }

    # Validate required environment variables
    required_vars = ["OPENAI_API_KEY"]
    for var in required_vars:
        if not os.getenv(var):
            raise ValueError(f"{var} environment variable is required")

    # Get OpenAI configuration from environment variables (only for STT/TTS, not LLM)
    openai_key = os.getenv("OPENAI_API_KEY")
    stt_model = os.getenv("STT_MODEL", "gpt-4o-transcribe")
    tts_model = os.getenv("TTS_MODEL", "tts-1")
    tts_voice = os.getenv("TTS_VOICE", "shimmer")

    session = AgentSession(
        vad=ctx.proc.userdata["vad"],
        # LLM needed for session but actual processing happens in our custom llm_node
        llm=openai.LLM(model="gpt-4o-mini", api_key=openai_key),
        stt=openai.STT(model=stt_model, api_key=openai_key),
        tts=openai.TTS(model=tts_model, voice=tts_voice, api_key=openai_key),
        turn_detection=MultilingualModel(),
    )

    # Log metrics
    usage_collector = metrics.UsageCollector()

    @session.on("metrics_collected")
    def _on_metrics_collected(ev: MetricsCollectedEvent):
        metrics.log_metrics(ev.metrics)
        usage_collector.collect(ev.metrics)

    async def log_usage():
        summary = usage_collector.get_summary()
        logger.info(f"Usage: {summary}")

    ctx.add_shutdown_callback(log_usage)

    # ERROR HANDLING EVENTS - Track session close events for debugging
    @session.on("close")
    def on_close(event):
        room_name = ctx.room.name
        if event.error:
            logger.error(
                f"=== SESSION CLOSED WITH ERROR === Room: {room_name} | Error: {event.error} | Error type: {type(event.error).__name__} | Reason: {event.reason}"
            )
        else:
            logger.info(
                f"=== SESSION CLOSED NORMALLY === Room: {room_name} | Reason: {event.reason}"
            )

    @session.on("error")
    def on_error(event):
        room_name = ctx.room.name
        recoverable = getattr(event.error, "recoverable", "unknown")
        source = getattr(event, "source", "unknown")
        logger.error(
            f"=== SESSION ERROR === Room: {room_name} | Error: {event.error} | Source: {source} | Recoverable: {recoverable}"
        )

    await session.start(agent=AiriaVoiceProxyAgent(), room=ctx.room)
    await ctx.connect()


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint, prewarm_fnc=prewarm))
tall-airline-16756
07/08/2025, 3:30 PM
• Room deletion failures: If r.deleteRoom() doesn't work smoothly, that room could hang around and prevent new sessions from starting. No worries, we can fix this!
• Incomplete rollback on errors: Sometimes things can get a bit messy if an error pops up midway through setting up a session. Your room might end up stuck in the active rooms map, creating a bit of a roadblock.
• Race conditions during session creation: Even with locking in place, those tricky concurrent sessions can sometimes cause unexpected hiccups.
Here are some friendly suggestions to help you smooth things out:
1. Add some extra care to roll back room and session creation if something goes wrong (see the sketch at the end of this message).
2. Make sure your async cleanup is thorough and catches any potential errors.
3. Do a careful walkthrough of your session lifecycle to ensure everything gets cleaned up properly.
I'm confident that by working through these steps, you'll be able to track down and resolve those persistent sessions. You've got this!
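To make suggestion 1 concrete, here's a rough Python sketch of the rollback shape (your room manager may well be in Go, and active_rooms, create_session, and delete_room are placeholder names, not your actual API):

import logging

logger = logging.getLogger("session-manager")

async def start_session(room_name: str, active_rooms: dict, create_session, delete_room):
    # Register the room up front, then undo everything if setup fails so a
    # half-created session can never block the next one for this room.
    active_rooms[room_name] = "starting"
    try:
        session = await create_session(room_name)
        active_rooms[room_name] = session
        return session
    except Exception:
        logger.exception(f"Session setup failed for {room_name}, rolling back")
        # Roll back both the in-memory registration and the server-side room.
        active_rooms.pop(room_name, None)
        try:
            await delete_room(room_name)
        except Exception:
            logger.exception(f"Cleanup of room {room_name} also failed")
        raise

The key idea is that every failure path removes the room from the active map and attempts the server-side delete, so nothing is left behind to block the next session.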
Sources: pkg/service/roommanager.go