bulky-actor-93512
09/11/2025, 12:55 AM
tall-airline-16756
09/11/2025, 12:55 AM
tall-airline-16756
09/11/2025, 12:56 AM
bulky-actor-93512
09/11/2025, 12:56 AM
tall-airline-16756
09/11/2025, 12:57 AM
1. Import the Metrics Module: First, import the metrics module along with the specific metric types from livekit.agents.
from livekit.agents import metrics
from livekit.agents.metrics import LLMMetrics, STTMetrics, TTSMetrics, EOUMetrics
2. Subscribe to Metrics Events: Next, use the @agent.on("metrics_collected") decorator to listen for metrics during your agent session.
@agent.on("metrics_collected")
def on_metrics_collected(mtrcs: metrics.AgentMetrics):
    metrics.log_metrics(mtrcs)
3. Access Specific Latency Metrics: Here, you can pull out important latency details like Time to First Token (TTFT), Time to First Byte (TTFB), and End-of-Utterance Delay.
total_latency = eou.end_of_utterance_delay + llm.ttft + tts.ttfb
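To make the formula concrete, here is a self-contained sketch that uses stand-in dataclasses in place of the real EOUMetrics, LLMMetrics, and TTSMetrics objects (only the fields used in the sum are modeled):

```python
from dataclasses import dataclass

# Stand-ins for the real metric objects from livekit.agents.metrics
@dataclass
class EOU:
    end_of_utterance_delay: float  # silence detected -> end-of-turn decision

@dataclass
class LLM:
    ttft: float  # time to first token

@dataclass
class TTS:
    ttfb: float  # time to first audio byte

def total_latency(eou: EOU, llm: LLM, tts: TTS) -> float:
    # Perceived response delay is the sum of the three stages
    return eou.end_of_utterance_delay + llm.ttft + tts.ttfb

print(total_latency(EOU(0.5), LLM(0.25), TTS(0.125)))  # 0.875
```

The same sum applies regardless of which providers you plug in, since these three fields are part of the common metric types.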
4. Aggregate Metrics: To get a fuller picture, use the UsageCollector class to gather metrics across multiple turns.
usage_collector = metrics.UsageCollector()

@agent.on("metrics_collected")
def on_metrics_collected(mtrcs: metrics.AgentMetrics):
    usage_collector.collect(mtrcs)
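If it helps to see what that aggregation amounts to, here is a dependency-free sketch of the same pattern. The field names here are illustrative stand-ins; the real metric events carry their own usage fields, and you'd read the result from the real UsageCollector at session end:

```python
from types import SimpleNamespace

class MiniUsageCollector:
    """Toy stand-in for metrics.UsageCollector: sums usage across events."""
    def __init__(self):
        self.totals = {"prompt_tokens": 0, "completion_tokens": 0, "tts_characters": 0}

    def collect(self, m):
        # Add whatever usage fields the event carries; missing fields count as 0
        for key in self.totals:
            self.totals[key] += getattr(m, key, 0)

collector = MiniUsageCollector()
collector.collect(SimpleNamespace(prompt_tokens=120, completion_tokens=40))  # an LLM event
collector.collect(SimpleNamespace(tts_characters=220))                       # a TTS event
print(collector.totals)  # {'prompt_tokens': 120, 'completion_tokens': 40, 'tts_characters': 220}
```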
5. Send Metrics to External Tools: Finally, if you want to forward these metrics to a monitoring tool or dashboard, you can do so using requests.
import asyncio
import requests

async def send_to_dashboard(metric_type: str, data: dict):
    # Run the blocking HTTP call in a thread so it doesn't stall the event loop
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(
        None,
        lambda: requests.post(f"{METRICS_SERVER_URL}/metrics/{metric_type}", json=data),
    )
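If you'd rather not dispatch an HTTP call from inside each event handler, one common pattern is to queue payloads and flush them from a background task. A minimal sketch, where the fake_send coroutine stands in for the actual HTTP POST:

```python
import asyncio

class MetricsForwarder:
    """Queue metric payloads and flush them from a background task."""
    def __init__(self, send):
        self._send = send  # async callable; in production it could wrap requests.post
        self._queue = asyncio.Queue()

    def enqueue(self, metric_type: str, data: dict):
        # Safe to call from synchronous event handlers
        self._queue.put_nowait((metric_type, data))

    async def run(self):
        while True:
            metric_type, data = await self._queue.get()
            await self._send(metric_type, data)
            self._queue.task_done()

async def main():
    sent = []

    async def fake_send(metric_type, data):
        sent.append((metric_type, data))

    fwd = MetricsForwarder(fake_send)
    fwd.enqueue("llm", {"ttft": 0.31})
    worker = asyncio.create_task(fwd.run())
    await fwd._queue.join()  # wait until the queued payload is flushed
    worker.cancel()
    return sent

print(asyncio.run(main()))  # [('llm', {'ttft': 0.31})]
```

This keeps the event handlers fast and lets you batch or retry sends in one place.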
If you’d like to see more detailed examples or how everything fits together, I highly recommend checking out the LiveKit Python Agents Examples repository. It’s a great resource to help you along the way!
bulky-actor-93512
09/11/2025, 3:11 AM
tall-airline-16756
09/11/2025, 3:12 AM
You can use the speech_id field to connect metrics from STT, LLM, TTS, and EOU components. Here’s how to do this:
1. Capture Metrics by Turn: Try using event listeners to gather metrics for each component and group them by speech_id.
class LatencyTrackingAgent:
    def __init__(self):
        self.turn_metrics = {}

    def _on_llm_metrics(self, m):
        sid = m.speech_id
        if sid not in self.turn_metrics:
            self.turn_metrics[sid] = {}
        self.turn_metrics[sid]["llm_ttft"] = m.ttft
        self.turn_metrics[sid]["llm_duration"] = m.duration

    def _on_tts_metrics(self, m):
        sid = m.speech_id
        if sid not in self.turn_metrics:
            self.turn_metrics[sid] = {}
        self.turn_metrics[sid]["tts_ttfb"] = m.ttfb
        self.turn_metrics[sid]["tts_duration"] = m.duration

    def _on_eou_metrics(self, m):
        sid = m.speech_id
        if sid not in self.turn_metrics:
            self.turn_metrics[sid] = {}
        self.turn_metrics[sid]["eou_delay"] = m.end_of_utterance_delay
        self.turn_metrics[sid]["transcription_delay"] = m.transcription_delay
2. Compute End-to-End Latency: Then, you can aggregate the metrics for each turn to find the total latency.
total_latency = eou_delay + transcription_delay + llm_ttft + tts_ttfb
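For example, with the keys stored by the listeners in step 1, that sum can be wrapped in a small helper (a sketch; the key names match the code above, and missing stages count as zero):

```python
def turn_latency(turn: dict) -> float:
    """Sum the per-stage delays recorded for one speech_id."""
    return (turn.get("eou_delay", 0.0)
            + turn.get("transcription_delay", 0.0)
            + turn.get("llm_ttft", 0.0)
            + turn.get("tts_ttfb", 0.0))

print(turn_latency({"eou_delay": 0.5, "transcription_delay": 0.125,
                    "llm_ttft": 0.25, "tts_ttfb": 0.125}))  # 1.0
```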
3. Optional Tools: If you're interested, OpenTelemetry can be a great help for distributed tracing, or you might consider forwarding metrics to external systems for visualization.
This method helps keep your latency metrics well-organized and easy to access for each agent turn. If you’d like, I’m here to provide more examples or help you through any part of this!
Sources: tests/test_agent_session.py | metrics/metrics_vad.py | Logs, metrics, and telemetry | LiveKit Docs
tall-airline-16756
09/11/2025, 3:12 AM
You can group metrics by the speech_id field. This way, you can easily correlate metrics from STT, LLM, TTS, and EOU components. Here’s a simple way to do this:
1. Capture Metrics by Turn: You can set up event listeners to gather metrics from each component and group them by speech_id.
class LatencyTrackingAgent:
    def __init__(self):
        self.turn_metrics = {}

    def _on_llm_metrics(self, m):
        sid = m.speech_id
        if sid not in self.turn_metrics:
            self.turn_metrics[sid] = {}
        self.turn_metrics[sid]["llm_ttft"] = m.ttft
        self.turn_metrics[sid]["llm_duration"] = m.duration

    def _on_tts_metrics(self, m):
        sid = m.speech_id
        if sid not in self.turn_metrics:
            self.turn_metrics[sid] = {}
        self.turn_metrics[sid]["tts_ttfb"] = m.ttfb
        self.turn_metrics[sid]["tts_duration"] = m.duration

    def _on_eou_metrics(self, m):
        sid = m.speech_id
        if sid not in self.turn_metrics:
            self.turn_metrics[sid] = {}
        self.turn_metrics[sid]["eou_delay"] = m.end_of_utterance_delay
        self.turn_metrics[sid]["transcription_delay"] = m.transcription_delay
2. Compute End-to-End Latency: Once you have the metrics, you can add them up for each turn to get the total latency.
total_latency = eou_delay + transcription_delay + llm_ttft + tts_ttfb
3. Log or Export Metrics: Consider using tools like UsageCollector to aggregate data across sessions, or send these metrics to external systems for easier visualization.
If you want to dive deeper, enabling OpenTelemetry can help you trace the lifecycle of each turn in detail. Feel free to reach out if you'd like me to walk you through that or provide more examples!
bulky-actor-93512
09/11/2025, 3:13 AM
tall-airline-16756
09/11/2025, 3:13 AM
class LatencyTrackingAgent:
    def __init__(self):
        self.turn_metrics = {}

    def _on_llm_metrics(self, m):
        sid = m.speech_id
        if sid not in self.turn_metrics:
            self.turn_metrics[sid] = {}
        self.turn_metrics[sid]["llm_ttft"] = m.ttft
        self.turn_metrics[sid]["llm_duration"] = m.duration
        print(f"LLM metrics for turn {sid}: TTFT={m.ttft:.4f}s")

    def _on_tts_metrics(self, m):
        sid = m.speech_id
        if sid not in self.turn_metrics:
            self.turn_metrics[sid] = {}
        self.turn_metrics[sid]["tts_ttfb"] = m.ttfb
        self.turn_metrics[sid]["tts_duration"] = m.duration
        print(f"TTS metrics for turn {sid}: TTFB={m.ttfb:.4f}s")

    def _on_eou_metrics(self, m):
        sid = m.speech_id
        if sid not in self.turn_metrics:
            self.turn_metrics[sid] = {}
        self.turn_metrics[sid]["eou_delay"] = m.end_of_utterance_delay
        self.turn_metrics[sid]["transcription_delay"] = m.transcription_delay
        print(f"EOU metrics for turn {sid}: EOU delay={m.end_of_utterance_delay:.4f}s")

    async def entrypoint(self, ctx):
        session = AgentSession()
        # Set up metric listeners (the handlers are synchronous, so register them directly)
        session.llm.on("metrics_collected", self._on_llm_metrics)
        session.tts.on("metrics_collected", self._on_tts_metrics)
        session.stt.on("eou_metrics_collected", self._on_eou_metrics)
        await session.start(agent=self, room=ctx.room)
This code organizes metrics by speech_id, ensuring that all latency data for a single agent turn is grouped together. If you have any questions or want me to walk you through any part of this, just let me know. I’m here to help!
bulky-actor-93512
09/11/2025, 3:36 AM
tall-airline-16756
09/11/2025, 3:36 AM
UsageCollector in LiveKit is designed to gather metrics for the entire session rather than per agent turn. It gives you a cumulative summary at the end, covering details like LLM tokens, TTS characters, and STT audio duration.
If you're looking for aggregated data per agent turn, you'll need some custom logic to handle and store those metrics turn by turn. I'd be more than happy to guide you through setting this up, just let me know!
Sources: Logs, metrics, and telemetry | LiveKit Docs | examples/voice_agents/basic_agent.py | examples/src/cartersia_tts.ts | Capturing metrics | LiveKit Docs
bulky-actor-93512
09/11/2025, 3:37 AM
tall-airline-16756
09/11/2025, 3:38 AM
You can aggregate metrics per turn by grouping them under speech_id (which uniquely identifies each turn). Here’s a step-by-step on how you can do this:
1. Initialize a Data Structure: Start by using a dictionary to keep track of metrics for each turn, using speech_id as the key.
turn_metrics = {}
2. Capture Metrics: Set up listeners to catch metrics events from STT, LLM, TTS, and EOU components. For every event, store the relevant information in your dictionary.
def on_llm_metrics(m):
    sid = m.speech_id
    if sid not in turn_metrics:
        turn_metrics[sid] = {}
    turn_metrics[sid]['llm_ttft'] = m.ttft
    turn_metrics[sid]['llm_duration'] = m.duration
3. Compute Aggregated Latency: Once you have all the metrics for a turn, you can calculate the total latency like this:
total_latency = (turn_metrics[sid]['eou_delay'] +
                 turn_metrics[sid]['transcription_delay'] +
                 turn_metrics[sid]['llm_ttft'] +
                 turn_metrics[sid]['tts_ttfb'])
4. Log or Export Data: After calculating the latency, you can log it or send it off to an external system to visualize the results.
This method will give you a clear, detailed view of latency for each agent turn. If you'd like, I’d be happy to provide a full code example or help you further with this, just let me know!
bulky-actor-93512
09/11/2025, 3:41 AM
tall-airline-16756
09/11/2025, 3:41 AM
class TurnMetricsAggregator:
    def __init__(self):
        self.turn_metrics = {}

    def collect_metrics(self, metric_type, metric_data):
        sid = metric_data.speech_id
        if sid not in self.turn_metrics:
            self.turn_metrics[sid] = {}
        # Store metrics based on their type
        if metric_type == 'llm':
            self.turn_metrics[sid]['llm_ttft'] = metric_data.ttft
            self.turn_metrics[sid]['llm_duration'] = metric_data.duration
        elif metric_type == 'tts':
            self.turn_metrics[sid]['tts_ttfb'] = metric_data.ttfb
            self.turn_metrics[sid]['tts_duration'] = metric_data.duration
        elif metric_type == 'eou':
            self.turn_metrics[sid]['eou_delay'] = metric_data.end_of_utterance_delay
            self.turn_metrics[sid]['transcription_delay'] = metric_data.transcription_delay

    def calculate_total_latency(self, sid):
        metrics = self.turn_metrics.get(sid, {})
        return (
            metrics.get('eou_delay', 0) +
            metrics.get('transcription_delay', 0) +
            metrics.get('llm_ttft', 0) +
            metrics.get('tts_ttfb', 0)
        )

# Example usage
aggregator = TurnMetricsAggregator()

# Simulate collecting metrics
aggregator.collect_metrics('llm', llm_metric_data)
aggregator.collect_metrics('tts', tts_metric_data)
aggregator.collect_metrics('eou', eou_metric_data)

# Calculate total latency for a specific turn
sid = 'example_speech_id'
total_latency = aggregator.calculate_total_latency(sid)
print(f"Total latency for turn {sid}: {total_latency:.4f}s")
This way, you avoid the hassle of multiple listeners and keep all your logic nicely organized in one place. If you have any questions or want to explore this further, feel free to ask. I’m here to help!
bulky-actor-93512
09/11/2025, 3:42 AM
tall-airline-16756
09/11/2025, 3:42 AM
bulky-actor-93512
09/11/2025, 3:44 AM
tall-airline-16756
09/11/2025, 3:45 AM
bulky-actor-93512
09/11/2025, 3:45 AM
tall-airline-16756
09/11/2025, 3:45 AM
class TranscriptionAggregator:
    def __init__(self):
        self.transcriptions = {}

    def collect_stt_metrics(self, stt_data):
        # Group metrics by transcription ID (or another unique identifier)
        transcription_id = stt_data.transcription_id
        if transcription_id not in self.transcriptions:
            self.transcriptions[transcription_id] = {
                'audio_duration': 0,
                'processing_time': 0,
                'word_count': 0
            }
        # Aggregate metrics for the transcription
        self.transcriptions[transcription_id]['audio_duration'] += stt_data.audio_duration
        self.transcriptions[transcription_id]['processing_time'] += stt_data.duration
        self.transcriptions[transcription_id]['word_count'] += len(stt_data.words)

    def get_transcription_summary(self, transcription_id):
        # Return aggregated metrics for a specific transcription
        return self.transcriptions.get(transcription_id, {})

# Example usage
aggregator = TranscriptionAggregator()

# Simulate collecting STT metrics
aggregator.collect_stt_metrics(stt_data_1)
aggregator.collect_stt_metrics(stt_data_2)

# Get summary for a specific transcription
summary = aggregator.get_transcription_summary('example_transcription_id')
print(summary)
This example assumes each STT metric includes a unique transcription_id to help group related data together. Feel free to tweak the fields and logic to better fit your specific needs. If you have any questions or want to explore more enhancements, I’m here to help!
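One useful statistic you can derive from such a summary is the real-time factor, i.e. processing time divided by audio duration. A sketch over the dict returned by get_transcription_summary (field names match the aggregator above):

```python
def realtime_factor(summary: dict) -> float:
    """Below 1.0 means the STT engine keeps up with the incoming audio."""
    if not summary or not summary.get("audio_duration"):
        return 0.0
    return summary["processing_time"] / summary["audio_duration"]

print(realtime_factor({"audio_duration": 10.0, "processing_time": 2.5, "word_count": 30}))  # 0.25
```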