# ask-ai
how to evaluate the response time of ASR, LLM, and TTS?
Thanks for reaching out, Mohammed. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
To evaluate the response time (latency) of ASR (Automatic Speech Recognition), LLM (Large Language Model), and TTS (Text-to-Speech) in a LiveKit pipeline, use LiveKit's built-in metrics collection:

- **ASR (STT, speech-to-text):** review `STTMetrics` fields such as `audio_duration` (amount of audio processed), `duration` (processing time for non-streaming ASR), and `streamed` (streaming-mode flag).
- **LLM:** review `LLMMetrics` fields such as `ttft` (time to first token), `duration` (overall response generation time), and `tokens_per_second`.
- **TTS:** review `TTSMetrics` fields such as `ttfb` (time to first byte of audio output), `duration`, and `audio_duration`.

To analyze these metrics in practice, subscribe to the `metrics_collected` event and log or aggregate the measurements over time. For overall conversation latency, use the formula:

```
total_latency = eou.end_of_utterance_delay + llm.ttft + tts.ttfb
```

You can also aggregate and summarize metrics across a session via `UsageCollector` for deeper analysis of latency trends.

Typical real-world latencies are roughly 800 ms for STT, 200 ms for LLM TTFT, and 350 ms for TTS, depending on the provider and model. Focus primarily on TTFT and TTFB for actionable optimization, and keep your services network-close to minimize response times. These steps help you systematically measure and improve latency at every stage of your voice pipeline.

Sources: Capturing metrics | LiveKit Docs | tests/test_agent_session.py | README.md | AI Agents Framework | livekit/livekit_composite | DeepWiki
could you please provide me the code for it
Hey there! Let me walk you through a handy code snippet for tracking performance metrics in your LiveKit Agents project. It logs each STT, LLM, and TTS measurement as it arrives and aggregates everything into a session-level summary. 😊
```python
from livekit.agents import metrics, MetricsCollectedEvent

# Aggregate metrics across a session
usage_collector = metrics.UsageCollector()

@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    usage_collector.collect(ev.metrics)  # Collect metrics for later summary
    metrics.log_metrics(ev.metrics)      # Log individual metrics as they arrive

async def log_usage():
    summary = usage_collector.get_summary()
    print(f"Usage summary: {summary}")

# At shutdown (or after the session ends), call log_usage to print full session metrics
ctx.add_shutdown_callback(log_usage)
```
Here's the cool part: this code collects every metric in real time during your session, and when everything wraps up you get a single summary showing exactly how your pipeline performed. Think of it like a fitness tracker for your voice agent. 🚀
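If you'd rather pull individual numbers out of that summary instead of printing the whole object, here's a minimal sketch. It assumes the `UsageCollector.get_summary()` result exposes aggregate fields like `llm_prompt_tokens`, `llm_completion_tokens`, `tts_characters_count`, and `stt_audio_duration`; double-check the field names against your installed `livekit-agents` version:

```python
async def log_usage():
    summary = usage_collector.get_summary()
    # Assumed UsageSummary fields; verify against your livekit-agents version
    print(f"LLM prompt tokens:     {summary.llm_prompt_tokens}")
    print(f"LLM completion tokens: {summary.llm_completion_tokens}")
    print(f"TTS characters:        {summary.tts_characters_count}")
    print(f"STT audio seconds:     {summary.stt_audio_duration:.1f}")
```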