# ask-ai
How can I best measure the latency of my LLM, STT, and TTS for my LiveKit agent in an easy way?
Thanks for reaching out, Anthony. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Measuring LLM, STT, and TTS latency in your LiveKit agent is easiest with the built-in metrics system, which emits structured events at each stage of the pipeline. Here's how to measure each component (a sketch of how to capture these events follows the list):

- LLM latency (time to first token): look at the `ttft` metric on the `LLMMetrics` object. It measures the time from when the LLM request is sent until the first token is generated.
- STT latency: for non-streaming STT, the `duration` metric on the `STTMetrics` object shows how long it took to produce the transcript; for streaming STT this value is always `0`. The `audio_duration` metric gives you the length of the audio input that was processed.
- TTS latency (time to first byte): the `ttfb` metric on the `TTSMetrics` object measures how long the TTS model takes to generate the first byte of audio output.
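To capture these values, the docs cited below describe a `metrics_collected` event on the agent session. Here's a minimal sketch of a handler that logs each of them; the class and field names (`LLMMetrics.ttft`, `STTMetrics.duration`, `TTSMetrics.ttfb`, `metrics.log_metrics`) follow the Capturing metrics docs, so double-check them against your installed `livekit-agents` version:

```python
import logging

from livekit.agents import AgentSession, MetricsCollectedEvent, metrics
from livekit.agents.metrics import LLMMetrics, STTMetrics, TTSMetrics

logger = logging.getLogger("latency")


def setup_latency_logging(session: AgentSession) -> None:
    # Call this once after creating the AgentSession, before session.start().
    @session.on("metrics_collected")
    def _on_metrics(ev: MetricsCollectedEvent) -> None:
        # Built-in helper that pretty-prints every metrics event.
        metrics.log_metrics(ev.metrics)

        m = ev.metrics
        if isinstance(m, LLMMetrics):
            logger.info("LLM TTFT: %.3fs", m.ttft)
        elif isinstance(m, STTMetrics):
            # duration is 0 for streaming STT; audio_duration is the input length.
            logger.info("STT duration: %.3fs (audio: %.2fs)", m.duration, m.audio_duration)
        elif isinstance(m, TTSMetrics):
            logger.info("TTS TTFB: %.3fs", m.ttfb)
```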
To get a fuller picture, you can combine the relevant metrics for each turn (end-of-utterance delay, LLM TTFT, and TTS TTFB) to calculate the total conversation latency; a sketch of that calculation is included at the end of this reply. If you'd like to dive deeper or see hands-on examples, the LiveKit Metrics API documentation and the Python agents examples repository are good places to start.

Sources:
- metrics/metrics_llm.py
- README.md
- Agent Testing & Evaluation | livekit/livekit_composite | DeepWiki
- Capturing metrics | LiveKit Docs
- Logs, metrics, and telemetry | LiveKit Docs
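And here's a rough sketch of that per-turn total. It assumes, per the metrics docs above, that the `EOUMetrics`, `LLMMetrics`, and `TTSMetrics` events for the same turn share a `speech_id` you can use to correlate them; again, verify the exact field names against your SDK version:

```python
from collections import defaultdict

from livekit.agents import AgentSession, MetricsCollectedEvent
from livekit.agents.metrics import EOUMetrics, LLMMetrics, TTSMetrics

# Per-turn accumulator keyed by speech_id.
_turns: dict[str, dict[str, float]] = defaultdict(dict)


def track_turn_latency(session: AgentSession) -> None:
    @session.on("metrics_collected")
    def _on_metrics(ev: MetricsCollectedEvent) -> None:
        m = ev.metrics
        sid = None
        if isinstance(m, EOUMetrics):
            _turns[m.speech_id]["eou"] = m.end_of_utterance_delay
            sid = m.speech_id
        elif isinstance(m, LLMMetrics):
            _turns[m.speech_id]["ttft"] = m.ttft
            sid = m.speech_id
        elif isinstance(m, TTSMetrics):
            _turns[m.speech_id]["ttfb"] = m.ttfb
            sid = m.speech_id

        # Once all three pieces have arrived for this turn, report the total:
        # end-of-utterance delay + LLM TTFT + TTS TTFB.
        if sid is not None and {"eou", "ttft", "ttfb"} <= _turns[sid].keys():
            t = _turns.pop(sid)
            print(f"turn {sid}: total latency ~ {t['eou'] + t['ttft'] + t['ttfb']:.3f}s")
```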