# ask-ai
s
how do i add metrics in this script:

from livekit import agents
from livekit.agents import AgentSession, Agent, RoomInputOptions, RoomOutputOptions, metrics, MetricsCollectedEvent
from livekit.plugins import (
    openai,
    cartesia,
    deepgram,
    noise_cancellation,
    silero,
)
from livekit.plugins.turn_detector.multilingual import MultilingualModel
from livekit.agents import UserInputTranscribedEvent
from dotenv import load_dotenv

load_dotenv(dotenv_path=".env.local")

usage_collector = metrics.UsageCollector()

class Translator(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="Translate English speech to Hindi only, say nothing else"
        )

async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=cartesia.TTS(
            model="sonic-2",
            voice="28ca2041-5dda-42df-8123-f58ea9c3da00",
            language="hi"
        ),
        vad=silero.VAD.load(),
        turn_detection=MultilingualModel(),
        use_tts_aligned_transcript=True,  # enables TTS-aligned transcription for realtime sync
    )

    # @session.event("conversation_item_added")
    # async def on_conversation_item_added(item):
    #     print(f"[BACKEND] Conversation item added: {item}")

    @session.on("user_input_transcribed")
    def on_user_input_transcribed(event: UserInputTranscribedEvent):
        print(f"User input transcribed: {event.transcript}, final: {event.is_final}, speaker id: {event.speaker_id}")

    await session.start(
        room=ctx.room,
        agent=Translator(),
        room_input_options=RoomInputOptions(
            noise_cancellation=noise_cancellation.BVC(),
        ),
        room_output_options=RoomOutputOptions(
            transcription_enabled=True,  # ensure STT transcriptions are forwarded
            sync_transcription=True  # word-by-word sync with speech
        ),
    )

    await ctx.connect()

    await session.generate_reply(
        instructions="Greet 'Hello I'm a translator'"
    )

if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
t
Thanks for reaching out, Nakshatra. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
To add metrics to your script, you can use the UsageCollector helper for aggregated metrics and the metrics_collected event for per-event data. Here’s a friendly guide on how to get that set up:
• Start by initializing the UsageCollector to gather your metrics.
• Then, collect metrics as your session runs by listening for the session's metrics_collected event.
• When it’s time to wrap up, log a summary at shutdown with a shutdown callback.
• If you want more detail on individual events, the metrics.log_metrics helper formats each metric type for logging, and you can log your own custom values alongside these if you need something more specific.
A minimal sketch of these steps follows below. This way, you can easily keep track of and log your metrics during the session. If you want to dive deeper or have any questions, just let me know—I’m here to help!
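For concreteness, here is a minimal sketch of those steps, based on the UsageCollector pattern from the LiveKit metrics docs. The logger name is an assumption, and the event handler and shutdown callback are meant to sit inside your entrypoint, where session and ctx already exist:
Copy code
import logging

from livekit.agents import metrics, MetricsCollectedEvent

logger = logging.getLogger("translator-agent")  # logger name is an assumption

# 1. initialize the collector (module level is fine)
usage_collector = metrics.UsageCollector()

# 2. collect metrics as the session runs (inside entrypoint, after the AgentSession is created)
@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    usage_collector.collect(ev.metrics)

# 3. log the aggregated summary at shutdown (also inside entrypoint)
async def log_usage():
    summary = usage_collector.get_summary()
    logger.info(f"Usage: {summary}")

ctx.add_shutdown_callback(log_usage)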
s
does the log_usage and @session.on go in the entrypoint?
t
Yes, both the log_usage function and the @session.on event handlers should live inside the entrypoint function. Here’s a friendly way to set it up:
• Put the log_usage function inside the entrypoint and register it as a shutdown callback with ctx.add_shutdown_callback(log_usage).
• Make sure to include the @session.on event handlers within the entrypoint so they’re properly connected to the session (see the sketch below).
This way, your metrics get collected and logged smoothly throughout the session. If anything’s unclear or you want to dive deeper, just let me know—I’m here to help!
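Here’s a minimal sketch of that layout, trimmed to the relevant parts of your script; the logger and usage_collector are assumed to be defined at module level:
Copy code
async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=cartesia.TTS(model="sonic-2", voice="28ca2041-5dda-42df-8123-f58ea9c3da00", language="hi"),
        vad=silero.VAD.load(),
        turn_detection=MultilingualModel(),
    )

    # event handlers are registered inside entrypoint, on the session object
    @session.on("metrics_collected")
    def _on_metrics_collected(ev: MetricsCollectedEvent):
        usage_collector.collect(ev.metrics)

    # log_usage also lives inside entrypoint so it can close over the collector and ctx
    async def log_usage():
        summary = usage_collector.get_summary()
        logger.info(f"Usage: {summary}")

    ctx.add_shutdown_callback(log_usage)

    await session.start(room=ctx.room, agent=Translator())
    await ctx.connect()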
s
LiveKit worker process is timing out during initialization.
t
If your LiveKit worker process is timing out during initialization, no worries! Here are a few friendly tips to help you sort it out:
• Increasing the initialization timeout: You can give your process a bit more time to get going by adjusting the initialize_process_timeout value in your WorkerOptions (see the sketch after this list).
• Implementing error handling: It’s a good idea to catch timeout exceptions in your own setup code so your app can handle them smoothly (also sketched below).
• Monitoring system resources: Make sure your system has enough CPU and memory to handle the initialization without a hitch.
• Checking logs: Take a peek at the timeout and initialization failure logs—they can give you clues about what’s going on behind the scenes.
If you want to chat more about this or need a hand with any of these steps, just let me know! I’m here to help.
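For the first two bullets, here is a rough sketch. The 30-second value is just an example, entrypoint refers to your existing entrypoint function, and the load_models coroutine with its 25-second budget is an illustrative assumption rather than a LiveKit API:
Copy code
import asyncio
import logging

from livekit import agents

# 1. give job processes more time to initialize (value here is just an example)
options = agents.WorkerOptions(
    entrypoint_fnc=entrypoint,  # your existing entrypoint function
    initialize_process_timeout=30.0,
)

# 2. catch timeouts in your own slow setup steps instead of letting them hang
async def load_models():
    ...  # hypothetical heavy initialization (model downloads, warmup, etc.)

async def safe_init():
    try:
        await asyncio.wait_for(load_models(), timeout=25.0)
    except asyncio.TimeoutError:
        logging.warning("initialization step timed out; continuing without preloaded models")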
s
how do i adjust the procresstimeout in workeroptions?
t
To adjust the process timeout in WorkerOptions, you just need to set the initialize_process_timeout parameter when you create the WorkerOptions object. Here’s a quick example:
Copy code
from livekit.agents import WorkerOptions

# Adjust the process timeout
worker_options = WorkerOptions(
    initialize_process_timeout=30.0  # Set timeout to 30 seconds
)
This will bump the timeout for process initialization up to 30 seconds. Feel free to change the value to whatever works best for you. If you want me to walk you through it or have any questions, just let me know!
s
it goes in the entrypoint as well?
t
Actually, initialize_process_timeout is a setting on WorkerOptions, so it doesn’t go inside the entrypoint function. You set it where you construct WorkerOptions and hand it to agents.cli.run_app (typically in the if __name__ == "__main__": block), and the worker applies it when it starts job processes. Here’s a quick example for you:
Copy code
async def entrypoint(ctx: agents.JobContext):
    ...  # your session setup stays exactly as it is

if __name__ == "__main__":
    agents.cli.run_app(
        agents.WorkerOptions(
            entrypoint_fnc=entrypoint,
            initialize_process_timeout=30.0,  # set timeout to 30 seconds
        )
    )
If anything’s unclear or you want to dive deeper, just let me know—I’m here to help!
s
how do I add latency in this script
t
To add latency tracking to your script, you can totally use LiveKit's metrics events. Here’s a friendly guide on how to get that set up:
• Listening for metrics events: the session emits a metrics_collected event for each new metrics object, so register a handler and collect or log them:
Copy code
from livekit.agents import metrics, MetricsCollectedEvent

@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    metrics.log_metrics(ev.metrics)      # formatted per-event logging
    usage_collector.collect(ev.metrics)  # aggregate usage across the session
• Capturing conversation latency: you can combine end-of-utterance delay, time to first token, and time to first byte to get a full picture:
total_latency = eou.end_of_utterance_delay + llm.ttft + tts.ttfb
This way, you get a nice comprehensive view of the conversation latency; the three values come from the EOU, LLM, and TTS metrics events for a given turn.
If you want me to walk you through any part of this or need some specific examples for your script, just let me know—I’m here to help! Sources: pkg/telemetry/prometheus/packets.go | Logs, metrics, and telemetry | LiveKit Docs | Capturing metrics | LiveKit Docs
s
just need to add conversational latency in this:

async def entrypoint(ctx: agents.JobContext):
    # worker_options = WorkerOptions(initialize_process_timeout=30)
    session = AgentSession(
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=cartesia.TTS(
            model="sonic-2",
            voice="28ca2041-5dda-42df-8123-f58ea9c3da00",
            language="hi"
        ),
        vad=silero.VAD.load(),
        turn_detection=MultilingualModel(),
        use_tts_aligned_transcript=True,  # enables TTS-aligned transcription for realtime sync
    )

    # @session.event("conversation_item_added")
    # async def on_conversation_item_added(item):
    #     print(f"[BACKEND] Conversation item added: {item}")

    @session.on("user_input_transcribed")
    def on_user_input_transcribed(event: UserInputTranscribedEvent):
        print(f"User input transcribed: {event.transcript}, final: {event.is_final}, speaker id: {event.speaker_id}")

    await session.start(
        room=ctx.room,
        agent=Translator(),
        room_input_options=RoomInputOptions(
            noise_cancellation=noise_cancellation.BVC(),
        ),
        room_output_options=RoomOutputOptions(
            transcription_enabled=True,  # ensure STT transcriptions are forwarded
            sync_transcription=True  # word-by-word sync with speech
        ),
    )

    await ctx.connect()

    await session.generate_reply(
        instructions="Greet 'Hello I'm a translator'"
    )

    @session.on("metrics_collected")
    def _on_metrics_collected(ev: MetricsCollectedEvent):
        usage_collector.collect(ev.metrics)

    async def log_usage():
        summary = usage_collector.get_summary()
        logger.info(f"Usage: {summary}\n")
        logger.info("\nLatency: \n")

    ctx.add_shutdown_callback(log_usage)

def prewarm(proc: agents.JobProcess):
    proc.userdata["vad"] = silero.VAD.load()

if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint, prewarm_fnc=prewarm))
t
To add conversational latency tracking to your script, you can calculate and log the latency by combining metrics like end-of-utterance delay, time to first token, and time to first byte. Here’s a friendly walkthrough on how to fit it into your existing code:
Copy code
async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=cartesia.TTS(
            model="sonic-2",
            voice="28ca2041-5dda-42df-8123-f58ea9c3da00",
            language="hi"
        ),
        vad=silero.VAD.load(),
        turn_detection=MultilingualModel(),
        use_tts_aligned_transcript=True
    )

    @session.on("metrics_collected")
    def _on_metrics_collected(ev: MetricsCollectedEvent):
        usage_collector.collect(ev.metrics)

    async def log_usage():
        summary = usage_collector.get_summary()
        logger.info(f"Usage: {summary}")

        # Calculate conversational latency
        eou_delay = summary.get("eou.end_of_utterance_delay", 0)
        llm_ttft = summary.get("llm.ttft", 0)
        tts_ttfb = summary.get("tts.ttfb", 0)
        total_latency = eou_delay + llm_ttft + tts_ttfb

        logger.info(f"Latency: {total_latency} ms")

    ctx.add_shutdown_callback(log_usage)

    await session.start(
        room=ctx.room,
        agent=Translator(),
        room_input_options=RoomInputOptions(
            noise_cancellation=noise_cancellation.BVC(),
        ),
        room_output_options=RoomOutputOptions(
            transcription_enabled=True,
            sync_transcription=True
        ),
    )

    await ctx.connect()

    await session.generate_reply(
        instructions="Greet 'Hello I'm a translator'"
    )
This setup will neatly log the total conversational latency when the session shuts down. If you want to tweak it or have any questions, just let me know—I’m here to help!
s
which openai llm models does livekit support?
t
LiveKit supports a bunch of OpenAI LLM models through its plugin, including:
• GPT-3.5 models: like gpt-3.5-turbo, gpt-3.5-turbo-16k, and some others with specific date-stamped versions.
• GPT-4 models: including gpt-4, gpt-4-turbo, gpt-4-vision-preview, plus their date-stamped versions.
• GPT-4o models: such as gpt-4o, gpt-4o-mini, and their date-specific versions.
If you want to chat more about which model might be best for you or need any extra info, just let me know! I'm here to help. Sources: plugins/openai/src/llm.ts | livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py | LLM Integrations | livekit/livekit_composite | DeepWiki | OpenAI LLM integration guide | LiveKit Docs
s
which is the lightest and uses the least credits?
t
The lightest OpenAI LLM model supported by LiveKit that uses the least credits is gpt-4o-mini. This model is designed to be smaller and more efficient, so it’s a great, budget-friendly option if you’re looking to save on resources. If you’d like, I’m here to help you set it up in your script—just let me know! Sources: OpenAI LLM integration guide | LiveKit Docs
s
async def log_usage():
    summary = usage_collector.get_summary()
    logger.info(f"Usage: {summary}")

    # Calculate conversational latency
    eou_delay = summary.get("eou.end_of_utterance_delay", 0)
    llm_ttft = summary.get("llm.ttft", 0)
    tts_ttfb = summary.get("tts.ttfb", 0)
    total_latency = eou_delay + llm_ttft + tts_ttfb

    logger.info(f"Latency: {total_latency} ms")

ctx.add_shutdown_callback(log_usage)

this is not printing the total latency
t
If the total latency isn’t showing up, it might be because some metrics are missing or not being collected properly in the summary. Make sure that eou.end_of_utterance_delay, llm.ttft, and tts.ttfb are all being gathered and included in the summary. Here’s a quick way to check and tweak your log_usage function:
Copy code
async def log_usage():
    summary = usage_collector.get_summary()
    logger.info(f"Usage: {summary}")

    # Verify and calculate total latency
    eou_delay = summary.get("eou.end_of_utterance_delay", 0)
    llm_ttft = summary.get("llm.ttft", 0)
    tts_ttfb = summary.get("tts.ttfb", 0)
    total_latency = eou_delay + llm_ttft + tts_ttfb

    logger.info(f"Latency: {total_latency} ms")
If you don’t see these metrics in the summary, double-check that they’re being collected during the session. Feel free to reach out if you want me to help you dig a little deeper! Sources: Logs, metrics, and telemetry | LiveKit Docs
s
this is what i am getting:

Usage: UsageSummary(llm_prompt_tokens=113, llm_prompt_cached_tokens=0, llm_completion_tokens=23, tts_characters_count=66, tts_audio_duration=4.6903749999999995, stt_audio_duration=35.049999999999926)
t
The UsageSummary you shared doesn’t include the metrics eou.end_of_utterance_delay, llm.ttft, or tts.ttfb, which we need to calculate the total latency. It looks like these metrics aren’t being collected or included in the summary right now. Here’s how we can fix this: 1. Ensure Metrics Collection: Double-check that your session is set up to collect these specific metrics. For example:
Copy code
@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    usage_collector.collect(ev.metrics)
2. Check Metric Availability: Make sure the metrics eou.end_of_utterance_delay, llm.ttft, and tts.ttfb are actually being emitted during the session. If they’re missing, it might help to confirm that the relevant parts (like STT, LLM, TTS) are configured to send these metrics. 3. Debug Missing Metrics: If you still don’t see these metrics, try logging the raw metrics inside the _on_metrics_collected handler to see what’s coming through. Feel free to reach out if you want a hand checking or setting this up!
s
how do i check metric availability
t
To check if those metrics are available, you can log the raw metrics collected during your session. Here’s a simple way to do it: 1. Log Metrics in the Event Handler: Update the _on_metrics_collected handler to print out all the metrics it receives:
Copy code
@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    print(f"Raw metrics collected: {ev.metrics}")
    usage_collector.collect(ev.metrics)
2. Take a Look at the Logged Metrics: Run your script and check the output to see if the metrics eou.end_of_utterance_delay, llm.ttft, and tts.ttfb show up. 3. Double-Check Your Component Setup: Make sure the components you’re using (like STT, LLM, TTS) are set up to send these metrics. For instance, verify that your LLM and TTS models support metrics collection. If you hit any snags or want me to walk you through it, just let me know—I’m here to help!
s
eou_delay = summary.get("eou.end_of_utterance_delay", 0)
AttributeError: 'UsageSummary' object has no attribute 'get'
t
The error happens because the UsageSummary object doesn’t have a get method. No worries though—you just need to access its attributes directly. Here’s a simple way to tweak your code:
Copy code
async def log_usage():
    summary = usage_collector.get_summary()
    logger.info(f"Usage: {summary}")

    # Access attributes directly
    eou_delay = getattr(summary, "eou_end_of_utterance_delay", 0)
    llm_ttft = getattr(summary, "llm_ttft", 0)
    tts_ttfb = getattr(summary, "tts_ttfb", 0)
    total_latency = eou_delay + llm_ttft + tts_ttfb

    logger.info(f"Latency: {total_latency} ms")
This way, getattr helps you safely grab those attributes and falls back to 0 if something’s missing. If you want me to walk you through it or have any other questions, just let me know—I’m here to help!
s
2025-08-14 14:22:31,242 - INFO __mp_main__ - Usage: UsageSummary(llm_prompt_tokens=126, llm_prompt_cached_tokens=0, llm_completion_tokens=22, tts_characters_count=57, tts_audio_duration=3.204333333333333, stt_audio_duration=10.04999999999998)
2025-08-14 14:22:31,243 - INFO __mp_main__ - Latency: 0 ms
t
Hey there! It looks like the UsageSummary object you provided doesn’t include the metrics eou_end_of_utterance_delay, llm_ttft, or tts_ttfb, which are needed to calculate the total latency. That’s why you’re seeing the latency show up as 0 ms. Here’s how we can fix this together: 1. Double-check Metrics Collection: Make sure your session is set up to collect these specific metrics. For example:
Copy code
@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    print(f"Raw metrics collected: {ev.metrics}")
    usage_collector.collect(ev.metrics)
2. Take a Peek at Raw Metrics: Try logging the raw metrics in the _on_metrics_collected handler to see if these metrics are actually being emitted. 3. Check Your Components: Confirm that the components you’re using (like STT, LLM, TTS) support and emit these metrics. Sometimes, certain models or setups might not provide them. If you want, I’m here to help you dig into this or walk through the setup! Just let me know.
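One more handy option while you’re checking: the metrics module also provides a log_metrics helper (mentioned in the LiveKit docs) that pretty-prints each metric type, which can be easier to scan than the raw objects. A small sketch, meant to replace the print inside your existing handler:
Copy code
from livekit.agents import metrics, MetricsCollectedEvent

@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    metrics.log_metrics(ev.metrics)      # formatted, per-type output
    usage_collector.collect(ev.metrics)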
s
# Logs, metrics, and telemetry
Collecting logs, metrics, and telemetry data from your agent for debugging and insights.
## Overview

LiveKit Agents provides built-in support for logging, collecting, and analyzing metrics to help you monitor and optimize your agent's performance. Agent sessions emit structured metrics events that can be logged in real time or aggregated to analyze latency and usage patterns. In addition to per-event metrics, LiveKit’s OpenTelemetry integration provides trace-based observability. This enables you to capture the execution flow of an agent's lifecycle—from session start to individual [node](https://docs.livekit.io/agents/build/nodes.md) operations. You can use any OpenTelemetry-compatible provider to collect and analyze telemetry data, giving you insight into conversation latency, tool usage, and performance bottlenecks. For information on log levels, see the [worker options](https://docs.livekit.io/agents/worker/options.md#log-levels) page.

## Logging events

Agent metrics events are fired by the `AgentSession` whenever there is a new metrics object available during an active session. A `log_metrics` helper function is also provided to format logging output for each metric type.
Copy code
python
from livekit.agents import metrics, MetricsCollectedEvent

...

@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    metrics.log_metrics(ev.metrics)
## Aggregating metrics

The `metrics` module also includes a `UsageCollector` helper class for aggregating usage metrics across a session. It tracks metrics such as LLM, TTS, and STT API usage, which can help estimate session cost.
Copy code
python
from livekit.agents import metrics, MetricsCollectedEvent

...

usage_collector = metrics.UsageCollector()

@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    usage_collector.collect(ev.metrics)

async def log_usage():
    summary = usage_collector.get_summary()
    logger.info(f"Usage: {summary}")

# At shutdown, generate and log the summary from the usage collector
ctx.add_shutdown_callback(log_usage)
## Metrics reference

![Diagram where metrics are measured.](/images/agents/agents-capturing-metrics-v1.svg)

### Speech-to-text (STT)

`STTMetrics` is emitted after the STT model has processed the audio input. These metrics are only available when an STT component is used, which does not apply to Realtime APIs.

| Metric | Description |
| --- | --- |
| `audio_duration` | The duration (seconds) of the audio input received by the STT model. |
| `duration` | For non-streaming STT, the amount of time (seconds) it took to create the transcript. Always `0` for streaming STT. |
| `streamed` | `True` if the STT is in streaming mode. |

### LLM

`LLMMetrics` is emitted after each LLM inference completes. If the response includes tool calls, the event does not include the time taken to execute those calls. Each tool call response triggers a separate `LLMMetrics` event.

| Metric | Description |
| --- | --- |
| `duration` | The amount of time (seconds) it took for the LLM to generate the entire completion. |
| `completion_tokens` | The number of tokens generated by the LLM in the completion. |
| `prompt_tokens` | The number of tokens provided in the prompt sent to the LLM. |
| `prompt_cached_tokens` | The number of cached tokens in the input prompt. |
| `speech_id` | A unique identifier representing a turn in the user input. |
| `total_tokens` | Total token usage for the completion. |
| `tokens_per_second` | The rate of token generation (tokens/second) by the LLM to generate the completion. |
| `ttft` | The amount of time (seconds) that it took for the LLM to generate the first token of the completion. |

### Text-to-speech (TTS)

`TTSMetrics` is emitted after a TTS has generated speech from text input.

| Metric | Description |
| --- | --- |
| `audio_duration` | The duration (seconds) of the audio output generated by the TTS model. |
| `characters_count` | The number of characters in the text input to the TTS model. |
| `duration` | The amount of time (seconds) it took for the TTS model to generate the entire audio output. |
| `ttfb` | The amount of time (seconds) that it took for the TTS model to generate the first byte of its audio output. |
| `speech_id` | An identifier linking to a user's turn. |
| `streamed` | `True` if the TTS is in streaming mode. |

### End-of-utterance (EOU)

`EOUMetrics` is emitted when the user is determined to have finished speaking. It includes metrics related to end-of-turn detection and transcription latency. This event is only available in Realtime APIs when `turn_detection` is set to either VAD or LiveKit's turn detector plugin. When using server-side turn detection, `EOUMetrics` is not emitted, as this information is not available.

| Metric | Description |
| --- | --- |
| `end_of_utterance_delay` | Time (in seconds) from the end of speech (as detected by VAD) to the point when the user's turn is considered complete. This includes any `transcription_delay`. |
| `transcription_delay` | Time (seconds) between the end of speech and when the final transcript is available. |
| `on_user_turn_completed_delay` | Time (in seconds) taken to execute the `on_user_turn_completed` callback. |
| `speech_id` | A unique identifier indicating the user's turn. |

## Measuring conversation latency

Total conversation latency is defined as the time it takes for the agent to respond to a user's utterance. Given the metrics above, it can be computed as follows:
Copy code
python
total_latency = eou.end_of_utterance_delay + llm.ttft + tts.ttfb
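As a worked example with the eou_metrics, llm_metrics, and tts_metrics values captured later in this thread for speech_id='speech_edc1430c08e7':
Copy code
python
total_latency = 1.2070648670196533 + 0.9315940999986196 + 0.35482819999924686  # eou + ttft + ttfb
# ≈ 2.49 seconds from the end of the user's speech to the first byte of TTS audio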
## Telemetry

LiveKit's [OpenTelemetry](https://opentelemetry.io/docs/) integration automatically collects telemetry data from your agents and publishes it to any OpenTelemetry-compatible provider you choose. This enables monitoring and analysis of your agent's behavior and performance.

### Collected data

A trace represents the execution flow of a single request within an LLM application. It captures all relevant steps, including duration and metadata. Agent telemetry records traces for the following activities:

- Session start
- Agent turn
- LLM node
- Function tool
- TTS node
- End-of-turn detection
- LLM and TTS metrics

### Enabling telemetry

To enable telemetry, configure a tracer provider using `set_tracer_provider` in your entrypoint function. You can use any [OpenTelemetry-compatible provider](https://opentelemetry.io/ecosystem/vendors/). The following example uses [LangFuse](https://langfuse.com/docs/opentelemetry/get-started). Set the required public key, secret key, and host as environment variables:
Copy code
python
import base64
import os

from livekit.agents.telemetry import set_tracer_provider

def setup_langfuse(
    host: str | None = None, public_key: str | None = None, secret_key: str | None = None
):
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    public_key = public_key or os.getenv("LANGFUSE_PUBLIC_KEY")
    secret_key = secret_key or os.getenv("LANGFUSE_SECRET_KEY")
    host = host or os.getenv("LANGFUSE_HOST")

    if not public_key or not secret_key or not host:
        raise ValueError("LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_HOST must be set")

    langfuse_auth = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = f"{host.rstrip('/')}/api/public/otel"
    os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {langfuse_auth}"

    trace_provider = TracerProvider()
    trace_provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
    set_tracer_provider(trace_provider)

async def entrypoint(ctx: JobContext):
    setup_langfuse()  # set up the langfuse tracer provider

    # ...
### Trace example

The following diagram shows a trace of an agent session with user turns.

![Diagram showing a trace of an agent session with two user turns.](/images/agents/agents-telemetry-trace-example.png)

### Example

For a full example, see the following in the LiveKit Agents GitHub repository.

- **[LangFuse trace example](https://github.com/livekit/agents/blob/main/examples/voice_agents/langfuse_trace.py)**: An example of an agent using LangFuse as the tracer provider.

---

This document was rendered at 2025-08-14T08:57:23.387Z. For the latest version of this document, see [https://docs.livekit.io/agents/build/metrics.md](https://docs.livekit.io/agents/build/metrics.md). To explore all LiveKit documentation, see [llms.txt](https://docs.livekit.io/llms.txt).
t
The metrics you need for calculating total latency—eou.end_of_utterance_delay, llm.ttft, and tts.ttfb—are part of specific metric events. Here’s a friendly rundown based on the documentation: 1. Make Sure Components Are Emitting Metrics: - EOU Metrics: These come through when using turn detection (like VAD or LiveKit’s turn detector plugin). - LLM Metrics: Double-check that your LLM model supports and actually emits metrics like ttft. - TTS Metrics: Make sure your TTS model is sending metrics like ttfb. 2. Add Some Logging to See What’s Happening: Pop this into your _on_metrics_collected handler to peek at all the raw metrics coming in:
Copy code
@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    print(f"Raw metrics collected: {ev.metrics}")
    usage_collector.collect(ev.metrics)
3. Check What You’ve Got: Look at the logged output to see if eou.end_of_utterance_delay, llm.ttft, and tts.ttfb show up. If any are missing, it’s worth double-checking your component setup. If you want, I’m here to help you verify or tweak these metrics—just let me know!
s
we need to get the attributes right:

type='stt_metrics' label='livekit.plugins.deepgram.stt.STT' request_id='908d04d0-57a6-45e3-9394-bb20ffbe8b2e' timestamp=1755161520.6885087 duration=0.0 audio_duration=4.99999999999999 streamed=True
2025-08-14 14:22:00,733 - DEBUG livekit.plugins.turn_detector - eou prediction {"eou_probability": 0.0570409893989563, "input": "<|im_start|>assistant\nनमस्ते मैं एक अनुवादक हूँ<|im_end|>\n<|im_start|>user\nit diverts independence in nineteen forty seven", "duration": 0.103}
Raw metrics collected: type='eou_metrics' timestamp=1755161520.7361202 end_of_utterance_delay=1.2070648670196533 transcription_delay=1.0958454608917236 on_user_turn_completed_delay=0.0 last_speaking_time=1755161519.5280437 speech_id='speech_edc1430c08e7'
Raw metrics collected: type='vad_metrics' label='livekit.plugins.silero.vad.VAD' timestamp=1755161521.3799064 idle_time=1.3967138000007253 inference_duration_total=0.036652099994171294 inference_count=32
Raw metrics collected: type='llm_metrics' label='livekit.plugins.openai.llm.LLM' request_id='chatcmpl-C4NyjJKqsQ2hDyv93anbsEmbqsOGB' timestamp=1755161522.0080254 duration=1.2705581999998685 ttft=0.9315940999986196 cancelled=False completion_tokens=18 prompt_tokens=48 prompt_cached_tokens=0 total_tokens=66 tokens_per_second=14.167001558843872 speech_id='speech_edc1430c08e7'
Raw metrics collected: type='vad_metrics' label='livekit.plugins.silero.vad.VAD' timestamp=1755161522.3841383 idle_time=2.4005636000001687 inference_duration_total=0.029413199999908102 inference_count=32
Raw metrics collected: type='tts_metrics' label='livekit.plugins.cartesia.tts.TTS' request_id='a3ce7d2f5e19' timestamp=1755161523.2223134 ttfb=0.35482819999924686 duration=1.2131401999995433 audio_duration=2.5077499999999997 cancelled=False characters_count=49 streamed=True segment_id='15dad0a77a88' speech_id='speech_edc1430c08e7'
Raw metrics collected: type='vad_metrics' label='livekit.plugins.silero.vad.VAD' timestamp=1755161523.434709 idle_time=3.4508466000006592 inference_duration_total=0.037958000002618064 inference_count=32
Raw metrics collected: type='vad_metrics' label='livekit.plugins.silero.vad.VAD' timestamp=1755161524.4331243 idle_time=4.449569700000211 inference_duration_total=0.03864229999453528 inference_count=32
Raw metrics collected: type='vad_metrics' label='livekit.plugins.silero.vad.VAD' timestamp=1755161525.481977 idle_time=5.498287400001573 inference_duration_total=0.03552980000677053 inference_count=32
Raw metrics collected: type='stt_metrics' label='livekit.plugins.deepgram.stt.STT' request_id='908d04d0-57a6-45e3-9394-bb20ffbe8b2e' timestamp=1755161525.728863 duration=0.0 audio_duration=5.04999999999999 streamed=True
Raw metrics collected: type='vad_metrics' label='livekit.plugins.silero.vad.VAD' timestamp=1755161526.4811485 idle_time=0.6002809999990859 inference_duration_total=0.02934070000083011 inference_count=32
User input transcribed: Thank you., final: False, speaker id: None
User input transcribed: Thank you., final: True, speaker id: None
2025-08-14 14:22:06,792 - DEBUG livekit.agents - received user transcript {"user_transcript": "Thank you.", "language": "en-US"}
2025-08-14 14:22:07,091 - DEBUG livekit.plugins.turn_detector - eou prediction {"eou_probability": 0.874458909034729, "input": "<|im_start|>assistant\nनमस्ते मैं एक अनुवादक हूँ<|im_end|>\n<|im_start|>user\nit diverts independence in nineteen forty seven<|im_end|>\n<|im_start|>assistant\nयह उन्नीस सौ सत्तावन में स्वतंत्रता को मोड़ता है<|im_end|>\n<|im_start|>user\nthank you", "duration": 0.29}
Raw metrics collected: type='eou_metrics' timestamp=1755161527.0923357 end_of_utterance_delay=1.0162062644958496 transcription_delay=0.7171869277954102 on_user_turn_completed_delay=0.0 last_speaking_time=1755161526.0761294 speech_id='speech_2b6951d345f1'
Raw metrics collected: type='vad_metrics' label='livekit.plugins.silero.vad.VAD' timestamp=1755161527.5301166 idle_time=1.0479147000005469 inference_duration_total=0.04289810000045691 inference_count=32
Raw metrics collected: type='llm_metrics' label='livekit.plugins.openai.llm.LLM' request_id='chatcmpl-C4NypFvwDPiA20oZUzhpUk3YEuhul' timestamp=1755161527.8801813 duration=0.7865351000000373 ttft=0.6922188999997161 cancelled=False completion_tokens=4 prompt_tokens=78 prompt_cached_tokens=0 total_tokens=82 tokens_per_second=5.085596307144856 speech_id='speech_2b6951d345f1'
Raw metrics collected: type='tts_metrics' label='livekit.plugins.cartesia.tts.TTS' request_id='3e8a9ecbc300' timestamp=1755161528.3640666 ttfb=0.33360759999959555 duration=0.483649299998433 audio_duration=0.6965833333333334 cancelled=False characters_count=8 streamed=True segment_id='cd193b9a3465' speech_id='speech_2b6951d345f1'
Raw metrics collected: type='vad_metrics' label='livekit.plugins.silero.vad.VAD' timestamp=1755161528.5367868 idle_time=2.053890400000455 inference_duration_total=0.029906699997809483 inference_count=32
2025-08-14 14:22:09,502 - INFO livekit.agents - closing agent session due to participant disconnect (disable via RoomInputOptions.close_on_disconnect=False) {"participant": "voice_assistant_user_7822", "reason": "CLIENT_INITIATED"}
2025-08-14 14:22:09,503 - DEBUG livekit.agents - stream closed {"participant": "voice_assistant_user_7822", "source": "SOURCE_MICROPHONE"}
2025-08-14 14:22:09,510 - DEBUG livekit.agents - input stream detached {"participant": "voice_assistant_user_7822", "source": "SOURCE_UNKNOWN", "accepted_sources": ["SOURCE_MICROPHONE"]}
2025-08-14 14:22:09,518 - DEBUG livekit.agents - session closed {"reason": "participant_disconnected", "error": null}
2025-08-14 14:22:31,234 - DEBUG livekit.agents - shutting down job task {"reason": "", "user_initiated": false}
2025-08-14 14:22:31,238 - DEBUG livekit.agents - job exiting {"reason": "", "tid": 16756, "job_id": "AJ_RStaTFF9MwwV"}
2025-08-14 14:22:31,242 - INFO __mp_main__ - Usage: UsageSummary(llm_prompt_tokens=126, llm_prompt_cached_tokens=0, llm_completion_tokens=22, tts_characters_count=57, tts_audio_duration=3.204333333333333, stt_audio_duration=10.04999999999998)
t
From the raw metrics you shared, it looks like the attributes needed to calculate total latency are there — we just need to grab them the right way. Each metrics_collected event carries a single metrics object (an eou_metrics, llm_metrics, tts_metrics, stt_metrics, or vad_metrics record), so the handler has to stash the values as they arrive and combine them once all three are present for a turn. Here’s how you can tweak your code: 1. Access metrics by type and key them by speech_id: Something like this should do the trick:
Copy code
# per-turn latency pieces, keyed by speech_id
turn_latency: dict[str, dict[str, float]] = {}

@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    m = ev.metrics  # a single metrics object per event
    usage_collector.collect(m)

    if m.type == "eou_metrics":
        turn_latency.setdefault(m.speech_id, {})["eou"] = m.end_of_utterance_delay
    elif m.type == "llm_metrics":
        turn_latency.setdefault(m.speech_id, {})["ttft"] = m.ttft
    elif m.type == "tts_metrics":
        turn_latency.setdefault(m.speech_id, {})["ttfb"] = m.ttfb

    turn = turn_latency.get(getattr(m, "speech_id", None), {})
    if {"eou", "ttft", "ttfb"} <= turn.keys():
        total_latency = turn["eou"] + turn["ttft"] + turn["ttfb"]
        logger.info(f"Total conversation latency: {total_latency:.3f} seconds")
2. Double-check metric types: your log confirms the metrics arrive with the types eou_metrics, llm_metrics, and tts_metrics and carry end_of_utterance_delay, ttft, ttfb, and speech_id, so this will match. 3. Add logs to help debug: keep the raw-metrics print in place while you verify that a latency line is logged once per turn. If you want a hand with any of this or run into any bumps, just let me know — I’m here to help!
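If you also want a single number in log_usage at shutdown (your original goal), one option, sketched below and building on the handler above, is to keep each turn's total in a list and log the average when the job ends. The turn_latencies name is an arbitrary choice, and the append is shown as a comment because it belongs inside the handler:
Copy code
turn_latencies: list[float] = []

# inside _on_metrics_collected, once all three values for a turn are known:
#     turn_latencies.append(turn["eou"] + turn["ttft"] + turn["ttfb"])

async def log_usage():
    summary = usage_collector.get_summary()
    logger.info(f"Usage: {summary}")
    if turn_latencies:
        avg = sum(turn_latencies) / len(turn_latencies)
        logger.info(f"Average conversation latency: {avg:.3f} s over {len(turn_latencies)} turns")

ctx.add_shutdown_callback(log_usage)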