# ask-ai
**User:**
What does an llm ttft metric of -1 mean?
**AI:**
Thanks for reaching out, Ed. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
In the context of the RealtimeModelMetrics class, the "ttft" metric represents "Time to first audio token in seconds." A value of -1 specifically indicates that "no audio token was sent" during the response generation. Sources: GitHub::livekit::agents | livekit-plugins/livekit-plugins-openai/livekit/pl...
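For reference, here is a minimal sketch of how you might watch for this sentinel in a metrics handler. It assumes the livekit-agents 1.x Python API (a `metrics_collected` session event delivering a `MetricsCollectedEvent`); verify the names against your installed version.

```python
from livekit.agents import MetricsCollectedEvent

def on_metrics_collected(ev: MetricsCollectedEvent) -> None:
    m = ev.metrics
    ttft = getattr(m, "ttft", None)
    if ttft == -1:
        # Sentinel value: the generation never emitted a single (audio) token.
        print(f"no token emitted for this generation: {m!r}")
    elif ttft is not None:
        print(f"time to first token: {ttft:.3f}s")

# Registered on an AgentSession elsewhere in your agent code, e.g.:
# session.on("metrics_collected", on_metrics_collected)
```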
**User:**
so practically, what does that mean? give me a scenario example
**AI:**
Sure, here's a scenario. Your agent sends a request to the model, but no audio tokens ever come back - maybe the request errored out, or it was cancelled before the model produced anything. Since there is no "first token" to time, TTFT is reported as the sentinel value -1 instead of a real duration. Think of it like ordering a coffee where the barista never starts brewing: there is no "time until first sip" to measure. Sources: livekit.agents.llm.llm API documentation | GitHub::livekit::agents | livekit-agents/livekit/agents/llm/llm.py
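To make the mechanics concrete, here is a simplified, self-contained sketch of that timing logic (an illustration, not the actual livekit-agents implementation): the sentinel is set up front and only overwritten when a first token actually arrives.

```python
import asyncio
import time

async def measure_ttft(token_stream) -> float:
    """Return time-to-first-token in seconds, or -1 if nothing was emitted."""
    started_at = time.perf_counter()
    ttft = -1.0  # sentinel: no token was ever emitted
    async for _token in token_stream:
        if ttft == -1.0:
            ttft = time.perf_counter() - started_at  # first token arrived
        # ... consume the rest of the stream ...
    return ttft

async def empty_stream():
    """Simulates a generation that fails or is cancelled before any token."""
    return
    yield  # unreachable; makes this function an async generator

print(asyncio.run(measure_ttft(empty_stream())))  # prints -1.0
```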
**User:**
so something prevented the LLM generation from working, but we don't know what
**AI:**
Exactly. A TTFT of -1 tells you the generation never got as far as producing its first token, but the metric alone doesn't say why. The request may have failed, been cancelled midway, or hit an upstream error. Treat it as a "generation never started emitting" signal rather than a diagnosis, and look at the surrounding logs or error events for the actual cause.
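Since the metric itself carries no reason, the usual next step is to raise log verbosity and look at the surrounding error output. A minimal sketch using standard Python logging; the logger names match the `name` fields seen in typical agent logs (e.g. `livekit.agents`, `livekit.plugins.silero`):

```python
import logging

# Turn up verbosity on the agent and plugin loggers to surface the underlying
# error (cancellation, upstream API failure, etc.) behind a -1 TTFT.
logging.basicConfig(level=logging.INFO)
logging.getLogger("livekit.agents").setLevel(logging.DEBUG)
logging.getLogger("livekit.plugins").setLevel(logging.DEBUG)
```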
**User:**
I'm seeing these warnings:
```json
{
  "message": "inference is slower than realtime",
  "level": "WARNING",
  "name": "livekit.plugins.silero",
  "delay": 0.23960410502180454,
  "pid": 2603,
  "job_id": "AJ_N9ezZkEhdZ2Y",
  "timestamp": "2025-07-09T18:57:35.187367+00:00"
}
```
and
```json
{
  "message": "Error in _bounce_eou_task\nTraceback (most recent call last):\n File \"/usr/local/lib/python3.11/site-packages/livekit/agents/utils/log.py\", line 16, in async_fn_logs\n return await fn(*args, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/livekit/agents/voice/audio_recognition.py\", line 339, in _bounce_eou_task\n end_of_turn_probability = await turn_detector.predict_end_of_turn(chat_ctx)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/livekit/plugins/turn_detector/base.py\", line 220, in predict_end_of_turn\n assert result is not None, \"end_of_utterance prediction should always returns a result\"\n ^^^^^^^^^^^^^^^^^^\nAssertionError: end_of_utterance prediction should always returns a result",
  "level": "ERROR",
  "name": "livekit.agents",
  "pid": 21116,
  "job_id": "AJ_Qpid3jKtrQyF",
  "timestamp": "2025-07-09T19:20:12.360522+00:00"
}
```
I've been told that both of these might be happening because of not enough CPU or memory. However, my service is deployed in ECS, and I don't see any concerning CPU or memory usage on any ECS tasks. Would you happen to know what else might have resource constraints?
**AI:**
I don't have the answer you're looking for. You could also try asking your question:
- in one of the other Slack channels, or
- at https://deepwiki.com/livekit/livekit_composite, which is trained on all LiveKit source code

If you find the answer, please post it here to help others!