late-helmet-58096
09/12/2025, 6:58 AM
llm_input_audio_tokens=21, llm_completion_tokens=80
This usage is very small, but I spoke for about a minute. I think the correct usage is what GPT reports: llm_completion_tokens=1753, llm_output_audio_tokens=1346
OpenAI summary:
Duration: 90.30s, Usage: UsageSummary(llm_prompt_tokens=2060, llm_prompt_cached_tokens=1408, llm_input_audio_tokens=1455, llm_completion_tokens=1753, llm_output_audio_tokens=1346, tts_characters_count=0, tts_audio_duration=0.0, stt_audio_duration=0.0)
Gemini summary:
Duration: 82.35s, Usage: UsageSummary(llm_prompt_tokens=5354, llm_prompt_cached_tokens=0, llm_input_audio_tokens=21, llm_completion_tokens=80, llm_output_audio_tokens=0, tts_characters_count=0, tts_audio_duration=0.0, stt_audio_duration=0.0)
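To make the gap concrete, the two summaries can be compared side by side. A minimal, self-contained sketch; the token values are copied from the summaries above, and UsageSnapshot is a stand-in dataclass for illustration, not a LiveKit type:

```python
from dataclasses import dataclass

@dataclass
class UsageSnapshot:
    # Subset of the UsageSummary fields quoted above
    llm_prompt_tokens: int
    llm_input_audio_tokens: int
    llm_completion_tokens: int
    llm_output_audio_tokens: int

openai_usage = UsageSnapshot(2060, 1455, 1753, 1346)
gemini_usage = UsageSnapshot(5354, 21, 80, 0)

# For roughly 90 seconds of speech, OpenAI attributes most of the prompt
# to audio tokens, while Gemini reports almost none.
print(f"OpenAI audio share: {openai_usage.llm_input_audio_tokens / openai_usage.llm_prompt_tokens:.2%}")  # 70.63%
print(f"Gemini audio share: {gemini_usage.llm_input_audio_tokens / gemini_usage.llm_prompt_tokens:.2%}")  # 0.39%
```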
tall-airline-16756
09/12/2025, 6:59 AM
OpenAI's Realtime API reports detailed audio token metrics in its usage summary, including llm_input_audio_tokens, llm_output_audio_tokens, and llm_completion_tokens.
On the other hand, Gemini handles tokens a bit differently. While it can track audio tokens, its API might not always generate or expose detailed audio token counts in the usage metadata. That’s why you might see zero or very low values for llm_output_audio_tokens and related metrics.
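Since a zero can mean "not reported" rather than "no audio", it can help to treat such zeros as missing when you post-process summaries. A minimal sketch; reported_or_none is my own helper, not a LiveKit API:

```python
from types import SimpleNamespace
from typing import Optional

def reported_or_none(summary, field: str) -> Optional[int]:
    """Return the given token count, or None when a zero looks like
    'not reported' (zero audio tokens alongside nonzero completion tokens)."""
    tokens = getattr(summary, field, 0)
    if tokens == 0 and getattr(summary, "llm_completion_tokens", 0) > 0:
        return None
    return tokens

# Shaped like the Gemini summary above: completion tokens present,
# but audio output tokens came back as 0.
gemini_like = SimpleNamespace(llm_completion_tokens=80, llm_output_audio_tokens=0)
print(reported_or_none(gemini_like, "llm_output_audio_tokens"))  # None -> treat as unreported
```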
If you have any more questions or specific needs around token tracking, please don’t hesitate to ask. I’m here to help you every step of the way!
Sources: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/models.py | plugins/openai/src/realtime/realtime_model.ts
late-helmet-58096
09/12/2025, 7:00 AM
tall-airline-16756
09/12/2025, 7:02 AM
1. Install the Google Plugin: Run pip install "livekit-agents[google]~=1.2" to get started.
2. Configure Authentication: Make sure to set your GOOGLE_API_KEY or GOOGLE_APPLICATION_CREDENTIALS environment variable, depending on the API type you’re using.
3. Initialize the RealtimeModel: Create an instance of RealtimeModel inside an AgentSession to connect with the Gemini Live API. Here’s an example to guide you:
from livekit.plugins import google
from livekit.agents import AgentSession

session = AgentSession(
    llm=google.beta.realtime.RealtimeModel(
        model="gemini-2.0-flash-exp",
        voice="Puck",
        temperature=0.8,
        instructions="You are a helpful assistant",
    )
)
4. Listen for Token Usage Metrics: Capture token usage by listening to the metrics_collected event like this:
@session.llm.on("metrics_collected")
def on_metrics(metrics):
    print("Input tokens:", metrics.input_tokens)
    print("Output tokens:", metrics.output_tokens)
    print("Audio input tokens:", metrics.input_token_details.audio_tokens)
5. Aggregate Usage (Optional): If you want to track cumulative metrics, the UsageCollector class makes it easy:
from livekit.agents.metrics import UsageCollector

collector = UsageCollector()

@session.llm.on("metrics_collected")
def on_metrics(metrics):
    collector.collect(metrics)

summary = collector.get_summary()
print("Total audio input tokens:", summary.llm_input_audio_tokens)
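One thing you can do with the final summary is rough cost estimation. A minimal sketch; estimate_cost is my own helper, not part of livekit.agents, and the rate numbers below are invented placeholders, not real provider pricing (always check your provider's current price sheet):

```python
def estimate_cost(usage, text_in_rate, audio_in_rate, text_out_rate, audio_out_rate):
    """Estimate spend in dollars from a usage summary dict.

    Rates are dollars per one million tokens. Text token counts are taken
    as total minus audio, matching how the summaries above are structured.
    """
    text_in = usage["llm_prompt_tokens"] - usage["llm_input_audio_tokens"]
    text_out = usage["llm_completion_tokens"] - usage["llm_output_audio_tokens"]
    return (
        text_in * text_in_rate
        + usage["llm_input_audio_tokens"] * audio_in_rate
        + text_out * text_out_rate
        + usage["llm_output_audio_tokens"] * audio_out_rate
    ) / 1_000_000

# Token values from the OpenAI summary earlier in this thread;
# the four rates are made-up examples.
usage = {
    "llm_prompt_tokens": 2060,
    "llm_input_audio_tokens": 1455,
    "llm_completion_tokens": 1753,
    "llm_output_audio_tokens": 1346,
}
print(f"${estimate_cost(usage, 5.0, 100.0, 20.0, 200.0):.4f}")  # $0.4259
```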
Following these steps will help you keep a close eye on your token usage. If anything feels unclear or you’d like some extra guidance, please don’t hesitate to reach out—I’m here to support you!
Sources: Gemini Live API integration guide | LiveKit Docs | livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/real...