# ask-ai
**biberim45:**
While using realtime models I collect summary data. OpenAI realtime models return this data: `"llm_output_audio_tokens": 1500`, but Gemini always returns zero (`"llm_output_audio_tokens": 0`), with only `llm_input_audio_tokens=21` and `llm_completion_tokens=80`. That usage is far too small for roughly one minute of speech; I think the correct usage looks like GPT's: `llm_completion_tokens=1753`, `llm_output_audio_tokens=1346`.
OpenAI summary:

```
Duration: 90.30s UsageSummary(llm_prompt_tokens=2060, llm_prompt_cached_tokens=1408, llm_input_audio_tokens=1455, llm_completion_tokens=1753, llm_output_audio_tokens=1346, tts_characters_count=0, tts_audio_duration=0.0, stt_audio_duration=0.0)
```

Gemini summary:

```
Duration: 82.35s, Usage: UsageSummary(llm_prompt_tokens=5354, llm_prompt_cached_tokens=0, llm_input_audio_tokens=21, llm_completion_tokens=80, llm_output_audio_tokens=0, tts_characters_count=0, tts_audio_duration=0.0, stt_audio_duration=0.0)
```
**AI assistant:**
Thanks for reaching out, biberim45. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Thank you for reaching out! I'd love to help clarify the difference in token usage metrics between the OpenAI and Gemini models. The key lies in how audio and text tokens are processed and reported. OpenAI's API provides detailed audio token metrics, which LiveKit uses directly, so you get accurate counts for `llm_input_audio_tokens`, `llm_output_audio_tokens`, and `llm_completion_tokens`. Gemini handles tokens a bit differently: while it can track audio tokens, its API might not always generate or expose detailed audio token counts in the usage metadata, which is why you might see zero or very low values for `llm_output_audio_tokens` and related metrics. If you have any more questions or specific needs around token tracking, please don't hesitate to ask. I'm here to help you every step of the way!

Sources: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/models.py | plugins/openai/src/realtime/realtime_model.ts
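A quick way to spot this kind of under-reporting is a duration-based plausibility check. The sketch below is plain Python, not LiveKit code; the field names merely mirror LiveKit's `UsageSummary`, and the one-token-per-second floor is an arbitrary illustrative threshold:

```python
# Plain-Python sketch, not LiveKit code: flag sessions whose reported audio
# token counts look implausibly low for the session duration.
from dataclasses import dataclass

@dataclass
class UsageNumbers:  # illustrative stand-in for LiveKit's UsageSummary fields
    llm_input_audio_tokens: int
    llm_output_audio_tokens: int

def audio_tokens_look_plausible(usage: UsageNumbers, duration_s: float,
                                min_tokens_per_s: float = 1.0) -> bool:
    """Heuristic floor: a realtime voice session should report at least
    ~min_tokens_per_s audio tokens per second across input and output."""
    total = usage.llm_input_audio_tokens + usage.llm_output_audio_tokens
    return total >= duration_s * min_tokens_per_s

# Numbers from the two summaries in this thread:
openai_usage = UsageNumbers(llm_input_audio_tokens=1455, llm_output_audio_tokens=1346)
gemini_usage = UsageNumbers(llm_input_audio_tokens=21, llm_output_audio_tokens=0)

print(audio_tokens_look_plausible(openai_usage, 90.30))  # True
print(audio_tokens_look_plausible(gemini_usage, 82.35))  # False: under-reported
```

A check like this can run over each session summary and alert you when a provider's metadata omits audio tokens, so billing estimates aren't silently skewed.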
**biberim45:**
How can I track Gemini realtime token usage?
**AI assistant:**
I'm here to help you track Gemini realtime token usage in LiveKit with a few simple steps:

1. Install the Google plugin: run `pip install "livekit-agents[google]~=1.2"` to get started.
2. Configure authentication: set your `GOOGLE_API_KEY` or `GOOGLE_APPLICATION_CREDENTIALS` environment variable, depending on the API type you're using.
3. Initialize the `RealtimeModel`: create an instance of `RealtimeModel` inside an `AgentSession` to connect to the Gemini Live API. Here's an example to guide you:
```python
from livekit.plugins import google
from livekit.agents import AgentSession

session = AgentSession(
    llm=google.beta.realtime.RealtimeModel(
        model="gemini-2.0-flash-exp",
        voice="Puck",
        temperature=0.8,
        instructions="You are a helpful assistant",
    )
)
```
4. Listen for token usage metrics: capture token usage by listening to the `metrics_collected` event. In current `livekit-agents` releases this event is emitted by the session (not `session.llm`), and the handler receives a `MetricsCollectedEvent`:

```python
from livekit.agents.metrics import MetricsCollectedEvent

@session.on("metrics_collected")
def on_metrics(ev: MetricsCollectedEvent):
    m = ev.metrics  # may also be STT/TTS metrics; realtime LLM metrics carry token counts
    print("Input tokens:", m.input_tokens)
    print("Output tokens:", m.output_tokens)
    print("Audio input tokens:", m.input_token_details.audio_tokens)
```
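Because a session emits several metric kinds (STT, TTS, LLM, end-of-utterance), a handler often needs to filter before summing token counts. The following is a hypothetical, dependency-free illustration of that routing; the `type` strings and field names are stand-ins for illustration, not the actual LiveKit identifiers:

```python
# Hypothetical sketch: route metric events by a `type` tag so only
# realtime-LLM metrics are summed. Type strings and fields are stand-ins.
from dataclasses import dataclass

@dataclass
class FakeMetrics:
    type: str
    input_tokens: int = 0
    output_tokens: int = 0

totals = {"input": 0, "output": 0}

def on_metrics(m: FakeMetrics) -> None:
    if m.type != "realtime_model_metrics":  # skip STT/TTS/EOU metrics
        return
    totals["input"] += m.input_tokens
    totals["output"] += m.output_tokens

events = [
    FakeMetrics("stt_metrics"),                      # ignored
    FakeMetrics("realtime_model_metrics", 120, 80),  # counted
]
for m in events:
    on_metrics(m)

print(totals)  # {'input': 120, 'output': 80}
```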
5. Aggregate usage (optional): to track cumulative metrics, the `UsageCollector` class makes it easy. As above, register the handler on the session:

```python
from livekit.agents.metrics import MetricsCollectedEvent, UsageCollector

collector = UsageCollector()

@session.on("metrics_collected")
def on_metrics(ev: MetricsCollectedEvent):
    collector.collect(ev.metrics)

# later, e.g. at session end:
summary = collector.get_summary()
print("Total audio input tokens:", summary.llm_input_audio_tokens)
```
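To make the aggregation concrete without pulling in LiveKit, here is a pure-Python stand-in that mimics what `UsageCollector` does conceptually (accumulate per-turn counts into one running summary). Only the `UsageSummary`-style field names come from this thread; the class and method shapes are invented for illustration:

```python
# Pure-Python stand-in mimicking UsageCollector's aggregation conceptually.
from dataclasses import dataclass

@dataclass
class MiniSummary:
    llm_input_audio_tokens: int = 0
    llm_output_audio_tokens: int = 0
    llm_completion_tokens: int = 0

class MiniUsageCollector:
    def __init__(self) -> None:
        self._summary = MiniSummary()

    def collect(self, input_audio: int, output_audio: int, completion: int) -> None:
        # Each call represents one turn's metrics being folded into the total.
        self._summary.llm_input_audio_tokens += input_audio
        self._summary.llm_output_audio_tokens += output_audio
        self._summary.llm_completion_tokens += completion

    def get_summary(self) -> MiniSummary:
        return self._summary

mini_collector = MiniUsageCollector()
mini_collector.collect(700, 650, 800)  # turn 1
mini_collector.collect(755, 696, 953)  # turn 2; totals now match the OpenAI
print(mini_collector.get_summary())    #   example: 1455 / 1346 / 1753
```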
Following these steps will help you keep a close eye on your token usage. If anything feels unclear or you'd like some extra guidance, please don't hesitate to reach out; I'm here to support you!

Sources: Gemini Live API integration guide | LiveKit Docs | livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/real...