# ask-ai
w
Gemini LLM call is complaining about max prompt tokens...how to fix this?
t
Thanks for reaching out, Sunil. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
The "max prompt token" error occurs when the prompt exceeds the model's context size. To resolve this: • Reduce Prompt Length: Simplify or truncate your prompt to focus on the most important details. • Implement Truncation Strategies: For example, truncate the middle portion of the prompt while keeping the beginning and end to maintain context. • Adjust Model Parameters: If possible, use a model with a larger context window or adjust parameters like temperature for better results. • Monitor Token Usage: Use tools to track token usage in real-time and dynamically adjust prompts. Let me know if you need further assistance or specific guidance on implementing these steps.
w
what does this really mean? - UsageSummary(llm_prompt_tokens=194688, llm_prompt_cached_tokens=71644, llm_completion_tokens=1366, tts_characters_count=2681, tts_audio_duration=200.03999999999974, stt_audio_duration=268.7999999999997)
t
Here's a quick rundown of what the `UsageSummary` fields mean for your session:
• llm_prompt_tokens: the number of tokens in the input prompts sent to the LLM.
• llm_prompt_cached_tokens: the portion of those prompt tokens served from the provider's prompt cache instead of being processed fresh.
• llm_completion_tokens: the tokens the LLM generated in its responses.
• tts_characters_count: the number of characters sent to the Text-to-Speech (TTS) model.
• tts_audio_duration: the length, in seconds, of the audio produced by the TTS model.
• stt_audio_duration: the length, in seconds, of the audio processed by the Speech-to-Text (STT) model.
These metrics give you a convenient way to track how the LLM, TTS, and STT features were used during your session. If anything's unclear or you want to dive deeper, just let me know.
Sources: livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/llm.py
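For illustration only, here is a small sketch that plugs the pasted numbers into an ad hoc dataclass (not the actual LiveKit type) to separate cached from fresh prompt tokens. It assumes cached tokens are reported as a subset of llm_prompt_tokens, which is common but provider-dependent:

```python
from dataclasses import dataclass


@dataclass
class UsageSummaryView:
    """Ad hoc mirror of the fields shown in the pasted summary (illustrative only)."""
    llm_prompt_tokens: int
    llm_prompt_cached_tokens: int
    llm_completion_tokens: int
    tts_characters_count: int
    tts_audio_duration: float
    stt_audio_duration: float


usage = UsageSummaryView(
    llm_prompt_tokens=194_688,
    llm_prompt_cached_tokens=71_644,
    llm_completion_tokens=1_366,
    tts_characters_count=2_681,
    tts_audio_duration=200.04,
    stt_audio_duration=268.80,
)

# If cached tokens are a subset of prompt tokens, the portion the provider
# actually had to process fresh is the difference:
fresh_prompt_tokens = usage.llm_prompt_tokens - usage.llm_prompt_cached_tokens
print(f"fresh prompt tokens: {fresh_prompt_tokens}")          # ~123k
print(f"completion tokens:   {usage.llm_completion_tokens}")  # 1,366
```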
w
the llm prompt tokens seem unusually high...the instructions provided are around 20k English characters
t
Hey there! The high `llm_prompt_tokens` count might be caused by things like tokenization overhead, cached tokens, system prompts, or extra formatting tokens. If you could share a bit more about your input (how it's structured, any special formatting, and whether it includes special characters or templates), that would help narrow down what's going on.
Sources: livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/llm.py | Capturing metrics | LiveKit Docs | tools/tts/tts.cpp
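A hedged back-of-the-envelope check may help here. It assumes roughly 4 English characters per token and assumes the summary aggregates prompt tokens over every LLM call in the session (so each turn re-sends the growing chat history); both are approximations, not confirmed behavior:

```python
# Rough sanity check on the reported number, not an exact accounting.
chars_in_instructions = 30_000                        # size quoted in the thread
approx_tokens_per_call = chars_in_instructions / 4    # ~7,500 tokens if only the
                                                      # instructions were sent once
reported_prompt_tokens = 194_688                      # value from the UsageSummary

ratio = reported_prompt_tokens / approx_tokens_per_call
print(f"instructions alone are ~{approx_tokens_per_call:.0f} tokens per call")
print(f"the session total is ~{ratio:.0f}x that, which is consistent with the full "
      "chat history being resent on each turn and summed across the session")
```

If that is what is happening, trimming or summarizing older turns before each LLM call is the usual way to keep the per-call prompt under the model's limit.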
w
the actual prompt is no more than 30,000 characters, including the additional JSON that is part of the instructions sent to the agent. Any additional prompt content is the user prompt generated by the STT pipeline. Is there a flush that should be done before starting the agent?
t
I don't have the answer you're looking for. You could also try asking your question:
• in one of the other Slack channels, or
• at https://deepwiki.com/livekit/livekit_composite, which is trained on all LiveKit source code.
If you find the answer, please post it here to help others!