# ask-ai
I’m using the standard LLM integration (not Realtime) with `gpt-4o` via `AgentSession`, but images via `ImageContent` still aren’t reaching OpenAI when they are attached at end of turn. Here’s my setup:
```python
import logging

from livekit.agents import AgentSession
from livekit.agents.llm import ChatContext, ChatMessage, ImageContent
from livekit.plugins import deepgram, openai

logger = logging.getLogger("agent")

# Using AgentSession with standard LLM (not Realtime)
session = AgentSession(
    stt=deepgram.STT(),
    llm=openai.LLM(model="gpt-4o"),  # Standard LLM, not Realtime
    tts=openai.TTS(),
)

# In my Agent subclass:
async def on_user_turn_completed(self, ctx: ChatContext, msg: ChatMessage):
    # msg contains ImageContent items
    logger.info(f"Message has {len([c for c in msg.content if isinstance(c, ImageContent)])} images")
    # Logs show: "Message has 1 images"

    # But OpenAI debug logs show no image_url in the actual API request
```
With `LK_OPENAI_DEBUG=1`, I can see the chat context being sent, but it contains only text content, with no `image_url` fields. The images were never included in the request to OpenAI, even though the logs showed them being attached at end of turn, as intended. I then made only one change, setting `preemptive_generation=False`, and images started making it to OpenAI; the agent was able to describe what the device’s camera was showing during a call. It appears there’s a race when `preemptive_generation` is enabled, where the message with the image attached at end of turn never supersedes whatever was already sent to OpenAI for preemptive generation. Am I doing something wrong in my implementation that causes vision to fail completely when `preemptive_generation` is enabled, or is this a bug somewhere in the LiveKit stack?
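To illustrate the suspected race without any LiveKit dependencies, here is a minimal asyncio sketch (all names are illustrative, not LiveKit APIs): a speculative request snapshots the chat context as soon as the transcript arrives, so content appended at end of turn is present in the final context but never in what was sent to the provider.

```python
import asyncio

async def llm_request(context):
    """Stand-in for the provider call; it only ever sees the snapshot it was given."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return list(context)

async def main():
    chat_context = ["user: what do you see?"]  # transcript arrives first

    # Preemptive generation starts immediately with a copy of the context.
    speculative = asyncio.create_task(llm_request(list(chat_context)))

    # The end-of-turn hook runs afterwards and attaches the image.
    chat_context.append("image: <frame from device camera>")

    sent = await speculative
    print("image in speculative request:", any(m.startswith("image:") for m in sent))
    print("image in final context:      ", any(m.startswith("image:") for m in chat_context))

asyncio.run(main())
# Prints False for the speculative request and True for the final context:
# the image exists in the chat context but was never sent to the provider.
```

This matches the observed symptom: the attachment log fires (the image really is in the context), yet the provider request contains only text.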
Thanks for reaching out, Dan. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
I understand the challenge you're facing with `ImageContent` not reaching OpenAI when `preemptive_generation` is enabled, and I’m here to help. This happens because when `preemptive_generation` is turned on, the system starts generating a response as soon as it receives the user transcript, using whatever chat context is available at that moment. If `ImageContent` is added later (such as at the end of the turn), it won’t be part of that initial speculative request. That said, the preemptive response should be invalidated and replaced with a new one that includes the full, updated chat context, including your `ImageContent`. If that replacement isn’t happening, it may point to a synchronization issue. To make sure your images are included, you could update the chat context with `ImageContent` before speculative generation kicks off, or turn off `preemptive_generation` if syncing everything immediately isn’t possible. Please feel free to reach out if you want me to walk through this setup or explore other options together.
Sources:
- livekit-agents/livekit/agents/voice/speech_handle.py
- Agent speech and audio | LiveKit Docs
- agent/main.py
- complex-agents/vision/agent.py