# ask-ai
r
Is there a way to make GPT5 think for less time?
t
Thanks for reaching out, Trevor. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Could you help me understand a bit better—are you asking about speeding up GPT-5’s response time specifically within a LiveKit app, or is it something else? If it’s about LiveKit, I’d love to know more about how you’ve got things set up!
r
Currently, GPT-5 thinks for too long for a responsive conversation, but I think OpenAI exposes a faster endpoint / possibly a way to limit how long it thinks for
t
Hey there! Just wanted to let you know that LiveKit works great with OpenAI models like GPT-4 and GPT-3.5, though we don’t have any word yet on GPT-5 support. If you’re aiming to speed things up, here are some handy tips:

- **Use the OpenAI Realtime API:** it’s built for low-latency voice responses (around 300 ms).
- **Pick faster models:** models like GPT-3.5 tend to be quicker than the more advanced ones.
- **Streaming optimizations:** process responses incrementally, use smaller audio frame sizes (e.g. 20 ms instead of 100 ms), and encourage the model to keep answers concise.
- **Make the most of your setup:** take advantage of WebRTC for fast audio transfer, global edge network routing, and built-in load balancing.

If you have access to OpenAI’s endpoints, it’s worth checking their docs for parameters that limit response length or speed things up even more.

Sources: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/models.py | Live Conversations with AI using ChatGPT and WebRTC | OpenAI and LiveKit partner to turn Advanced Voice into an API
r
What does `max_completion_tokens` do?
t
The `max_completion_tokens` parameter in the OpenAI API lets you cap how many tokens the model can generate in its response. Tokens are chunks of text, roughly words or parts of words. By tuning this setting, you control how long the model’s replies can be, which helps avoid responses that run way too long. When you’re working with LiveKit, you can set this parameter either when initializing the language model or right in the chat method, so you have the flexibility to keep responses just the right size.
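As an illustration, here is roughly what a Chat Completions request body with `max_completion_tokens` looks like. This is a sketch of the raw JSON you would send with curl or an HTTP client; the model name is an assumption, and actually sending it requires an OpenAI API key.

```python
import json

# Sketch: a Chat Completions request body that caps total output tokens.
# max_completion_tokens counts all generated tokens, including any hidden
# reasoning tokens on reasoning models, not just the visible reply.
body = {
    "model": "gpt-5",  # assumed model name
    "messages": [
        {"role": "user", "content": "Summarize WebRTC in one sentence."}
    ],
    "max_completion_tokens": 256,
}
print(json.dumps(body, indent=2))
```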
r
can I do this just for thinking?
t
Hey there! So, the `max_completion_tokens` parameter sets a cap on the total output, covering both the reasoning (“thinking”) tokens and the visible content. But just a heads-up: it doesn’t limit the thinking portion alone. Some models offer something like a `thinking_budget` parameter to control how long they spend thinking, but that’s pretty model-specific and isn’t something you’ll find directly in the OpenAI API. If you want to limit thinking time specifically, you might need to dig into model-specific settings or try out other approaches. Hope that helps! Sources: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py | Adjusting Model Parameters | LiveKit Docs | tools/server/README.md
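For what it’s worth, OpenAI’s reasoning models do expose a `reasoning_effort` parameter in the Chat Completions API (the Responses API uses a nested `reasoning.effort` field) that trades thinking time for speed; on GPT-5 the accepted values reportedly include `"minimal"`. A minimal sketch of the request body, where the model name and the `"minimal"` value are assumptions:

```python
import json

# Sketch: asking the model to spend less effort reasoning before answering.
# reasoning_effort is an OpenAI parameter for reasoning models; the model
# name and the "minimal" value are assumptions here, not LiveKit options.
body = {
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "What's 2 + 2?"}],
    "reasoning_effort": "minimal",  # e.g. "minimal" | "low" | "medium" | "high"
}
print(json.dumps(body))
```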
r
how do I adjust the params that get sent to openAI? I want to use things like reasoning_effort and verbosity
t
Hey there! Right now, it looks like parameters like `reasoning_effort` and `verbosity` aren’t directly adjustable through OpenAI API requests via LiveKit; they don’t appear as configurable options in the docs. If you’re hoping to tweak how the model behaves, there might be other parameters or settings we can explore together. Just let me know if you want a hand with that or if you have any questions. I’m here to help!
r
this is part of the body that would get sent to openai if I made the curl requests myself, can I not adjust these myself?
t
Hey there! Just a quick heads-up: the documentation doesn’t clearly say whether you can tweak parameters like `reasoning_effort` and `verbosity` directly within LiveKit’s OpenAI integration. If you’re sending API requests straight to OpenAI, you might be able to include those parameters in the request body, but this isn’t something that’s officially covered in LiveKit’s docs. If you’re hoping to customize the OpenAI API request body while using LiveKit, just give me a shout! I’d be happy to help out based on what you’re working with or what you need.
r
> If you’re hoping to customize the OpenAI API request body while using LiveKit, just give me a shout! I’d be happy to help out based on what you’re working with or what you need.

This is what I want to do.
t
Hey there! Just a quick heads-up: the docs don’t specifically mention that you can tweak parameters like `reasoning_effort` and `verbosity` directly within LiveKit’s OpenAI integration. If you’re sending API requests straight to OpenAI, you might be able to add these parameters to the request body, but this isn’t clearly outlined as a LiveKit feature. If you’re hoping to customize the OpenAI API request body inside LiveKit, just give me a shout! I’d be happy to help out based on what you’re working with or what you need. Sources: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/real...
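Since the transcript never lands on a concrete answer: if you are building the request body yourself (curl or an HTTP client), nothing stops you from adding these fields to the JSON. A minimal sketch, assuming `reasoning_effort` and `verbosity` are accepted by the model you are targeting; these field names are assumptions about the OpenAI endpoint, not options exposed by LiveKit’s plugin:

```python
import json

# Sketch: merging model-specific extras into the same JSON body you would
# otherwise send with curl to the Chat Completions endpoint. The extra
# field names are assumptions about the target model, not LiveKit options.
base = {
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "Hello"}],
}
extras = {"reasoning_effort": "low", "verbosity": "low"}
payload = {**base, **extras}
print(json.dumps(payload, indent=2))
```

If you are using the official OpenAI Python SDK rather than raw HTTP, its `create()` methods also accept an `extra_body=` keyword for passing fields the SDK does not model explicitly.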