User:
hi
Assistant:
Thanks for reaching out, Pavan. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Hey there! What can I help you with today?
User:
Traceback (most recent call last):
  File "C:\Codeshastra_Projects\multilingual-health-assistant-backend\main.py", line 97, in <module>
    azure_llm = LLM.with_azure(
                ^^^^^^^^^^^^^^^
TypeError: LLM.with_azure() got an unexpected keyword argument 'max_tokens'

(venv) C:\Codeshastra_Projects\multilingual-health-assistant-backend>
Assistant:
Hey there! The error you're seeing happens because the LLM.with_azure() method doesn't recognize a parameter called max_tokens. No worries though, you'll want to use max_response_output_tokens instead to set the maximum number of tokens in the response. Here's a quick fix for your code:
from livekit.plugins.openai import LLM
azure_llm = LLM.with_azure(
    model="gpt-4o",
    temperature=0.8,
    max_response_output_tokens=1000  # Use this instead of max_tokens
)
Also, don't forget to include other important parameters like azure_deployment, azure_endpoint, api_version, and your authentication details (api_key or entra_token) if you need them.

Sources: LLM | LiveKit Agents | livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/stt.py | Azure integration guide | LiveKit Docs | Azure OpenAI LLM integration guide | LiveKit Docs
User:
azure_llm = LLM.with_azure(
            ^^^^^^^^^^^^^^^
TypeError: LLM.with_azure() got an unexpected keyword argument 'max_response_output_tokens'
Assistant:
Hey there! Just a quick heads-up: the LLM.with_azure() method doesn't support the parameter max_response_output_tokens. No worries though, you can use max_completion_tokens instead to control how long the response is when you're working with the standard LLM class. Here's a little tweak to your code:
from livekit.plugins.openai import LLM
azure_llm = LLM.with_azure(
    model="gpt-4o",
    temperature=0.8,
    max_completion_tokens=1000  # Use this instead of max_response_output_tokens
)
And if you're dealing with real-time audio and really need max_response_output_tokens, then the way to go is using RealtimeModel.with_azure() instead of LLM.with_azure(). Hope that helps!

Sources: LiveKit Docs | plugins/openai/src/llm.ts | livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py
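If you do end up needing the realtime path, a minimal sketch under the assumptions above might look like this. It assumes the realtime model is exposed as livekit.plugins.openai.realtime.RealtimeModel and that with_azure() accepts max_response_output_tokens; verify both against the plugin version you have installed, since the deployment name, endpoint, and API version shown here are placeholders.

import os
from livekit.plugins.openai import realtime

# Sketch only: assumes RealtimeModel.with_azure() exists with these keyword names
# and forwards max_response_output_tokens to the realtime session.
realtime_model = realtime.RealtimeModel.with_azure(
    azure_deployment="gpt-4o-realtime-preview",           # your realtime deployment name
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-01-preview",                      # placeholder, use your API version
    temperature=0.8,
    max_response_output_tokens=1000,
)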