# ask-ai
f
How can I use Azure for TTS?
t
Thanks for reaching out, Baymurat. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
To use Azure for Text-To-Speech (TTS) with LiveKit, you need to:
• Install the `livekit-agents` package with Azure support: `pip install "livekit-agents[azure]~=1.0"`
• Set up authentication by configuring environment variables in your `.env` file or pass credentials when creating the TTS instance: `AZURE_SPEECH_KEY`, `AZURE_SPEECH_REGION`, and optionally `AZURE_SPEECH_HOST` (a sketch of the environment-variable route follows the example below).
• Create an Azure TTS instance and provide it to your `AgentSession`, for example:
```python
from livekit.agents import AgentSession
from livekit.plugins import azure

session = AgentSession(
    tts=azure.TTS(
        speech_key="<speech_service_key>",
        speech_region="<speech_service_region>",
        voice="en-US-JennyNeural",  # optional
        language="en-US",  # optional
        prosody=azure.ProsodyConfig(rate=1.0, volume=1.0, pitch="medium"),  # optional
    )
)
```
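If you'd rather keep credentials out of your source, here's a minimal sketch that reads the same environment variables listed above; the variable names come from the integration guide, the rest is illustrative:

```python
import os

from livekit.plugins import azure

# Pull the Azure Speech credentials from the environment instead of hard-coding them
tts = azure.TTS(
    speech_key=os.environ["AZURE_SPEECH_KEY"],
    speech_region=os.environ["AZURE_SPEECH_REGION"],
)
```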
This allows you to customize voice, language, and prosody settings as needed. More detailed instructions and resources are available in the Azure Speech TTS integration guide. Sources: Azure Speech TTS integration guide | LiveKit Docs | tests/test_tts.py
f
which model should I use here? llm=openai.LLM(api_key=OPENAI_API_KEY, model="gpt-4.1"), is 4.1 valid?
t
Hey there! I noticed you're trying to use the model name `gpt-4.1`, but it looks like that's not quite right for OpenAI LLM usage in LiveKit. No worries, though! Here are the valid model names you can choose from:
• `gpt-4o`
• `gpt-4o-2024-05-13`
• `gpt-4o-mini`
• `gpt-4o-mini-2024-07-18`
• `gpt-4-turbo`
• `gpt-4-turbo-2024-04-09`
When you're setting up your `LLM` instance, just pick one of these official model names. Here's a quick example:
```python
from livekit.plugins import openai

llm = openai.LLM(api_key=OPENAI_API_KEY, model="gpt-4o")
```
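If it helps, here's a rough sketch of wiring that model into the same `AgentSession` pattern from the Azure answer above; reading the key from the environment is an assumption here, so keep your existing `OPENAI_API_KEY` variable if you prefer:

```python
import os

from livekit.agents import AgentSession
from livekit.plugins import openai

# Hand the configured LLM to the session, mirroring the TTS example earlier
session = AgentSession(
    llm=openai.LLM(api_key=os.environ["OPENAI_API_KEY"], model="gpt-4o"),
)
```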
Pro tip: Always double-check the supported models and their docs to make sure you're using the right one for your specific needs. I'm here if you need any more help! Sources: plugins/openai/src/llm.ts | livekit-plugins/livekit-plugins-groq/livekit/plugins/groq/models.py
f
what about using fil language in 11 labs?
t
Hey there! I totally get your curiosity about language support for ElevenLabs TTS with LiveKit. The current integration works with multiple languages using ISO-639-1 codes, but it looks like Filipino (code `fil`) might be a bit of a question mark. Here's a friendly suggestion: go ahead and try specifying the `fil` language code in your ElevenLabs TTS settings, along the lines of the sketch below.
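Here's a minimal sketch of that attempt; the `language` parameter comes from the `TTSOptions` source cited below, while the `eleven_turbo_v2_5` model name is an assumption, so verify both against your installed plugin version:

```python
from livekit.plugins import elevenlabs

# Attempt Filipino output; "fil" is not guaranteed to be supported by ElevenLabs
tts = elevenlabs.TTS(
    model="eleven_turbo_v2_5",  # assumed: a model that accepts a language hint
    language="fil",
)
```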
If it doesn't play nice, no worries! You can always reach out to ElevenLabs' support team or check their official API docs for the most up-to-date info on language support. We'll help you figure this out! Sources: ElevenLabs TTS integration guide | LiveKit Docs | livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.p... | TTSOptions | LiveKit Agents
f
can I switch auto detect language for eleven labs?
t
Hey there! I totally understand your frustration with the current LiveKit ElevenLabs TTS setup. Right now, the integration doesn't have built-in auto language detection - which I know can be a bit of a headache. What you can do is manually set the language by using the `language` parameter when you're configuring the ElevenLabs TTS plugin. It's not an automatic process, but it does give you some control. If you're really hoping for auto language detection, you'll need to create a separate language detection system that can identify the language before sending the text to the TTS service - there's a sketch of that idea after the sources below. I know it sounds like an extra step, but sometimes we have to get a bit creative to make things work smoothly! Sources: Groq integration guide | LiveKit Docs | livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/stt.py | TTSOptions | LiveKit Agents | plugins/elevenlabs/CHANGELOG.md | pipeline-tts/elevenlabs_change_language.py | ElevenLabs TTS integration guide | LiveKit Docs | ElevenLabs integration guide | LiveKit Docs
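Here's a rough sketch of that idea; it assumes the `update_options` method from the `elevenlabs_change_language.py` example cited above and uses the third-party `langdetect` package (`pip install langdetect`), so treat it as a starting point rather than a built-in feature:

```python
from langdetect import detect  # third-party detector, not part of LiveKit
from livekit.plugins import elevenlabs

tts = elevenlabs.TTS(language="en")  # start from a default language

def retarget_tts(text: str) -> None:
    """Guess the text's language and re-point the TTS before synthesis."""
    code = detect(text)  # returns an ISO-639-1 code, e.g. "en" or "tl"
    tts.update_options(language=code)
```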
f
is this correct implementation? stt=google.STT(credentials_info = dict(google_creds), languages=["fil-PH"]),
t
Hey there! I see you're working on setting up Google Speech-to-Text in LiveKit for Filipino language support. Your implementation looks spot on! 👍
```python
stt=google.STT(credentials_info=dict(google_creds), languages=["fil-PH"])
```
Let me break down what's great about your code:
• You've nailed the `credentials_info` by using a dictionary for your Google Cloud credentials - nice job!
• The `languages` parameter is perfectly configured with `fil-PH` for Filipino. That's exactly how you do it! 🇵🇭
Just a friendly reminder: double-check that your credentials are set up correctly and consider any additional parameters like `model` or `spoken_punctuation` that might be helpful for your specific project - there's a sketch with those knobs after the sources below. If you want to dive deeper, the LiveKit docs on Google STT integration are super helpful. Feel free to reach out if you need any more guidance! 😊 Sources: Google Cloud STT integration guide | LiveKit Docs | Google integration guide | LiveKit Docs
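Here's a sketch with those optional knobs filled in; the parameter names follow the livekit-plugins-google STT plugin, but `latest_long` is only an assumed model name, so check the integration guide for what your setup supports:

```python
from livekit.plugins import google

stt = google.STT(
    credentials_info=dict(google_creds),  # your existing credentials dict
    languages=["fil-PH"],
    model="latest_long",  # assumed model name; pick one available for fil-PH
    spoken_punctuation=False,
)
```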
f
but I am having this error:
```
2025-07-09 15:19:45,114 - ERROR asyncio - Exception in callback PollerCompletionQueue._handle_events(<_UnixSelecto...e debug=False>)()
handle: <Handle PollerCompletionQueue._handle_events(<_UnixSelecto...e debug=False>)()>
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.12/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "src/python/grpcio/grpc/_cython/_cygrpc/aio/completion_queue.pyx.pxi", line 147, in grpc._cython.cygrpc.PollerCompletionQueue._handle_events
BlockingIOError: [Errno 35] Resource temporarily unavailable
2025-07-09 15:19:45,115 ERROR livekit.agents: AgentSession is closing due to unrecoverable error
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.12/site-packages/google/api_core/grpc_helpers_async.py", line 106, in _wrapped_aiter
    async for response in self._call:  # pragma: no branch
  File "/opt/anaconda3/lib/python3.12/site-packages/grpc/aio/_call.py", line 365, in _fetch_stream_responses
    await self._raise_for_status()
  File "/opt/anaconda3/lib/python3.12/site-packages/grpc/aio/_call.py", line 272, in _raise_for_status
    raise _create_rpc_error(
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
    details = "Config contains unsupported fields. See error_details_ext extension for details."
    debug_error_string = "UNKNOWN:Error received from peer ipv4:216.58.211.10:443 {grpc_message:"Config contains unsupported fields. See error_details_ext extension for details.", grpc_status:3}"
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.12/site-packages/livekit/plugins/google/stt.py", line 518, in _run
    task.result()
  File "/opt/anaconda3/lib/python3.12/site-packages/livekit/plugins/google/stt.py", line 419, in process_stream
    async for resp in stream:
  File "/opt/anaconda3/lib/python3.12/site-packages/google/api_core/grpc_helpers_async.py", line 109, in _wrapped_aiter
    raise exceptions.from_grpc_error(rpc_error) from rpc_error
google.api_core.exceptions.InvalidArgument: 400 Config contains unsupported fields. See error_details_ext extension for details. [field_violations { field: "config.features.enable_automatic_punctuation" description: "Recognizer does not support feature: automatic_punctuation" }]
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.12/site-packages/livekit/agents/stt/stt.py", line 251, in _main_task
    return await self._run()
           ^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.12/site-packages/livekit/plugins/google/stt.py", line 539, in _run
    raise APIStatusError(
livekit.agents._exce
```
t
Hey there! I see you're running into a bit of a tricky situation with your Google Speech-to-Text setup. No worries, we'll work through this together! 😊 The `enable_automatic_punctuation` feature is causing a little hiccup with your current recognizer model. Let me break down what's happening - the system is saying:
```
Recognizer does not support feature: automatic_punctuation
```
Here's a friendly fix: you'll want to either turn off the automatic punctuation or pick a model that plays nicely with this feature. It's an easy tweak! Here's a quick example of how you can adjust your code:
```python
from livekit.plugins import google

google_stt = google.STT(
    credentials_info=dict(google_creds),
    languages=["fil-PH"],
    punctuate=False,  # disable the feature that maps to enable_automatic_punctuation
    # alternatively, pick a recognizer model that supports automatic punctuation
)
```
Still feeling stuck? No problem! Just share your current Google STT code, and I'll be happy to help you troubleshoot. We've got this! 💪
f
can I disable auto detection here? stt=azure.STT(speech_key=AZURE_SPEECH_KEY, speech_region=AZURE_SPEECH_REGION, language="fil-PH"),
t
Hey there! You're absolutely right. By specifying the `language` parameter when setting up your Azure STT instance, you're telling the service exactly which language to use, which effectively turns off automatic language detection. Let me break down your example:
```python
stt=azure.STT(speech_key=AZURE_SPEECH_KEY, speech_region=AZURE_SPEECH_REGION, language="fil-PH")
```
This looks perfect! You're explicitly setting the language to Filipino (Philippines), so the service won't try to guess or switch languages on its own. Pro tip: This is actually the recommended approach for disabling auto language detection with Azure STT in LiveKit. Nice work! Sources: Azure Speech STT integration guide | LiveKit Docs | livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/stt.py