# ask-ai
m
OpenAI's `gpt-4o-mini-transcribe` is detecting non-English words even though the config specifies the language:
```python
stt=openai.STT(
    api_key=config.stt.openai_api_key,  # type: ignore
    model=config.stt.openai_model,
    language="en",
    use_realtime=True,
)
```
t
Thanks for reaching out, Anas. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Thank you for your question! The `gpt-4o-mini-transcribe` model might pick up non-English words even when the language is set to "en" because the `language` parameter acts more as a helpful hint than a strict rule. Since the model is naturally multilingual, it's designed to recognize a variety of speech patterns, including situations where people switch between or mix languages. If you're aiming for transcription strictly in English, here are a few options you might find useful:
• Try pre-processing your audio to filter out or reduce any non-English speech.
• After transcription, use language identification tools to spot and either remove or flag any non-English text.
• If possible, use contextual prompts to encourage English-only input (though, just a heads-up, this isn't officially supported for `gpt-4o-mini-transcribe`).
Please feel free to reach out if you'd like help with any of these steps. I'm here to support you!
Sources: OpenAI STT integration guide | LiveKit Docs | examples/voice_agents/structured_output.py
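For the post-transcription filtering idea above, here is a minimal sketch. It assumes the third-party `langdetect` package (`pip install langdetect`), which is not part of LiveKit or OpenAI, and the helper name `keep_english_only` is hypothetical:

```python
# Minimal sketch of post-transcription language filtering.
# Assumes the third-party `langdetect` package (pip install langdetect);
# this is not part of the LiveKit SDK or the OpenAI API.
from langdetect import detect
from langdetect.lang_detect_exception import LangDetectException


def keep_english_only(transcript: str) -> str:
    """Drop sentences that do not classify as English (hypothetical helper)."""
    english_sentences = []
    for sentence in transcript.split("."):
        sentence = sentence.strip()
        if not sentence:
            continue
        try:
            if detect(sentence) == "en":
                english_sentences.append(sentence)
        except LangDetectException:
            # Too little text to classify reliably; keep it rather than drop content.
            english_sentences.append(sentence)
    return ". ".join(english_sentences)
```

For example, `keep_english_only("Hello there. Bonjour tout le monde.")` would keep only the first sentence. Detection on very short strings can be noisy, so treat this as a heuristic rather than a guarantee.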
m
Is there any other OpenAI model that enforces English-only rules? Right now it's detecting non-English even though I'm only speaking English, so it's a matter of inaccuracy.
t
I'm glad you asked about OpenAI's models like `whisper-1` and `gpt-4o-transcribe`! You can set the `language` parameter to "en" to help guide the transcription towards English. That said, because these models are trained on multiple languages, they might still pick up some non-English words occasionally. If you're noticing inaccuracies with English input, it might be due to audio quality or other factors. Here are a few friendly suggestions you could try:
• Give the `whisper-1` model a shot; it's designed specifically for speech-to-text tasks.
• Consider improving the audio clarity beforehand, which can make a big difference.
• Look into post-processing the transcription to filter or flag any non-English parts.
Feel free to reach out if you want a hand with any of these steps or if you'd like me to explain anything further. I'm here to help!
Sources: OpenAI integration guide | LiveKit Docs | OpenAI STT integration guide | LiveKit Docs | livekit-plugins/livekit-plugins-fal/livekit/plugins/fal/stt.py
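To try the first suggestion, here is a minimal sketch that swaps the model in the original snippet. It reuses the same `livekit.plugins.openai.STT` keyword arguments shown in the question (`config` is the same settings object from that snippet); whether `use_realtime` works with `whisper-1` depends on your plugin version, so it is omitted here:

```python
from livekit.plugins import openai

# Sketch only: same constructor as the snippet above, with the model
# swapped to whisper-1. language="en" is still a hint, not a hard
# English-only guarantee.
stt = openai.STT(
    api_key=config.stt.openai_api_key,  # `config` as in the original snippet
    model="whisper-1",
    language="en",
)
```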