refined-van-59846
08/14/2025, 8:30 AM
We're on LiveKit Agents v1.2.5 and are running into a consistent TimeoutError during worker startup, but only when we try to load the MultilingualModel for turn detection.
The Problem:
When we start our agent worker (python main.py dev), the main process crashes with a TimeoutError during inference_executor.initialize(). The logs show that the lk_end_of_utterance_multilingual inference runner is being initialized, but the main worker times out and kills the process after about 8-10 seconds.
Here is the key part of the traceback:
ERROR - worker failed
Traceback (most recent call last):
...
File ".../livekit/agents/worker.py", line 402, in run
await self._inference_executor.initialize()
File ".../livekit/agents/ipc/supervised_proc.py", line 169, in initialize
init_res = await asyncio.wait_for(...)
File ".../asyncio/tasks.py", line 506, in wait_for
async with timeouts.timeout(timeout):
File ".../asyncio/timeouts.py", line 116, in __aexit__
raise TimeoutError from exc_val
TimeoutError
What We've Tried:
1. Loading in entrypoint: If we initialize MultilingualModel() inside the entrypoint function, the worker starts up correctly, but then we get a RuntimeError: no job context found when the model tries to access the inference executor.
2. Loading in prewarm: If we initialize MultilingualModel() inside the prewarm function, we get the TimeoutError described above.
The silero.VAD.load() model, when placed in prewarm, loads successfully without a timeout. This suggests the issue is specific to the size or initialization requirements of the MultilingualModel.
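Roughly, our setup looks like this (a simplified sketch; the actual agent, stt/llm/tts, and error handling are trimmed out):

from livekit import agents
from livekit.agents import AgentSession, JobContext, JobProcess, WorkerOptions
from livekit.plugins import silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel


def prewarm(proc: JobProcess):
    # silero loads here without any problem; adding MultilingualModel() is what triggers the timeout
    proc.userdata["vad"] = silero.VAD.load()
    proc.userdata["turn_detector"] = MultilingualModel()


async def entrypoint(ctx: JobContext):
    session = AgentSession(
        vad=ctx.proc.userdata["vad"],
        turn_detection=ctx.proc.userdata["turn_detector"],
        # stt / llm / tts are omitted from this sketch
    )
    ...  # session.start(...) and the rest of our agent happen here


if __name__ == "__main__":
    agents.cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint, prewarm_fnc=prewarm))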
Our Question:
What is the recommended architectural pattern for loading a slow, heavy, context-aware model like the MultilingualModel? Is there a way to extend the worker's startup timeout, or is there a different approach we should be taking to ensure it initializes reliably without crashing the worker?
We are running on Windows 11 with Python 3.13.
Thanks for any guidance you can provide!
tall-airline-16756
08/14/2025, 8:30 AM
tall-airline-16756
08/14/2025, 8:31 AM
The TimeoutError you’re seeing during worker startup when initializing the MultilingualModel usually happens because the default inference startup timeout isn’t quite long enough for the model to get set up. No worries though, here’s how you can fix it:
1. Increase the Timeout Value: Judging from your traceback, the timeout that fires is the worker's startup timeout (the asyncio.wait_for inside supervised_proc.initialize), not something you pass to the model itself, so the model construction can stay exactly as it is and the extra time needs to come from the worker configuration (see the sketch after this list):
from livekit.plugins.turn_detector.multilingual import MultilingualModel
# construct the model as usual; the startup timeout is enforced by the worker, not by this call
model = MultilingualModel()
2. Double-Check Your Configuration: Make sure the remote inference URL is reachable and working properly. If you’re using remote inference, confirm that LIVEKIT_REMOTE_EOT_URL is set up correctly.
3. Check the STT Plugin: The MultilingualModel depends on the STT plugin for language detection, so it’s important that the plugin is configured with the right language settings.
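To make the first point concrete, here's a minimal sketch of raising the worker-level startup timeout. This assumes the livekit-agents version you have installed exposes an initialize_process_timeout field on WorkerOptions, so please check the WorkerOptions signature in your install before relying on the exact field name; prewarm and entrypoint below stand in for the functions you already have:

from livekit import agents
from livekit.agents import JobContext, JobProcess, WorkerOptions


def prewarm(proc: JobProcess):
    ...  # your existing prewarm that constructs MultilingualModel()


async def entrypoint(ctx: JobContext):
    ...  # your existing entrypoint


# Assumption: WorkerOptions accepts initialize_process_timeout (the deadline for
# prewarm / process initialization, roughly 10s by default, which lines up with
# the 8-10 second crash you described). Raising it gives the turn-detector
# weights time to load on first start.
agents.cli.run_app(
    WorkerOptions(
        entrypoint_fnc=entrypoint,
        prewarm_fnc=prewarm,
        initialize_process_timeout=60.0,
    )
)

On the STT side there isn't a single snippet to copy, since the language option names differ per provider; the main thing is that the STT you hand to your session runs in a multilingual (or matching-language) mode so the turn detector gets usable language hints.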
Once you increase the timeout and verify these settings, things should run smoothly. If you hit any snags or want to chat more about it, just let me know—I’m here to help!
Sources: LiveKit turn detector plugin | LiveKit Docs | livekit-plugins/livekit-plugins-spitch/livekit/plugins/spitch/stt.py