I am using Deepgram for STT. For shorter sentences...
# ask-ai
I am using Deepgram for STT. For shorter sentences (1-2 words), is_final is sent after ~2 seconds. I have set `endpointing_ms=25`.
Thanks for reaching out, Deepak. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Even with `endpointing_ms=25`, the delay in receiving `is_final` for 1-2 word utterances is affected by several factors beyond the Deepgram endpointing threshold:
• Audio chunk size: LiveKit processes audio in chunks (typically 50ms or 100ms), so endpointing values smaller than these may not noticeably reduce delay.
• Turn detection: LiveKit uses its own turn detection system, which can apply added delays after endpointing is detected (e.g., a default minimum delay of 0.5s), so `is_final` may not emit immediately even after Deepgram signals an endpoint.
• Processing logic: State management, alternative result handling, and whether `no_delay` is set when using `smart_format` can influence timing.

For short utterances, these mechanisms may introduce further delay since the system waits to ensure that the phrase is complete.

To minimize delays for short utterances:
• Enable `no_delay: true` if using `smart_format`.
• Consider reducing LiveKit's `min_endpointing_delay` to match your application's latency needs.
• Ensure your `endpointing_ms` setting is not significantly below your audio chunk size.

These adjustments help, but some delay is expected to avoid false endpointing for very brief utterances. Further tuning might be necessary for your specific use case.

Sources: livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/stt.py | plugins/deepgram/src/stt.ts | How does end-of-utterance detection work in conversations?