chilly-nest-87883
07/27/2025, 8:15 PM

tall-airline-16756
07/27/2025, 8:15 PM
You can enable interim transcripts from Deepgram's STT by setting the interimResults parameter.
Here’s how you can configure it:
• JavaScript:
const deepgramSTT = new STT({
  interimResults: true, // Enable interim transcripts (default: true)
  // other options...
});
• Python:
deepgram_stt = deepgram.STT(
    interim_results=True,  # Enable interim transcripts (default: True)
    # other options...
)
LiveKit processes Deepgram's responses and emits two types of events:
• INTERIM_TRANSCRIPT events for preliminary results
• FINAL_TRANSCRIPT events for confident transcriptions
You can handle these events in your application as follows:
async for event in stt_stream:
    if event.type == SpeechEventType.INTERIM_TRANSCRIPT:
        # Handle interim transcript
        interim_text = event.alternatives[0].text
    elif event.type == SpeechEventType.FINAL_TRANSCRIPT:
        # Handle final transcript
        final_text = event.alternatives[0].text
This setup allows you to access real-time transcription data effectively.
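To make the interim/final distinction concrete, here's a self-contained sketch (plain Python, with made-up stand-ins for LiveKit's SpeechEvent; not the actual library types) showing how interim text is typically replaced as recognition is refined, then committed when a final transcript arrives:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the real event types, for illustration only
INTERIM_TRANSCRIPT = "interim"
FINAL_TRANSCRIPT = "final"

@dataclass
class SpeechEvent:
    type: str
    text: str

def collect_transcripts(events):
    """Track the latest interim text; commit it when a final event arrives."""
    current_interim = ""
    finals = []
    for event in events:
        if event.type == INTERIM_TRANSCRIPT:
            # Each interim result replaces the previous one
            current_interim = event.text
        elif event.type == FINAL_TRANSCRIPT:
            finals.append(event.text)
            current_interim = ""
    return finals, current_interim

events = [
    SpeechEvent(INTERIM_TRANSCRIPT, "hello"),
    SpeechEvent(INTERIM_TRANSCRIPT, "hello wor"),
    SpeechEvent(FINAL_TRANSCRIPT, "hello world"),
    SpeechEvent(INTERIM_TRANSCRIPT, "how are"),
]
finals, pending = collect_transcripts(events)
print(finals)   # ['hello world']
print(pending)  # 'how are'
```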
Sources: agents/src/stt/stt.ts | livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/stt.py | Working with plugins | LiveKit Docs | Deepgram integration guide | LiveKit Docs | SpeechEventType | LiveKit Agents

chilly-nest-87883
07/27/2025, 8:17 PM

tall-airline-16756
07/27/2025, 8:18 PM
1. Enable Interim Results
from livekit.plugins.deepgram import stt

deepgram_stt = stt.STT(
    model="nova-2-general",
    interim_results=True,  # Enable preliminary results
    # other parameters...
)
2. Interruption Logic
The cool part is that LiveKit already has a built-in way to handle interruptions. It's like having a smart assistant that knows exactly when to pause and listen:
#interruptIfPossible() {
  if (
    !this.#playingSpeech ||
    !this.#playingSpeech.allowInterruptions ||
    this.#playingSpeech.interrupted
  ) {
    return;
  }

  if (this.#opts.interruptMinWords !== 0) {
    const interimWords = this.#opts.transcription.wordTokenizer.tokenize(
      this.#transcribedInterimText,
    );
    if (interimWords.length < this.#opts.interruptMinWords) {
      return;
    }
  }

  this.#playingSpeech.interrupt();
}
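If you're more at home in Python, the same gating logic can be sketched roughly like this. This is a simplified illustration, not LiveKit's API: the class and method names are made up, and str.split() stands in for the real word tokenizer.

```python
class InterruptionGate:
    """Decide whether interim speech should interrupt the agent's playback."""

    def __init__(self, interrupt_min_words: int):
        self.interrupt_min_words = interrupt_min_words

    def should_interrupt(self, playing: bool, allow_interruptions: bool,
                         already_interrupted: bool, interim_text: str) -> bool:
        # Never interrupt if nothing is playing, interruptions are disabled,
        # or the current speech was already interrupted
        if not playing or not allow_interruptions or already_interrupted:
            return False
        # Require a minimum number of transcribed words before interrupting
        if self.interrupt_min_words != 0:
            words = interim_text.split()  # crude stand-in for a word tokenizer
            if len(words) < self.interrupt_min_words:
                return False
        return True

gate = InterruptionGate(interrupt_min_words=3)
print(gate.should_interrupt(True, True, False, "hey"))             # False: too few words
print(gate.should_interrupt(True, True, False, "hey stop that"))   # True
print(gate.should_interrupt(False, True, False, "hey stop that"))  # False: nothing playing
```

Setting interrupt_min_words to 0 disables the word-count check entirely, mirroring the !== 0 guard in the TypeScript above.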
3. Custom Handling
Want to get a bit more personalized? You can totally create your own custom logic for handling those interim transcripts. It's like having a tailored conversation flow:
@session.on("user_input_transcribed")
def on_transcript(transcript):
    nonlocal transcript_buffer
    if transcript.is_final:
        logger.info(f"Received final transcript: {transcript.transcript}")
        return

    transcript_buffer += " " + transcript.transcript
    transcript_buffer = transcript_buffer.strip()

    # Check conditions for interruption based on interim results
    sentence_count = count_sentences(transcript_buffer)
    if sentence_count >= max_sentences:
        logger.info("Interrupting user...")
        # Trigger interruption logic
        asyncio.create_task(handle_interruption(interruption_ctx))
        transcript_buffer = ""
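Note that count_sentences isn't defined in the snippet above - it's a helper you'd supply yourself. A minimal placeholder (naive punctuation counting, just for illustration) could be:

```python
import re

def count_sentences(text: str) -> int:
    """Naive sentence count: runs of text ending in '.', '!' or '?'."""
    return len(re.findall(r"[^.!?]+[.!?]", text))

print(count_sentences("Hello there. How are you? Fine"))  # 2: trailing fragment not counted
```

A production version would want something smarter (abbreviations, ellipses, numbers like "3.14" all trip up this regex), but it's enough to drive the buffer logic above.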
4. Key Configuration Options
• interruptMinWords: This is like your conversation sensitivity setting - how many words before you're ready to jump in
• allowInterruptions: Your green light for when it's okay to pause the agent's speech
By putting all these pieces together, you'll create a super responsive system that can smoothly interrupt agent speech the moment something meaningful is detected. Pretty neat, right? 😊
Sources: Deepgram integration guide | LiveKit Docs | plugins/deepgram/src/stt.ts | pipeline-tts/short_replies_only.py | livekit-agents/livekit/agents/voice/agent_activity.py

chilly-nest-87883
07/27/2025, 8:28 PM

tall-airline-16756
07/27/2025, 8:31 PM
1. Context Preservation During Handoffs
• When a handoff runs through the run_state._agent_handoff() method, all the important conversation context gets transferred, so nothing gets lost in translation.
• The new agent picks right up where the last one left off, keeping the conversation flowing naturally.
2. Interim Transcript Processing During Handoffs
• Even during agent switches, the conversation keeps rolling - interim transcripts keep getting processed through smart event handlers.
• Cool features like on_interim_transcript()
and on_final_transcript()
make sure every word and nuance is captured, no matter who's handling the conversation.
3. Interruption Handling During Transitions
• The system is super smart about managing interruptions, checking things like word count and speech duration during agent handoffs.
• If an interruption happens mid-transition, no worries - partial transcripts are safely preserved in the chat context.
4. Real-time State Management
• Behind the scenes, LiveKit uses activity locks and scheduling magic to ensure super smooth, thread-safe handoffs.
• Any blocked tasks get cleaned up properly, so the transition is always clean and professional.
Bottom line? LiveKit makes multi-agent conversations feel totally natural and uninterrupted. It's like having a team of conversation experts who can tag in and out without anyone noticing a thing! 🌟
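To give a feel for the lock-guarded pattern described above, here's a toy asyncio sketch. This is an illustration of the general idea, not LiveKit's actual implementation - all names are made up. The lock ensures transcript handling never observes a half-finished agent swap:

```python
import asyncio

class Session:
    """Toy session: one lock serializes handoffs and transcript handling."""

    def __init__(self, agent: str):
        self.agent = agent
        self.log = []
        self._lock = asyncio.Lock()

    async def handoff(self, new_agent: str):
        async with self._lock:  # block transcript handling during the swap
            self.log.append(f"handoff: {self.agent} -> {new_agent}")
            self.agent = new_agent

    async def on_transcript(self, text: str):
        async with self._lock:  # transcripts always see a consistent agent
            self.log.append(f"{self.agent} got: {text}")

async def main():
    session = Session("agent_a")
    # Transcripts and a handoff racing; the lock keeps each step atomic
    await asyncio.gather(
        session.on_transcript("hello"),
        session.handoff("agent_b"),
        session.on_transcript("world"),
    )
    return session.log

log = asyncio.run(main())
print(log)
```

Every log entry is attributed to exactly one agent, and the handoff appears as a single atomic step - the property the real scheduling machinery provides at much larger scale.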
Sources: livekit-agents/livekit/agents/voice/run_result.py | agents/src/pipeline/pipeline_agent.ts