rough-pizza-5956
07/14/2025, 7:23 AM
`UserInputTranscribedEvent` is emitted when user transcription is available.
Properties
• `transcript`: str
• `is_final`: bool
• `speaker_id`: str | None - Only available if speaker diarization is supported in your STT plugin.
Is there any way to track the transcript turn-wise?
tall-airline-16756
07/14/2025, 7:23 AM
tall-airline-16756
07/14/2025, 7:24 AM
`_TurnTracker` class and several turn detection mechanisms, such as VAD (Voice Activity Detection), STT endpointing, and model-based turn detection.
• Listen to `UserInputTranscribedEvent` events. Use `transcript` and `is_final` to accumulate each user's speech into distinct turns.
• LiveKit internally manages turn boundaries and accumulates transcripts per turn, emitting final transcripts when a turn is completed.
• If your STT provider supports speaker diarization, the `speaker_id` property allows attributing transcripts to individual speakers for multi-user conversations.
Turn detection mode is configurable and supports automated, VAD-based, and manual approaches, providing control over how turns are segmented and tracked.
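As a rough illustration of the accumulation described above, here is a minimal sketch that does not depend on LiveKit. `TranscriptEvent`, `Turn`, `TurnLog`, and `close_turn` are hypothetical names made up for this example; `TranscriptEvent` only mirrors the three properties listed for `UserInputTranscribedEvent`. The grouping rule (consecutive final segments from the same speaker form one turn) is an assumption for illustration, not LiveKit's own turn detection.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TranscriptEvent:
    """Hypothetical stand-in with the same three properties as the real event."""
    transcript: str
    is_final: bool
    speaker_id: Optional[str] = None


@dataclass
class Turn:
    speaker_id: Optional[str]
    segments: list[str] = field(default_factory=list)

    @property
    def text(self) -> str:
        return " ".join(self.segments)


class TurnLog:
    """Groups final transcript segments into turns (a heuristic sketch)."""

    def __init__(self) -> None:
        self.turns: list[Turn] = []
        self.interim: str = ""  # latest non-final text
        self._open: bool = False  # whether the last turn can still grow

    def add(self, event: TranscriptEvent) -> None:
        if not event.is_final:
            # Interim results supersede each other, so keep only the newest.
            self.interim = event.transcript
            return
        self.interim = ""
        text = event.transcript.strip()
        if not text:
            return
        # Assumption: consecutive finals from one speaker belong to one turn.
        same_speaker = bool(self.turns) and self.turns[-1].speaker_id == event.speaker_id
        if self._open and same_speaker:
            self.turns[-1].segments.append(text)
        else:
            self.turns.append(Turn(event.speaker_id, [text]))
            self._open = True

    def close_turn(self) -> None:
        """Force the next final segment to start a new turn."""
        self._open = False
```

Without diarization every `speaker_id` is `None`, so something else has to mark the boundary; calling `close_turn()` when the agent starts replying is one option.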
Sources: Events and error handling | LiveKit Docs | livekit-agents/livekit/agents/voice/agent_activity.py | Turn detection and interruptions | LiveKit Docs | Speechmatics STT integration guide | LiveKit Docs
rough-pizza-5956
07/14/2025, 7:27 AM
tall-airline-16756
07/14/2025, 7:30 AM
`UserInputTranscribedEvent`. It's actually pretty cool how you can capture each speaker's turn and keep track of what's being said.
from livekit.agents import UserInputTranscribedEvent

# Completed turns, in order; each entry has 'speaker_id' and 'transcript'
turns = []
# Turn in progress; its transcript holds the latest interim or final text
current_turn = {
    'speaker_id': None,
    'transcript': ''
}

# `session` is assumed to be an existing AgentSession created elsewhere
@session.on("user_input_transcribed")
def on_transcript(event: UserInputTranscribedEvent):
    global current_turn
    # Replace rather than append: adding the final text onto the stored
    # interim text would duplicate words
    current_turn['transcript'] = event.transcript.strip()
    current_turn['speaker_id'] = event.speaker_id

    # Optional: print current turn data (before any reset below)
    if event.speaker_id:
        print(f"Speaker {event.speaker_id}: {current_turn['transcript']}")
    else:
        print(f"Transcript: {current_turn['transcript']}")

    if event.is_final:
        # Save the completed transcript as a new turn
        turns.append(current_turn.copy())
        # Reset for the next turn
        current_turn = {
            'speaker_id': None,
            'transcript': ''
        }
Basically, this code is like a friendly conversation tracker. It'll gather up each person's speech, keep an eye on who's speaking (if speaker identification is on), and neatly store each completed turn in the `turns` list. Super handy for keeping track of conversations!
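Once turns are stored this way, they can be printed or saved. The helpers below are a small sketch of that, assuming each entry is a dict with `speaker_id` and `transcript` keys as in the snippet above; `format_turns` and `turns_to_json` are hypothetical names and not part of LiveKit.

```python
import json
from typing import Optional, TypedDict


class TurnDict(TypedDict):
    speaker_id: Optional[str]
    transcript: str


def format_turns(turns: list[TurnDict]) -> str:
    """Render turns as numbered lines, one per turn."""
    lines = []
    for i, turn in enumerate(turns, start=1):
        # Fall back to a placeholder when diarization is unavailable.
        speaker = turn["speaker_id"] or "unknown"
        lines.append(f"{i}. [{speaker}] {turn['transcript']}")
    return "\n".join(lines)


def turns_to_json(turns: list[TurnDict]) -> str:
    """Serialize turns to JSON, keeping non-ASCII text readable."""
    return json.dumps(turns, ensure_ascii=False, indent=2)
```

Calling `format_turns(turns)` at the end of a session gives a readable transcript, while `turns_to_json(turns)` is easier to store or send elsewhere.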
Sources: hardware/pi-zero-transcriber/pi_zero_transcriber.py | main.py