# ask-ai
r
`user_input_transcribed`: A `UserInputTranscribedEvent` is emitted when user transcription is available. Properties:
• `transcript`: str
• `is_final`: bool
• `speaker_id`: str | None - Only available if speaker diarization is supported in your STT plugin.
Is there any way to track the transcript turn-wise?
t
Thanks for reaching out, Riddhi. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Yes, you can track transcripts turn-wise in LiveKit using built-in turn detection and transcript management systems. LiveKit handles turn segmentation through its `_TurnTracker` class and several turn detection mechanisms, such as VAD (Voice Activity Detection), STT endpointing, and model-based turn detection.
• Listen to `UserInputTranscribedEvent` events. Use `transcript` and `is_final` to accumulate each user's speech into distinct turns.
• LiveKit internally manages turn boundaries and accumulates transcripts per turn, emitting final transcripts when a turn is completed.
• If your STT provider supports speaker diarization, the `speaker_id` property allows attributing transcripts to individual speakers in multi-user conversations.
Turn detection mode is configurable and supports automated, VAD-based, and manual approaches, providing control over how turns are segmented and tracked.
Sources: Events and error handling | LiveKit Docs | livekit-agents/livekit/agents/voice/agent_activity.py | Turn detection and interruptions | LiveKit Docs | Speechmatics STT integration guide | LiveKit Docs
r
give code
t
Hey there! Let me walk you through a neat way to track conversation transcripts in LiveKit using the `UserInputTranscribedEvent`. It's actually pretty cool how you can capture each speaker's turn and keep track of what's being said.
from livekit.agents import UserInputTranscribedEvent

# Assumes `session` is an AgentSession created elsewhere in your agent code.

# Completed turns, one entry per final transcript
turns = []
# Text and speaker for the turn currently in progress
current_turn = {
    'speaker_id': None,
    'transcript': ''
}

@session.on("user_input_transcribed")
def on_transcript(event: UserInputTranscribedEvent):
    global current_turn
    # Interim and final events both replace the in-progress text;
    # appending the final text to the last interim would duplicate it.
    current_turn['transcript'] = event.transcript.strip()
    current_turn['speaker_id'] = event.speaker_id
    # Optional: print current turn data (before any reset below)
    if event.speaker_id:
        print(f"Speaker {event.speaker_id}: {current_turn['transcript']}")
    else:
        print(f"Transcript: {current_turn['transcript']}")
    if event.is_final:
        # Save the completed transcript as a new turn
        turns.append(current_turn.copy())
        # Reset for the next turn
        current_turn = {
            'speaker_id': None,
            'transcript': ''
        }
Basically, this code is like a friendly conversation tracker. It'll gather up each person's speech, keep an eye on who's speaking (if speaker diarization is on), and neatly store each completed turn in the `turns` list. Super handy for keeping track of conversations!
Sources: hardware/pi-zero-transcriber/pi_zero_transcriber.py | main.py
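If a single turn can produce more than one final transcript, it may help to merge consecutive finals from the same speaker instead of storing each one separately. Here's a rough, standalone sketch of that idea. It does not import LiveKit: `TranscriptEvent` is a hypothetical stand-in that only mirrors the three documented properties (`transcript`, `is_final`, `speaker_id`), and `TurnLog`, `handle`, and `end_turn` are illustrative names, not part of any LiveKit API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TranscriptEvent:
    """Hypothetical stand-in carrying the three documented event properties."""
    transcript: str
    is_final: bool
    speaker_id: Optional[str] = None


@dataclass
class TurnLog:
    """Groups final transcripts into turns, merging consecutive finals per speaker."""
    turns: list = field(default_factory=list)
    interim: str = ""

    def handle(self, event: TranscriptEvent) -> None:
        if not event.is_final:
            # Interim text is only a preview; keep the latest version.
            self.interim = event.transcript
            return
        self.interim = ""
        text = event.transcript.strip()
        if not text:
            return
        last = self.turns[-1] if self.turns else None
        if (
            last is not None
            and not last["closed"]
            and last["speaker_id"] == event.speaker_id
        ):
            # Same speaker, turn still open: extend the existing turn.
            last["transcript"] += " " + text
        else:
            # New speaker or previous turn closed: start a new turn.
            self.turns.append(
                {"speaker_id": event.speaker_id, "transcript": text, "closed": False}
            )

    def end_turn(self) -> None:
        """Mark the latest turn as complete so the next final starts a new one."""
        if self.turns:
            self.turns[-1]["closed"] = True
```

In a real agent you would presumably call `log.handle(...)` from inside the `user_input_transcribed` handler with the event's fields, and call `log.end_turn()` wherever your app decides the user's turn is over (for example, when the agent starts replying). Where exactly that boundary sits is an assumption you'd need to adapt to your turn detection setup.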