LiveKit Community #ask-ai: user_input_transcribed

rough-pizza-5956

07/14/2025, 7:23 AM

user_input_transcribed: A `UserInputTranscribedEvent` is emitted when user transcription is available. Properties:
• `transcript`: str
• `is_final`: bool
• `speaker_id`: str | None - Only available if speaker diarization is supported in your STT plugin.

Is there any way to track the transcript turn-wise?

tall-airline-16756

07/14/2025, 7:23 AM

Thanks for reaching out, Riddhi. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.

tall-airline-16756

07/14/2025, 7:24 AM

Yes, you can track transcripts turn-wise in LiveKit using built-in turn detection and transcript management systems. LiveKit handles turn segmentation through its `_TurnTracker` class and several turn detection mechanisms, such as VAD (Voice Activity Detection), STT endpointing, and model-based turn detection.
• Listen to `UserInputTranscribedEvent` events. Use `transcript` and `is_final` to accumulate each user's speech into distinct turns.
• LiveKit internally manages turn boundaries and accumulates transcripts per turn, emitting final transcripts when a turn is completed.
• If your STT provider supports speaker diarization, the `speaker_id` property allows attributing transcripts to individual speakers for multi-user conversations.

Turn detection mode is configurable and supports automated, VAD-based, and manual approaches, providing control over how turns are segmented and tracked.

Sources: Events and error handling | LiveKit Docs | livekit-agents/livekit/agents/voice/agent_activity.py | Turn detection and interruptions | LiveKit Docs | Speechmatics STT integration guide | LiveKit Docs
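The accumulation described in the bullets above can be sketched in plain Python. This is a minimal illustration, not LiveKit's own implementation: `FakeTranscribedEvent` and `TurnLog` are hypothetical names, the fake event only mirrors the three documented properties (`transcript`, `is_final`, `speaker_id`), and the sketch assumes each final event closes one turn.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeTranscribedEvent:
    """Hypothetical stand-in exposing the three documented event properties."""
    transcript: str
    is_final: bool
    speaker_id: Optional[str] = None


@dataclass
class TurnLog:
    """Hypothetical helper: keeps completed turns plus the latest interim text."""
    turns: List[dict] = field(default_factory=list)
    interim: str = ""

    def handle(self, event) -> None:
        if event.is_final:
            # Assumption: a final event closes the turn and supersedes interim text.
            text = event.transcript.strip()
            if text:
                self.turns.append(
                    {"speaker_id": event.speaker_id, "transcript": text}
                )
            self.interim = ""
        else:
            # Interim results replace each other; only the newest is kept.
            self.interim = event.transcript


log = TurnLog()
for ev in [
    FakeTranscribedEvent("hello", False),
    FakeTranscribedEvent("hello there", True, "S1"),
    FakeTranscribedEvent("how are", False),
    FakeTranscribedEvent("how are you", True, "S2"),
]:
    log.handle(ev)

for turn in log.turns:
    print(turn)
```

In a real agent, `handle` would be called from the `user_input_transcribed` handler with the actual event object instead of the fake one.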

rough-pizza-5956

07/14/2025, 7:27 AM

give code

tall-airline-16756

07/14/2025, 7:30 AM

Hey there! Let me walk you through a neat way to track conversation transcripts in LiveKit using the `UserInputTranscribedEvent`. It's actually pretty cool how you can capture each speaker's turn and keep track of what's being said.

from livekit.agents import UserInputTranscribedEvent

# Completed turns, in the order they were spoken
turns = []
# Text and speaker for the turn currently in progress
current_turn = {
    'speaker_id': None,
    'transcript': ''
}

# Assumes `session` is an AgentSession that has already been created elsewhere
@session.on("user_input_transcribed")
def on_transcript(event: UserInputTranscribedEvent):
    global current_turn
    # Interim and final events both carry the text recognized so far, so
    # overwrite instead of appending (appending would duplicate the interim text)
    current_turn['transcript'] = event.transcript.strip()
    current_turn['speaker_id'] = event.speaker_id
    # Optional: print current turn data before any reset
    if event.speaker_id:
        print(f"Speaker {event.speaker_id}: {current_turn['transcript']}")
    else:
        print(f"Transcript: {current_turn['transcript']}")
    if event.is_final:
        # Save the completed transcript as a new turn
        turns.append(current_turn.copy())
        # Reset for the next turn
        current_turn = {
            'speaker_id': None,
            'transcript': ''
        }

Basically, this code is like a friendly conversation tracker. It'll gather up each person's speech, keep an eye on who's speaking (if speaker identification is on), and neatly store each completed turn in the `turns` list. Super handy for keeping track of conversations!

Sources: hardware/pi-zero-transcriber/pi_zero_transcriber.py | main.py
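Once turns are stored in a list shaped like `turns` (dicts with `speaker_id` and `transcript`), they can be regrouped per speaker. The helper below is a hypothetical sketch, not part of LiveKit; `group_by_speaker` and the sample data are made up for illustration, and turns without diarization fall under the `None` key.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def group_by_speaker(turns: List[dict]) -> Dict[Optional[str], List[str]]:
    """Collect each speaker's transcripts, preserving the order of the turns."""
    grouped: Dict[Optional[str], List[str]] = defaultdict(list)
    for turn in turns:
        grouped[turn["speaker_id"]].append(turn["transcript"])
    return dict(grouped)


# Illustrative data in the same shape as the `turns` list
sample_turns = [
    {"speaker_id": "S1", "transcript": "hello there"},
    {"speaker_id": "S2", "transcript": "hi, how are you"},
    {"speaker_id": "S1", "transcript": "doing well"},
    {"speaker_id": None, "transcript": "unattributed speech"},
]

by_speaker = group_by_speaker(sample_turns)
for speaker, lines in by_speaker.items():
    print(speaker, lines)
```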
