This message was deleted LiveKit Community #ask-ai

# ask-ai

steep-balloon-41261

08/16/2025, 1:27 PM

This message was deleted.

tall-airline-16756

08/16/2025, 1:28 PM

Thanks for reaching out, Dylan. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.

tall-airline-16756

08/16/2025, 1:28 PM

Hey there! If you want to sync transcriptions with audio and get those handy utterance timestamps, the `TranscriptSynchronizer` class in the LiveKit Agents framework is your friend. Here’s a simple way to do it:

• Turn on TTS-aligned transcription: Just set `use_tts_aligned_transcript=True` in your `AgentSession` settings to get word-level sync.
• Handle the `transcription_node` method: This deals with `TimedString` objects, which come with `start_time` and `end_time` for each word.
• Make use of the `TranscriptSynchronizer` class: It helps sync your text with the audio playback. Just feed it timed text and audio frames using `push_text` and `push_audio`.
• Wrap up playback: Call `mark_playback_finished` to make sure your synchronized transcript is all set once playback ends.

By the way, word-level timestamps work great with ElevenLabs and Cartesia TTS providers. If you want me to walk you through more details or share some examples, just give me a shout!

Sources: examples/other/text-to-speech/sync_tts_transcription.py | ElevenLabs TTS integration guide | LiveKit Docs | Cartesia TTS integration guide | LiveKit Docs
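
To make the steps above a bit more concrete, here is a rough sketch that runs without the framework installed. It is not the LiveKit API: `TimedWord` is a hypothetical local stand-in that only mimics the `start_time` and `end_time` fields of `TimedString`, `fake_tts_stream` is made-up sample data, and `log_timed_chunks` is a hypothetical async generator shaped like a `transcription_node` override. It passes every chunk through unchanged and records word timings when they are present.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Optional, Tuple, Union


@dataclass
class TimedWord:
    """Hypothetical stand-in for a timed text chunk (not the LiveKit class)."""
    text: str
    start_time: Optional[float] = None  # seconds from the start of the segment
    end_time: Optional[float] = None


Chunk = Union[str, TimedWord]
Timing = Tuple[str, float, float]


async def log_timed_chunks(
    chunks: AsyncIterator[Chunk], timings: List[Timing]
) -> AsyncIterator[Chunk]:
    """Yield every chunk unchanged, recording timings for chunks that have them."""
    async for chunk in chunks:
        if (
            isinstance(chunk, TimedWord)
            and chunk.start_time is not None
            and chunk.end_time is not None
        ):
            timings.append((chunk.text, chunk.start_time, chunk.end_time))
        yield chunk


async def fake_tts_stream() -> AsyncIterator[Chunk]:
    """Assumed sample data standing in for TTS-aligned output."""
    yield TimedWord("Hello", 0.0, 0.4)
    yield TimedWord("world", 0.4, 0.9)
    yield "!"  # plain string with no timing attached


async def collect() -> Tuple[List[str], List[Timing]]:
    timings: List[Timing] = []
    texts: List[str] = []
    async for chunk in log_timed_chunks(fake_tts_stream(), timings):
        texts.append(chunk if isinstance(chunk, str) else chunk.text)
    return texts, timings


texts, timings = asyncio.run(collect())
```

The isinstance check matters because, in this sketch, the stream can mix timed chunks with plain strings, so untimed text still flows through to the transcript.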

tall-airline-16756

08/16/2025, 1:29 PM

Hey! Just a quick note: the `start_time` and `end_time` values on a `TimedString` are tied to the audio playback. They’re measured in seconds and show when the text should be spoken, counted from the beginning of the audio segment. If anything’s unclear or you want me to explain more, just let me know. I’m here to help!

Sources: livekit-agents/livekit/agents/tts/tts.py | Text and transcriptions | LiveKit Docs
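
Since those values restart at zero for each audio segment, placing a word on a longer timeline means adding the offset at which its segment began. Here is a minimal sketch of that arithmetic, assuming you track each segment's duration yourself and that segments play back to back with no gaps. `to_session_times` and the tuple layout are hypothetical helpers for illustration, not part of LiveKit.

```python
from typing import List, Tuple

# (text, start_time, end_time), in seconds relative to the word's own segment
Word = Tuple[str, float, float]


def to_session_times(segments: List[Tuple[float, List[Word]]]) -> List[Word]:
    """Shift segment-relative word times onto one running timeline.

    Each segment is (duration_seconds, words). Assumes no silence between
    segments; with gaps, use each segment's recorded start time instead.
    """
    offset = 0.0
    shifted: List[Word] = []
    for duration, words in segments:
        for text, start, end in words:
            shifted.append((text, offset + start, offset + end))
        offset += duration  # the next segment begins where this one ended
    return shifted


segments = [
    (1.0, [("Hello", 0.0, 0.5), ("there", 0.5, 1.0)]),
    (2.0, [("How", 0.0, 0.25), ("are", 0.25, 0.5)]),
]
session_words = to_session_times(segments)
```

In a real session there is usually silence between agent turns, so storing the start time of each segment is likely a safer source for the offset than summing durations.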

tall-airline-16756

08/16/2025, 1:30 PM

Hey there! Could you tell me which SDK or programming language you're working with? That way, I can give you the best possible guidance on how to get timestamps relative to the whole audio history. Sources: lib/src/proto/livekit_metrics.pb.dart | Sources/LiveKit/Protos/livekit_metrics.pb.swift | livekit-protocol/livekit/protocol/models.py | livekit/livekit_analytics.pb.go | InputAudioBufferSpeechStartedEvent | LiveKit Agents
