# ask-ai
Thanks for reaching out, Dylan. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Hey there! If you want to sync transcriptions with audio and get utterance timestamps, the `TranscriptSynchronizer` class in the LiveKit Agents framework is your friend. Here's a simple way to do it:
• Turn on TTS-aligned transcription: Set `use_tts_aligned_transcript=True` in your `AgentSession` settings to get word-level sync.
• Handle the `transcription_node` method: It receives `TimedString` objects, which carry a `start_time` and `end_time` for each word.
• Use the `TranscriptSynchronizer` class: It keeps your text in sync with audio playback. Feed it timed text and audio frames using `push_text` and `push_audio`.
• Wrap up playback: Call `mark_playback_finished` so the synchronized transcript is finalized once playback ends.
By the way, word-level timestamps work with the ElevenLabs and Cartesia TTS providers. If you want me to walk you through more details or share some examples, just give me a shout!
Sources: examples/other/text-to-speech/sync_tts_transcription.py | ElevenLabs TTS integration guide | LiveKit Docs | Cartesia TTS integration guide | LiveKit Docs
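Here's a rough sketch of the pass-through pattern a `transcription_node`-style handler could follow: forward every chunk unchanged and record timings for the ones that carry them. To keep it runnable without the framework, the `TimedString` class below is a hypothetical stand-in (a string with optional `start_time` and `end_time` in seconds), and `passthrough_with_timings` and `_fake_tts_text` are names made up for illustration, not LiveKit APIs.

```python
import asyncio
from typing import AsyncIterable, AsyncIterator, List, Optional, Tuple, Union


class TimedString(str):
    """Hypothetical stand-in: a string with optional start/end times in seconds."""

    start_time: Optional[float]
    end_time: Optional[float]

    def __new__(cls, text: str, start_time: Optional[float] = None,
                end_time: Optional[float] = None) -> "TimedString":
        obj = super().__new__(cls, text)
        obj.start_time = start_time
        obj.end_time = end_time
        return obj


Timing = Tuple[str, float, Optional[float]]


async def passthrough_with_timings(
    chunks: AsyncIterable[Union[str, TimedString]],
    timings: List[Timing],
) -> AsyncIterator[Union[str, TimedString]]:
    """Yield every chunk unchanged; record (text, start, end) for timed chunks."""
    async for chunk in chunks:
        if isinstance(chunk, TimedString) and chunk.start_time is not None:
            timings.append((str(chunk), chunk.start_time, chunk.end_time))
        yield chunk


async def _fake_tts_text() -> AsyncIterator[Union[str, TimedString]]:
    # Simulated stream: two timed words separated by an untimed plain string.
    yield TimedString("Hello", 0.0, 0.4)
    yield " "
    yield TimedString("world", 0.5, 0.9)


async def demo() -> Tuple[List[str], List[Timing]]:
    timings: List[Timing] = []
    forwarded = [c async for c in passthrough_with_timings(_fake_tts_text(), timings)]
    return forwarded, timings


forwarded, timings = asyncio.run(demo())
```

Yielding each chunk as-is means the transcript downstream is unaffected, while `timings` ends up holding only the word-level entries. In a real agent, the stream would come from the framework rather than `_fake_tts_text`.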
Hey! Just a quick note: the `start_time` and `end_time` in `TimedString` are tied to audio playback. They're measured in seconds from the beginning of the audio segment and show exactly when the text should be spoken. If anything's unclear or you want me to explain more, just let me know. I'm here to help!
Sources: livekit-agents/livekit/agents/tts/tts.py | Text and transcriptions | LiveKit Docs
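Since those times are relative to the start of the audio segment, one way to place them on a longer timeline is to add the segment's own start offset. This is only a sketch of the arithmetic: the `Word` dataclass, the `to_session_timeline` helper, and the `segment_offset` value are assumptions for illustration, and you would need to track when each segment began playing yourself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """Hypothetical container for one word and its timing in seconds."""
    text: str
    start_time: float
    end_time: float


def to_session_timeline(words: List[Word], segment_offset: float) -> List[Word]:
    """Shift segment-relative times by the segment's start offset in the session."""
    return [
        Word(w.text, w.start_time + segment_offset, w.end_time + segment_offset)
        for w in words
    ]


# Times relative to the start of this audio segment.
segment = [Word("Hi", 0.0, 0.25), Word("there", 0.25, 0.75)]

# Assume this segment began playing 12.5 seconds into the session.
shifted = to_session_timeline(segment, segment_offset=12.5)
```

The helper returns new `Word` objects, so the original segment-relative values stay available if you need both views.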
Hey there! Could you tell me which SDK or programming language you're working with? That way, I can give you the best possible guidance on how to get timestamps relative to the whole audio history. Sources: lib/src/proto/livekit_metrics.pb.dart | Sources/LiveKit/Protos/livekit_metrics.pb.swift | livekit-protocol/livekit/protocol/models.py | livekit/livekit_analytics.pb.go | InputAudioBufferSpeechStartedEvent | LiveKit Agents