acceptable-dream-85180
08/09/2025, 11:28 AM
tall-airline-16756
08/09/2025, 11:28 AM
tall-airline-16756
08/09/2025, 11:29 AM
To override `stt_node` and grab word-level timestamps from Deepgram STT in LiveKit, here’s a simple way to do it:
1. Turn on TTS-aligned transcripts:
session = AgentSession(
    # ... other configurations
    use_tts_aligned_transcript=True,
)
2. Set up the `transcription_node` method:
class MyAgent(Agent):
    async def transcription_node(
        self, text: AsyncIterable[str | TimedString], model_settings: ModelSettings
    ) -> AsyncGenerator[str | TimedString, None]:
        async for chunk in text:
            if isinstance(chunk, TimedString):
                # TimedString subclasses str, so the chunk itself is the text
                print(f"Word '{chunk}' starts at {chunk.start_time} and ends at {chunk.end_time}")
            yield chunk
3. Optionally, override `stt_node`:
    # Add this method to MyAgent
    async def stt_node(
        self, audio: AsyncIterable[rtc.AudioFrame], model_settings: ModelSettings
    ) -> AsyncIterable[stt.SpeechEvent]:
        # Add any custom audio processing here, before STT runs
        events = Agent.default.stt_node(self, audio, model_settings)
        # Inspect each event, then pass it along unchanged
        async for event in events:
            if event.type == SpeechEventType.FINAL_TRANSCRIPT:
                for alternative in event.alternatives:
                    # start_time and end_time here cover the whole alternative
                    for word in alternative.text.split():
                        # text.split() gives the words only, with no per-word timing
                        pass
            yield event
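If it helps to see the event loop from step 3 in isolation, here’s a minimal sketch that runs without LiveKit installed. The `SpeechEventType`, `SpeechData`, and `SpeechEvent` classes below are simplified stand-ins I’m assuming for illustration (only the fields used here), and `tap_final_transcripts` and `fake_stt` are hypothetical helper names, not part of the SDK:

```python
import asyncio
from dataclasses import dataclass, field
from enum import Enum
from typing import AsyncIterator, List, Tuple


class SpeechEventType(Enum):
    """Stand-in for the SDK's speech event type enum (assumption)."""
    INTERIM_TRANSCRIPT = "interim_transcript"
    FINAL_TRANSCRIPT = "final_transcript"


@dataclass
class SpeechData:
    """Stand-in carrying only the fields this sketch reads."""
    text: str
    start_time: float = 0.0
    end_time: float = 0.0


@dataclass
class SpeechEvent:
    type: SpeechEventType
    alternatives: List[SpeechData] = field(default_factory=list)


async def tap_final_transcripts(
    events: AsyncIterator[SpeechEvent],
    segments: List[Tuple[str, float, float]],
) -> AsyncIterator[SpeechEvent]:
    """Record (text, start, end) for final transcripts; forward every event."""
    async for event in events:
        if event.type == SpeechEventType.FINAL_TRANSCRIPT and event.alternatives:
            best = event.alternatives[0]
            segments.append((best.text, best.start_time, best.end_time))
        yield event  # always pass the event through unchanged


async def fake_stt() -> AsyncIterator[SpeechEvent]:
    """Hypothetical event stream standing in for the default stt_node output."""
    yield SpeechEvent(SpeechEventType.INTERIM_TRANSCRIPT, [SpeechData("hello")])
    yield SpeechEvent(SpeechEventType.FINAL_TRANSCRIPT, [SpeechData("hello world", 0.5, 1.2)])


async def _demo():
    segments: List[Tuple[str, float, float]] = []
    forwarded = [event async for event in tap_final_transcripts(fake_stt(), segments)]
    return forwarded, segments


forwarded, segments = asyncio.run(_demo())
```

The key design point is that the wrapper yields every event, including interim ones, so the rest of the pipeline behaves as before while you collect timing on the side.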
Just a heads up: `use_tts_aligned_transcript` needs to be enabled for `transcription_node` to receive `TimedString` chunks, and those timings describe the agent’s TTS output rather than Deepgram’s STT results. Also double-check that Deepgram STT is set up correctly in your project. If you hit any bumps, I’m here to help!
acceptable-dream-85180
08/11/2025, 11:41 AM
tall-airline-16756
08/11/2025, 11:42 AM
ev.alternatives[0].text.split(' ')
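This will break the text into separate words based on spaces. Just keep in mind that punctuation stays attached to each word, contractions stay as a single token, and repeated spaces produce empty strings, so the result might need a little extra care.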
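Here’s a small Python sketch of that extra care, assuming you only want to trim punctuation from the edges of each word and keep contractions intact. `split_words` and `EDGE_PUNCTUATION` are hypothetical names, and the punctuation set is just a starting point you’d likely tune:

```python
from typing import List

# Characters trimmed from the start and end of each token (assumed set)
EDGE_PUNCTUATION = ".,!?;:\"()"


def split_words(text: str) -> List[str]:
    """Split on any whitespace run and trim punctuation from token edges."""
    words = []
    # split() with no argument collapses repeated spaces, unlike split(' ')
    for token in text.split():
        cleaned = token.strip(EDGE_PUNCTUATION)
        if cleaned:  # skip tokens that were punctuation only
            words.append(cleaned)
    return words


words = split_words("Hello,  world! Don't stop.")
```

Because the apostrophe isn’t in the trimmed set, a contraction like "Don't" survives as one word.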
Also, if you’re looking for really precise word-level details, Deepgram’s response actually includes a `words` array with each word and its timing info. That’s super handy if you want to dive deeper!
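As a rough sketch of what reading that array could look like, here’s a helper that pulls `(word, start, end)` tuples out of one alternative from a Deepgram-style JSON response. The sample payload is hand-written for illustration (not real output), `extract_word_timings` is a hypothetical name, and how you reach the raw response from inside the LiveKit plugin isn’t covered in this thread:

```python
from typing import Any, Dict, List, Tuple


def extract_word_timings(alternative: Dict[str, Any]) -> List[Tuple[str, float, float]]:
    """Return (word, start_seconds, end_seconds) for each entry in `words`."""
    timings = []
    for entry in alternative.get("words", []):
        # Prefer the punctuated form when the response includes it
        word = entry.get("punctuated_word", entry["word"])
        timings.append((word, float(entry["start"]), float(entry["end"])))
    return timings


# Illustrative payload shaped like one Deepgram alternative (values made up)
sample_alternative = {
    "transcript": "hello world",
    "words": [
        {"word": "hello", "start": 0.08, "end": 0.4, "confidence": 0.99},
        {"word": "world", "start": 0.48, "end": 0.9, "confidence": 0.98},
    ],
}

timings = extract_word_timings(sample_alternative)
```

This keeps the timing data tied to Deepgram’s own word boundaries, so you don’t have to re-split the transcript text yourself.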
Sources: plugins/deepgram/src/stt.ts