purple-barista-60143
09/10/2025, 1:10 PMtall-airline-16756
09/10/2025, 1:10 PMtall-airline-16756
09/10/2025, 1:11 PMuse_tts_aligned_transcript
feature in your agent configuration. This will give you word- or sentence-level timestamps for the agent's speech, which is super helpful.
3. Access Timed Transcripts: Use the transcription_node
method to get precise timing for each word or sentence. This helps you map the session history right onto the audio timeline.
4. Export Session History: Save the session history with timestamps using session.history.to_dict()
. This captures all messages along with their timing metadata.
5. Align Timestamps: Finally, match the timestamps from the session history (like start_time
from TimedString
) with the audio recording’s timeline to keep everything perfectly synchronized.
Also, for real-time updates, you can listen to events such as conversation_item_added
or user_input_transcribed
, which include timestamps for each message.
If you have any questions or need a hand with any part of this, just let me know—I’m here to help!
Sources: Text and transcriptions | LiveKit Docs | Session recording and transcripts | LiveKit Docs | Text and transcriptions | LiveKit Docspurple-barista-60143
09/10/2025, 1:12 PMtall-airline-16756
09/10/2025, 1:13 PMfrom livekit import api
async def start_audio_recording():
req = api.RoomCompositeEgressRequest(
room_name="example-room",
audio_only=True,
file_outputs=[api.EncodedFileOutput(
file_type=api.EncodedFileType.OGG,
filepath="recordings/session.ogg",
s3=api.S3Upload(
bucket="my-bucket",
region="us-west-2",
access_key="my-access-key",
secret="my-secret-key",
),
)],
)
lkapi = api.LiveKitAPI()
await lkapi.egress.start_room_composite_egress(req)
await lkapi.aclose()
2. Enable TTS-Aligned Transcription
from livekit import AgentSession
session = AgentSession(
stt=my_stt,
llm=my_llm,
tts=my_tts,
use_tts_aligned_transcript=True, # This enables word/sentence-level timestamps
)
3. Access Timed Transcripts
async def transcription_node(self, text, model_settings):
async for chunk in text:
if isinstance(chunk, TimedString):
print(f"[{chunk.start_time:.2f}s - {chunk.end_time:.2f}s] {chunk.text}")
yield chunk
4. Export Session History
async def save_session_history():
import json
from datetime import datetime
filename = f"/tmp/transcript_{session.room_name}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json"
with open(filename, 'w') as f:
json.dump(session.history.to_dict(), f, indent=2)
Please feel free to reach out if you'd like me to clarify any part or provide more examples—I’m here to help!purple-barista-60143
09/10/2025, 1:14 PMtall-airline-16756
09/10/2025, 1:15 PMTimedString
objects to session history, you can use the timestamps provided by TimedString
(e.g., start_time
and end_time
) and align them with the created_at
timestamps in the session history. Here's a friendly guide on how you can do it:
1. Extract TimedString Data: You can use the transcription_node
method to capture TimedString
objects along with their timing information. Here’s a little snippet to help you out:`async def transcription_node(self, text, model_settings):
async for chunk in text:
if isinstance(chunk, TimedString):
print(f"[{chunk.start_time:.2f}s - {chunk.end_time:.2f}s] {chunk.text}")
# Store or process the timing data here
yield chunk`
2. Retrieve Session History: Next, you’ll want to access the session history using session.history.to_dict()
. This will give you all the messages along with their created_at
timestamps. Here’s how you can do that:`history = session.history.to_dict()
for item in history['items']:
print(f"[{item['created_at']}] {item['role']}: {item['text_content']}")`
3. Align TimedString with Session History: Finally, you can compare the start_time
and end_time
of each TimedString
with the created_at
timestamps in the session history. A simple matching algorithm can help find the closest timestamps. Here’s an example:`for item in history['items']:
for timed_string in timed_strings: # List of TimedString objects
if abs(timed_string.start_time - item['created_at']) < threshold:
print(f"Matched: {item['text_content']} with {timed_string.text}")`
If you have any questions or want me to walk you through any part of this, please don’t hesitate to ask. I’m here to help!