How to get the speech from Agent just after its said by the LiveKit Community #ask-ai


# ask-ai

lively-horse-90722

08/13/2025, 7:33 AM

How to get the speech from Agent just after it's said by the agent

tall-airline-16756

08/13/2025, 7:33 AM

Thanks for reaching out, Deepak. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.

tall-airline-16756

08/13/2025, 7:34 AM

If you want to grab the speech from an agent right after it’s spoken in LiveKit, you can use the `SpeechHandle` object along with its completion methods. Here’s a simple way to do it:

1. Get the `SpeechHandle` object: You’ll get this from methods like `say()` and `generate_reply()`.
2. Use `wait_for_playout()` or `add_done_callback()`: These help you know exactly when the speech has finished. `done()` only reports whether it has already finished.

Example Code (Python):

# Get the SpeechHandle (say() returns it right away)
speech_handle = agent_session.say("Hello, how can I help you today?")

# Wait for the speech to finish playing
await speech_handle.wait_for_playout()

# Or set up a callback; it receives the finished SpeechHandle
speech_handle.add_done_callback(lambda handle: print("Speech completed"))

Important Methods and Events:
• `speech_handle.wait_for_playout()`: An async method that pauses until the speech is done.
• `speech_handle.done()`: Returns `True` once the speech has finished, without waiting.
• `speech_handle.add_done_callback()`: Lets you register a function to run right after the speech finishes. The callback is called with the `SpeechHandle`.

Extra Events (JavaScript):

agent.on(VPAEvent.AGENT_STOPPED_SPEAKING, () => {
    console.log("Speech has stopped");
});

This way, you can jump in with your code as soon as the agent stops talking. If you have any questions or want to dive deeper, just let me know—I’m here to help!
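As a library-free illustration of the two completion patterns mentioned above (awaiting the handle versus registering a callback), here is a minimal sketch. `SpokenLine`, `wait()`, and `mark_finished()` are hypothetical stand-ins invented for this example, not LiveKit APIs. The sketch also assumes the callback is handed the handle so it can read the text that was spoken.

```python
import asyncio


class SpokenLine:
    """Hypothetical stand-in for a speech handle (not a LiveKit class)."""

    def __init__(self, text: str) -> None:
        self.text = text
        # Resolved once "playback" of this line has finished
        self._finished = asyncio.get_running_loop().create_future()

    def add_done_callback(self, callback) -> None:
        # Pass the handle itself so the callback can read the spoken text
        self._finished.add_done_callback(lambda _fut: callback(self))

    def mark_finished(self) -> None:
        # In a real agent the framework would signal this, not user code
        if not self._finished.done():
            self._finished.set_result(None)

    async def wait(self) -> None:
        await self._finished


async def demo() -> list:
    log = []
    line = SpokenLine("Hello, how can I help you today?")
    line.add_done_callback(lambda handle: log.append(f"callback: {handle.text}"))
    # Simulate playback ending shortly after the line starts
    asyncio.get_running_loop().call_later(0.01, line.mark_finished)
    await line.wait()
    log.append(f"awaited: {line.text}")
    return log


if __name__ == "__main__":
    for entry in asyncio.run(demo()):
        print(entry)
```

Either pattern gives the code access to the text once playback has ended. The callback form is handy when the code that starts the speech should not block on it.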

lively-horse-90722

08/13/2025, 7:43 AM

how to print the spoken text also

tall-airline-16756

08/13/2025, 7:43 AM

Hey there! If you want to print out the spoken text from an agent in LiveKit, you can totally use the transcription feature. Let me walk you through how to do it: 1. Set up a transcription stream handler: You’ll want to create a handler that processes transcription events. Here’s a simple example in Python:

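import asyncio
from dataclasses import dataclass
from typing import Optional

from livekit import rtc


@dataclass
class TextChunk:
    stream_id: str
    participant_identity: str
    track_id: Optional[str]
    segment_id: Optional[str]
    content: str
    final: bool = False


class TextStreamPrinter:
    def __init__(self):
        self._text_chunk_queue = asyncio.Queue()  # TextChunk items, or None to stop
        self._reader_tasks = set()  # keep references so reader tasks stay alive
        self.running = True
        self._current_segment_id = None

    async def _main_task(self):
        # Print chunks as they arrive until a None sentinel is queued
        while self.running:
            chunk = await self._text_chunk_queue.get()
            if chunk is None:
                break
            if chunk.final:
                print(f"Final transcription: {chunk.content}")
            else:
                print(f"Interim transcription: {chunk.content}")
            self._current_segment_id = chunk.segment_id

    def on_text_received(self, reader: rtc.TextStreamReader, participant_identity: str):
        # Text stream handlers are called synchronously, so read the stream in a task
        task = asyncio.create_task(self._read_stream(reader, participant_identity))
        self._reader_tasks.add(task)
        task.add_done_callback(self._reader_tasks.discard)

    async def _read_stream(self, reader: rtc.TextStreamReader, participant_identity: str):
        def make_chunk(content: str, final: bool) -> TextChunk:
            attrs = reader.info.attributes or {}
            return TextChunk(
                stream_id=reader.info.stream_id,
                participant_identity=participant_identity,
                track_id=attrs.get("lk.transcribed_track_id"),
                segment_id=attrs.get("lk.segment_id"),
                content=content,
                final=final,
            )

        text = ""
        async for piece in reader:  # each piece is a plain str
            text += piece
            await self._text_chunk_queue.put(make_chunk(text, final=False))

        # Once the stream has ended, check whether the segment was marked final
        attrs = reader.info.attributes or {}
        if attrs.get("lk.transcription_final") == "true":
            await self._text_chunk_queue.put(make_chunk(text, final=True))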

2. Register the handler: Next, connect your handler so it listens for transcription events like this:

# Assuming room is your LiveKit room instance
text_printer = TextStreamPrinter()
room.register_text_stream_handler(
    topic="lk.transcription",
    handler=text_printer.on_text_received
)

3. Start and run the task: Finally, kick off the transcription handling task:

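import asyncio

# Run this inside your async entrypoint, where an event loop is already running.
# Keep the reference so the task is not garbage-collected.
printer_task = asyncio.create_task(text_printer._main_task())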

And that’s it! This will print out both the interim and final transcriptions from the agent’s speech as it happens. If you have any questions or want me to explain anything more, just give me a shout—I’m here to help!
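Because transcription text arrives in pieces, one way to get a single string per utterance is to buffer the pieces by segment and release them when the segment is marked final. The sketch below is plain Python with no LiveKit imports. `TranscriptAssembler` and its method names are hypothetical, and it assumes each piece is a delta (new text only) rather than the full text so far.

```python
from collections import defaultdict
from typing import Optional


class TranscriptAssembler:
    """Hypothetical helper: joins streamed pieces per segment until it is final."""

    def __init__(self) -> None:
        self._parts = defaultdict(list)  # segment_id -> pieces received so far
        self.finished = []               # completed utterances, in order

    def add_chunk(self, segment_id: str, content: str, final: bool = False) -> Optional[str]:
        """Buffer one piece; return the full text once the segment is final."""
        if content:
            self._parts[segment_id].append(content)
        if not final:
            return None
        # Segment is complete: join its pieces and drop the buffer
        text = "".join(self._parts.pop(segment_id, []))
        self.finished.append(text)
        return text


if __name__ == "__main__":
    assembler = TranscriptAssembler()
    assembler.add_chunk("seg-1", "Hello, ")
    assembler.add_chunk("seg-1", "how can I help?")
    print(assembler.add_chunk("seg-1", "", final=True))
```

Keeping the buffer keyed by segment means overlapping segments from different participants do not get mixed together.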

lively-horse-90722

08/13/2025, 7:44 AM

After it's spoken fully, I want the spoken text

tall-airline-16756

08/13/2025, 7:45 AM

If you want to grab the final spoken text after the agent finishes speaking in LiveKit, you can use the `TextAudioSynchronizer` and listen for the `textUpdated` event. Here's a simple way to do it:

1. Use the `TextAudioSynchronizer`:

const synchronizer = new TextAudioSynchronizer();
synchronizer.on('textUpdated', (text) => {
  if (text.final) {
    const finalText = text.text;
    // Handle the final spoken text here
  }
});

2. Check out the `playedText` property: This gives you the text that’s been played so far, including the final text once speaking wraps up:

const finalText = synchronizer.playedText;

Just a couple of things to keep in mind:
• The `textUpdated` event fires both during updates and when the final text is ready. Make sure to check the `final` property to confirm it’s complete.
• Double-check that your `TextAudioSynchronizer` is properly set up and connected to the audio playback so everything stays in sync.

If you’d like, I’m here to help you get this all set up smoothly!
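The snippets above are JavaScript. For a Python codebase, the same idea of ignoring interim updates and acting only on the final one can be sketched without any LiveKit imports. `TextUpdate` and `wait_for_final_text()` are hypothetical names that only mirror the `text` and `final` fields used above; how updates reach the queue in a real agent is left out.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class TextUpdate:
    """Hypothetical update shape mirroring the `text` and `final` fields."""
    text: str
    final: bool


async def wait_for_final_text(updates: asyncio.Queue) -> str:
    # Skip interim updates and return the first one flagged as final
    while True:
        update = await updates.get()
        if update.final:
            return update.text


async def demo() -> str:
    updates = asyncio.Queue()
    for update in (
        TextUpdate("Hello", final=False),
        TextUpdate("Hello, how can I", final=False),
        TextUpdate("Hello, how can I help you today?", final=True),
    ):
        updates.put_nowait(update)
    return await wait_for_final_text(updates)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```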
