adamant-airport-69140
07/10/2025, 2:28 PM
tall-airline-16756
07/10/2025, 2:29 PM
• For LiveKit Agents (Node.js/TypeScript):
◦ Listen for the VPAEvent speech events: AGENT_STARTED_SPEAKING, AGENT_STOPPED_SPEAKING, AGENT_SPEECH_COMMITTED (speech fully played), and AGENT_SPEECH_INTERRUPTED.
◦ Use the playHandle to check `timePlayed`: speech is only considered “heard” if it played for at least 1.5 seconds (MIN_TIME_PLAYED_FOR_COMMIT).
◦ Leverage TextAudioSynchronizer to determine which parts of the text have actually been played.
• For LiveKit Agents (Python):
◦ Use SpeechHandle and wait_for_playout() to determine when playback completes.
◦ Check speech_handle.interrupted and speech_handle.done() for interruption/completion status.
◦ Access the synchronized transcript position via playback events.
• Avatars: The agent receives an RPC from the avatar server when playback is finished, ensuring internal state is updated.
These approaches let you know with precision whether speech has actually reached the user and been played out in a way that counts as “heard.”
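To make that “heard” threshold concrete, here's a tiny plain-Python sketch. Only the 1.5-second value comes from the SDK's MIN_TIME_PLAYED_FOR_COMMIT constant; the was_heard helper is a hypothetical illustration, not LiveKit API:

```python
# Illustration only: was_heard() is a made-up helper, not LiveKit API.
# The 1.5s threshold mirrors the SDK's MIN_TIME_PLAYED_FOR_COMMIT constant.
MIN_TIME_PLAYED_FOR_COMMIT = 1.5  # seconds

def was_heard(time_played: float) -> bool:
    """Speech counts as 'heard' only if it played at least the threshold."""
    return time_played >= MIN_TIME_PLAYED_FOR_COMMIT

print(was_heard(0.4))  # interrupted almost immediately -> False
print(was_heard(2.0))  # played long enough to commit -> True
```

In practice you'd feed `playHandle.timePlayed` (Node) or the playout duration from the speech handle (Python) into a check like this.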
Source: official LiveKit documentation
Sources: VPAEvent | LiveKit Agents | agents/src/multimodal/multimodal_agent.ts | VoicePipelineAgent | LiveKit Agents | TextAudioSynchronizer | LiveKit Agents | livekit-agents/livekit/agents/voice/agent_activity.py | Bringing AI avatars to voice agents
clean-mouse-53726
07/11/2025, 8:42 AM
tall-airline-16756
07/11/2025, 8:43 AM
In Python, speech playback is managed through the SpeechHandle class. I've got some handy examples that'll make this clear.
• *Basic awaitable pattern:*

handle = session.say("Hello world")
await handle  # waits for speech to finish playing
print("Speech playback complete")

• *Status checking:*

if handle.done():
    print("Speech has finished playing")
if handle.interrupted:
    print("Speech was interrupted")

• *Using a callback:*

def on_speech_complete(h):
    print(f"Speech {h.id} done")

handle = session.say("Processing your request...")
handle.add_done_callback(on_speech_complete)

• *Advanced: ensure speech is played out before moving on:*

# Wait for the current speech to complete before ending a call
current_speech = session.current_speech
if current_speech:
    await current_speech.wait_for_playout()
    print("Agent finished speaking before call ended")
These patterns are super helpful – they'll let you know exactly what's happening with your speech playback, so you can smoothly handle different scenarios. Pretty cool, right?
Source: official LiveKit documentation
Sources: Agent speech and audio | LiveKit Docs | livekit-agents/livekit/agents/voice/events.py
clean-mouse-53726
07/11/2025, 9:01 AM
tall-airline-16756
07/11/2025, 9:02 AM
To catch the moment the agent starts speaking, listen for the speech_created event on your AgentSession. It's like a little notification that pops up right when the agent begins talking:
from livekit.agents import AgentSession, SpeechCreatedEvent

@session.on("speech_created")
def on_speech_created(ev: SpeechCreatedEvent):
    print("Agent started speaking")
    # ev.speech_handle provides the speech handle
    # ev.user_initiated tells whether it was user-initiated
    # ev.source shows the origin (e.g., 'generate_reply')
Not feeling that method? No worries! You can also track agent_state_changed events and check when the state switches to `speaking`:
from livekit.agents import AgentStateChangedEvent

@session.on("agent_state_changed")
def on_agent_state_changed(ev: AgentStateChangedEvent):
    if ev.new_state == "speaking":
        print("Agent started speaking")
Either way, you'll get an instant heads-up when speech playback kicks off. Pretty cool, right? 🎉
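To see the decorator-based event wiring in isolation, here's a toy emitter standing in for AgentSession. ToySession and the dict payloads are made up for this sketch; the real session emits typed event objects:

```python
# Toy event emitter standing in for AgentSession (illustration only).
class ToySession:
    def __init__(self):
        self._handlers = {}

    def on(self, event):
        # Returns a decorator that registers a handler for the event.
        def register(fn):
            self._handlers.setdefault(event, []).append(fn)
            return fn
        return register

    def emit(self, event, payload):
        for fn in self._handlers.get(event, []):
            fn(payload)

session = ToySession()
log = []

@session.on("speech_created")
def on_speech_created(ev):
    log.append("created")

@session.on("agent_state_changed")
def on_state(ev):
    if ev["new_state"] == "speaking":
        log.append("speaking")

session.emit("speech_created", {})
session.emit("agent_state_changed", {"new_state": "speaking"})
print(log)  # ['created', 'speaking']
```

The `@session.on(...)` decorator shape is exactly what you write against the real AgentSession; only the emitter internals here are invented.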
Source: official LiveKit documentation
Sources: Turn detection and interruptions | LiveKit Docs | livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py