how can I know if the speech has been spoken to th...
# ask-ai
how can I know if the speech has been spoken to the user or not?
Thanks for reaching out, bogdanmelnikbo. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
You can determine whether speech has been played to the user in LiveKit by monitoring specific playback events and states, depending on your implementation:
• For LiveKit Agents (Node.js/TypeScript):
  ◦ Listen for events like `AGENT_STARTED_SPEAKING`, `AGENT_STOPPED_SPEAKING`, `AGENT_SPEECH_COMMITTED` (speech fully played), and `AGENT_SPEECH_INTERRUPTED`.
  ◦ Use the `playHandle` to check `timePlayed`. Speech is only considered "heard" if it played for at least 1.5 seconds (`MIN_TIME_PLAYED_FOR_COMMIT`).
  ◦ Use `TextAudioSynchronizer` to determine which parts of the text have actually been played.
• For LiveKit Agents (Python):
  ◦ Use `SpeechHandle` and `wait_for_playout()` to determine when playback completes.
  ◦ Check `speech_handle.interrupted` and `speech_handle.done()` for interruption and completion status.
  ◦ Access the synchronized transcript position via playback events.
• Avatars: The agent receives an RPC from the avatar server when playback is finished, so its internal state is updated.
These approaches let you know whether speech has actually reached the user and been played out in a way that counts as "heard."
Source: official LiveKit documentation
Sources: VPAEvent | LiveKit Agents | agents/src/multimodal/multimodal_agent.ts | VoicePipelineAgent | LiveKit Agents | TextAudioSynchronizer | LiveKit Agents | livekit-agents/livekit/agents/voice/agent_activity.py | Bringing AI avatars to voice agents
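To make the 1.5 second threshold concrete, here is a minimal sketch in plain Python. It is only an illustration of the rule as stated above, not LiveKit code: the function name `was_heard` is hypothetical, and the constant simply reuses the value quoted for `MIN_TIME_PLAYED_FOR_COMMIT`.

```python
# Value quoted above for MIN_TIME_PLAYED_FOR_COMMIT, in seconds.
MIN_TIME_PLAYED_FOR_COMMIT = 1.5


def was_heard(time_played: float) -> bool:
    """Hypothetical helper: apply the 'played long enough to count' rule.

    `time_played` is the number of seconds of audio that were actually
    played before the speech finished or was interrupted.
    """
    return time_played >= MIN_TIME_PLAYED_FOR_COMMIT


if __name__ == "__main__":
    for seconds in (0.4, 1.5, 3.2):
        print(seconds, was_heard(seconds))
```

In a real agent you would read the played duration from the framework's playback handle rather than pass it in by hand.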
What is the proper way to use it in python? Is there example?
Hey there! Let me walk you through tracking speech playback in Python using LiveKit's `SpeechHandle` class. Here are a few handy examples. They all assume `session` is your existing `AgentSession` and that the code runs inside an async function.
• *Basic awaitable pattern:*
```python
handle = session.say("Hello world")
await handle  # Waits for speech to finish playing
print("Speech playback complete")
```
• *Status checking:*
```python
if handle.done():
    print("Speech has finished playing")
if handle.interrupted:
    print("Speech was interrupted")
```
• *Using a callback:*
```python
def on_speech_complete(h):
    print(f"Speech {h.id} done")

handle = session.say("Processing your request...")
handle.add_done_callback(on_speech_complete)
```
• *Advanced: ensure speech is played before moving on*
```python
# Wait for the current speech to complete before ending a call
current_speech = session.current_speech
if current_speech:
    await current_speech.wait_for_playout()
    print("Agent finished speaking before call ended")
```
These patterns let you know exactly what's happening with your speech playback, so you can handle different scenarios smoothly. Pretty cool, right?
Source: official LiveKit documentation
Sources: Agent speech and audio | LiveKit Docs | livekit-agents/livekit/agents/voice/events.py
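If you want to try the await, status-check, and callback patterns without a running agent, here is a rough, self-contained sketch. `FakeSpeechHandle` is a hypothetical stand-in built on `asyncio`, not the LiveKit `SpeechHandle`, and its `play()` and `interrupt()` methods are assumptions made purely for the simulation.

```python
import asyncio


class FakeSpeechHandle:
    """Hypothetical stand-in for a speech handle. This is NOT the LiveKit class."""

    def __init__(self, duration: float) -> None:
        self._duration = duration
        self.interrupted = False
        # Resolved once playout finishes or is interrupted.
        self._done = asyncio.get_running_loop().create_future()

    def done(self) -> bool:
        return self._done.done()

    def add_done_callback(self, callback) -> None:
        # Pass the handle itself to the callback, as in the pattern above.
        self._done.add_done_callback(lambda _fut: callback(self))

    def interrupt(self) -> None:
        if not self._done.done():
            self.interrupted = True
            self._done.set_result(None)

    async def play(self) -> None:
        # Simulate audio playout with a sleep.
        await asyncio.sleep(self._duration)
        if not self._done.done():
            self._done.set_result(None)

    def __await__(self):
        # Awaiting the handle waits for playout to finish or be interrupted.
        return self._done.__await__()


async def demo() -> dict:
    log = []

    # Case 1: speech plays to the end.
    full = FakeSpeechHandle(duration=0.01)
    full.add_done_callback(lambda h: log.append(("done", h.interrupted)))
    full_task = asyncio.create_task(full.play())
    await full
    await full_task

    # Case 2: speech is interrupted before it finishes.
    cut = FakeSpeechHandle(duration=5.0)
    cut_task = asyncio.create_task(cut.play())
    await asyncio.sleep(0)  # let the simulated playout start
    cut.interrupt()
    await cut
    cut_task.cancel()

    return {
        "full": (full.done(), full.interrupted),
        "cut": (cut.done(), cut.interrupted),
        "log": log,
    }


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The point to notice is that `done()` is true in both cases, so you need to check `interrupted` as well to tell a full playout from one that was cut short.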
in that case how to understand that speech has been started? Just started, not finished
Hey there! Want to know how to catch the moment when your agent starts speaking? There are two handy ways to track this in Python. 😊 First up, you can listen for the `speech_created` event on your `AgentSession`. It fires when new agent speech is created, which is around the moment the agent begins talking:
```python
from livekit.agents import AgentSession, SpeechCreatedEvent

@session.on("speech_created")
def on_speech_created(ev: SpeechCreatedEvent):
    print("Agent started speaking")
    # ev.speech_handle provides the speech handle
    # ev.user_initiated tells if it was user-initiated
    # ev.source shows the origin (e.g., 'generate_reply')
```
Not feeling that method? No worries! You can also track `agent_state_changed` events and check when the state switches to `speaking`:
```python
from livekit.agents import AgentStateChangedEvent

@session.on("agent_state_changed")
def on_agent_state_changed(ev: AgentStateChangedEvent):
    if ev.new_state == "speaking":
        print("Agent started speaking")
```
Either way, you'll get an instant heads-up when speech playback kicks off. Pretty cool, right? 🎉
Source: official LiveKit documentation
Sources: Turn detection and interruptions | LiveKit Docs | livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py
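If you need to tell "just started" apart from "finished", one option is to watch transitions into and out of the `speaking` state. Here is a minimal sketch in plain Python. `SpeakingTracker` is a hypothetical helper, not part of LiveKit, and it assumes your handler can pass it both the previous and the new state as strings. The state names other than `speaking` are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class SpeakingTracker:
    """Hypothetical helper: records transitions into and out of 'speaking'."""

    events: list = field(default_factory=list)

    def on_state_changed(self, old_state: str, new_state: str) -> None:
        if new_state == "speaking" and old_state != "speaking":
            # The agent has just begun speaking.
            self.events.append("started")
        elif old_state == "speaking" and new_state != "speaking":
            # The agent has left the speaking state.
            self.events.append("stopped")


if __name__ == "__main__":
    tracker = SpeakingTracker()
    # Illustrative sequence of (old_state, new_state) pairs.
    for old, new in [
        ("listening", "thinking"),
        ("thinking", "speaking"),
        ("speaking", "listening"),
    ]:
        tracker.on_state_changed(old, new)
    print(tracker.events)
```

Inside an `agent_state_changed` handler you would call `tracker.on_state_changed(...)` with the states carried by the event.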