adamant-airport-69140
07/10/2025, 2:28 PM
tall-airline-16756
07/10/2025, 2:29 PM
• For LiveKit Agents (Node.js/TypeScript):
◦ Listen for the VPAEvent speech events: AGENT_STARTED_SPEAKING, AGENT_STOPPED_SPEAKING, AGENT_SPEECH_COMMITTED (speech fully played), and AGENT_SPEECH_INTERRUPTED.
◦ Use the playHandle to check `timePlayed`: speech is only considered “heard” if it played for at least 1.5 seconds (MIN_TIME_PLAYED_FOR_COMMIT).
◦ Leverage TextAudioSynchronizer to determine which parts of the text have actually been played.
• For LiveKit Agents (Python):
◦ Use SpeechHandle and wait_for_playout() to determine when playback completes.
◦ Check speech_handle.interrupted and speech_handle.done() for interruption/completion status.
◦ Access the synchronized transcript position via playback events.
• Avatars: The agent receives an RPC from the avatar server when playback is finished, ensuring internal state is updated.
These approaches let you know with precision whether speech has actually reached the user and been played out in a way that counts as “heard.”
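To make that “heard” threshold concrete, here's a tiny plain-Python sketch. Only the 1.5-second value comes from the SDK's MIN_TIME_PLAYED_FOR_COMMIT constant; the was_heard helper is a hypothetical illustration, not LiveKit API:

```python
# Illustration only: was_heard() is a made-up helper, not LiveKit API.
# The 1.5s threshold mirrors the SDK's MIN_TIME_PLAYED_FOR_COMMIT constant.
MIN_TIME_PLAYED_FOR_COMMIT = 1.5  # seconds

def was_heard(time_played: float) -> bool:
    """Speech counts as 'heard' only if it played at least the threshold."""
    return time_played >= MIN_TIME_PLAYED_FOR_COMMIT

print(was_heard(0.4))  # interrupted almost immediately -> False
print(was_heard(2.0))  # played long enough to commit -> True
```

In practice you'd feed `playHandle.timePlayed` (Node) or the playout duration from the speech handle (Python) into a check like this.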
Source: official LiveKit documentation
Sources: VPAEvent | LiveKit Agents | agents/src/multimodal/multimodal_agent.ts | VoicePipelineAgent | LiveKit Agents | TextAudioSynchronizer | LiveKit Agents | livekit-agents/livekit/agents/voice/agent_activity.py | Bringing AI avatars to voice agents
clean-mouse-53726
07/11/2025, 8:42 AM
tall-airline-16756
07/11/2025, 8:43 AM
In Python, speech playback is managed through the SpeechHandle class. I've got some handy examples that'll make this clear.
• *Basic awaitable pattern:*

handle = session.say("Hello world")
await handle  # waits for speech to finish playing
print("Speech playback complete")

• *Status checking:*

if handle.done():
    print("Speech has finished playing")
if handle.interrupted:
    print("Speech was interrupted")

• *Using a callback:*

def on_speech_complete(h):
    print(f"Speech {h.id} done")

handle = session.say("Processing your request...")
handle.add_done_callback(on_speech_complete)

• *Advanced: ensure speech is played out before moving on:*

# Wait for the current speech to complete before ending a call
current_speech = session.current_speech
if current_speech:
    await current_speech.wait_for_playout()
    print("Agent finished speaking before call ended")
These patterns are super helpful – they'll let you know exactly what's happening with your speech playback, so you can smoothly handle different scenarios. Pretty cool, right?
Source: official LiveKit documentation
Sources: Agent speech and audio | LiveKit Docs | livekit-agents/livekit/agents/voice/events.py
clean-mouse-53726
07/11/2025, 9:01 AM
tall-airline-16756
07/11/2025, 9:02 AM
To catch the moment the agent starts speaking, listen for the speech_created event on your AgentSession. It's like a little notification that pops up right when the agent begins talking:
from livekit.agents import AgentSession, SpeechCreatedEvent

@session.on("speech_created")
def on_speech_created(ev: SpeechCreatedEvent):
    print("Agent started speaking")
    # ev.speech_handle provides the speech handle
    # ev.user_initiated tells whether it was user-initiated
    # ev.source shows the origin (e.g., 'generate_reply')
Not feeling that method? No worries! You can also track agent_state_changed events and check when the state switches to `speaking`:
from livekit.agents import AgentStateChangedEvent

@session.on("agent_state_changed")
def on_agent_state_changed(ev: AgentStateChangedEvent):
    if ev.new_state == "speaking":
        print("Agent started speaking")
Either way, you'll get an instant heads-up when speech playback kicks off. Pretty cool, right? 🎉
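To see the decorator-based event wiring in isolation, here's a toy emitter standing in for AgentSession. ToySession and the dict payloads are made up for this sketch; the real session emits typed event objects:

```python
# Toy event emitter standing in for AgentSession (illustration only).
class ToySession:
    def __init__(self):
        self._handlers = {}

    def on(self, event):
        # Returns a decorator that registers a handler for the event.
        def register(fn):
            self._handlers.setdefault(event, []).append(fn)
            return fn
        return register

    def emit(self, event, payload):
        for fn in self._handlers.get(event, []):
            fn(payload)

session = ToySession()
log = []

@session.on("speech_created")
def on_speech_created(ev):
    log.append("created")

@session.on("agent_state_changed")
def on_state(ev):
    if ev["new_state"] == "speaking":
        log.append("speaking")

session.emit("speech_created", {})
session.emit("agent_state_changed", {"new_state": "speaking"})
print(log)  # ['created', 'speaking']
```

The `@session.on(...)` decorator shape is exactly what you write against the real AgentSession; only the emitter internals here are invented.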
Source: official LiveKit documentation
Sources: Turn detection and interruptions | LiveKit Docs | livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py