salmon-elephant-7819
09/11/2025, 9:50 AM
tall-airline-16756
09/11/2025, 9:51 AM
`session.interrupt()` to pause the current agent speech and `session.input.set_audio_enabled(True/False)` to toggle listening on or off. This works well if you’re aiming for push-to-talk style interactions.
• Automatic Interruption: By enabling Voice Activity Detection (VAD) and setting `allowInterruptions` to true, the agent will automatically pause its speech whenever it detects the user speaking. It’s a handy hands-free option!
• Server-Side Interruption: For real-time models, interruptions are managed on the server side, which kicks in as soon as the server recognizes that the user has started talking.
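To make the push-to-talk flow concrete, here is a minimal sketch of the two calls named above, `session.interrupt()` and `session.input.set_audio_enabled(...)`. `FakeSession` and `FakeInput` are hypothetical stand-ins so the snippet runs on its own; they are not LiveKit classes, and the real session may behave differently (for example, its calls may be asynchronous).

```python
from dataclasses import dataclass, field


@dataclass
class FakeInput:
    """Stand-in for session.input; remembers whether listening is on."""
    audio_enabled: bool = False

    def set_audio_enabled(self, enabled: bool) -> None:
        self.audio_enabled = enabled


@dataclass
class FakeSession:
    """Stand-in for the agent session; NOT the real LiveKit class."""
    input: FakeInput = field(default_factory=FakeInput)
    calls: list = field(default_factory=list)

    def interrupt(self) -> None:
        # The real call stops agent speech; here we only record it.
        self.calls.append("interrupt")


def start_talking(session) -> None:
    """Button pressed: cut off agent speech, then start listening."""
    session.interrupt()
    session.input.set_audio_enabled(True)


def stop_talking(session) -> None:
    """Button released: stop listening."""
    session.input.set_audio_enabled(False)
```

With a real session you would call `start_talking(session)` from whatever handler receives the button press, and `stop_talking(session)` on release.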
If you’d like, I’m happy to walk you through the detailed implementation steps for any of these methods. Just let me know!
salmon-elephant-7819
09/11/2025, 10:35 AM
tall-airline-16756
09/11/2025, 10:36 AM
`DataStreamAudioOutput` in an Avatar worker when a user starts speaking, here’s an approach you can try:
• Use `session.interrupt()`: This will pause any ongoing agent speech and stop the audio stream being sent through `DataStreamAudioOutput`. It also clears the output buffer, which keeps things tidy.
• Detect User Speech: You can use Voice Activity Detection (VAD) or events like `user_started_speaking` to know exactly when the user begins talking and then trigger `session.interrupt()`.
• Monitor Interruption: Keep an eye on the `playback_finished` event on the avatar worker to confirm the stream was successfully interrupted. This event includes an `interrupted` flag for your reference.
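The three bullets above chain together: speech is detected, `session.interrupt()` clears the queued audio, and `playback_finished` reports `interrupted`. Here is a rough, self-contained sketch of that ordering. Everything in it (`Emitter`, `FakeAudioOutput`, `FakeSession`, `PlaybackFinished`) is a hypothetical stand-in, not a LiveKit class, and it fires the events synchronously, which a real avatar worker would not necessarily do.

```python
from collections import defaultdict
from dataclasses import dataclass


class Emitter:
    """Tiny stand-in for an event emitter with an .on() decorator."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event):
        def register(fn):
            self._handlers[event].append(fn)
            return fn
        return register

    def emit(self, event, *args):
        for fn in self._handlers[event]:
            fn(*args)


@dataclass
class PlaybackFinished:
    """Hypothetical payload; the thread only says it has `interrupted`."""
    interrupted: bool


class FakeAudioOutput(Emitter):
    """Stand-in for the output that streams audio to the avatar worker."""

    def __init__(self):
        super().__init__()
        self.buffer = ["chunk-1", "chunk-2"]  # queued agent audio

    def clear_buffer(self):
        self.buffer.clear()
        self.emit("playback_finished", PlaybackFinished(interrupted=True))


class FakeSession(Emitter):
    """Stand-in session whose interrupt() clears the output buffer."""

    def __init__(self, audio):
        super().__init__()
        self.audio = audio

    def interrupt(self):
        self.audio.clear_buffer()


audio = FakeAudioOutput()
session = FakeSession(audio)
log = []


@session.on("user_started_speaking")
def on_user_started():
    session.interrupt()  # stop agent audio as soon as speech is detected


@audio.on("playback_finished")
def on_finished(ev):
    log.append("interrupted" if ev.interrupted else "completed")


session.emit("user_started_speaking")  # simulate the user starting to talk
```

After the simulated event, `audio.buffer` is empty and `log` holds a single `"interrupted"` entry, which is the confirmation the last bullet describes.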
Just a quick note: `DataStreamAudioOutput` doesn’t support manual pause/resume, so using `session.interrupt()` is the best way to handle this. If you’d like, I’m here to help with detailed implementation steps or any other questions you might have!
salmon-elephant-7819
09/11/2025, 10:38 AM
tall-airline-16756
09/11/2025, 10:39 AM
`DataStreamAudioOutput` in an Avatar worker when a user begins speaking:
1. Initialize the Avatar Worker
from livekit.agents.voice.avatar import DataStreamAudioOutput

# Initialize DataStreamAudioOutput
avatar_audio_output = DataStreamAudioOutput(
    room=room,  # Your LiveKit room instance
    destination_identity="avatar_worker",  # Identity of the avatar worker
    sample_rate=16000,  # Set the sample rate
)

# Assign the audio output to the session
session.output.audio = avatar_audio_output
2. Detect User Speech
Here, you can use Voice Activity Detection (VAD) or an event listener to catch when the user starts speaking:
@agent.on("user_started_speaking")
def handle_user_speaking():
    session.interrupt()  # Stop the audio stream immediately
    session.input.set_audio_enabled(True)  # Enable user input
3. Handle Playback Interruption
It's good to listen for the `playback_finished` event to confirm the stream was indeed interrupted:
@session.output.audio.on("playback_finished")
def on_playback_finished(event):
    if event.interrupted:
        print("Audio stream was interrupted by user speech.")
4. Stop and Resume Audio Stream
Once the user stops speaking, you can disable input and process the audio they provided:
@agent.on("user_stopped_speaking")
def handle_user_stopped():
    session.input.set_audio_enabled(False)  # Stop listening to user
    session.commit_user_turn()  # Process the user’s input
This approach helps ensure the audio stream to the avatar worker stops right when the user starts talking. Since `DataStreamAudioOutput` has no pause/resume, the interrupted speech isn’t picked up again; the agent replies with a new turn once the user is done. If anything isn’t clear or you’d like more examples, just let me know. I’m here to help!