How can I stop processing audio already generated in...
# ask-ai
s
How can I stop processing audio already generated in the data stream if the user starts speaking?
t
Thanks for reaching out, Ankit. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
I’m glad you asked about stopping audio processing when a user starts speaking! With LiveKit, you have some great options to manage interruptions smoothly. Here are the main approaches you can try:
• Manual Control: You can use `session.interrupt()` to stop the current agent speech and `session.input.set_audio_enabled(True/False)` to toggle listening on or off. This works really well if you’re aiming for push-to-talk style interactions.
• Automatic Interruption: By enabling Voice Activity Detection (VAD) and setting `allow_interruptions` to true, the agent will automatically stop its speech whenever it detects the user speaking. It’s a handy hands-free option! There’s a small configuration sketch right below this list.
• Server-Side Interruption: For realtime models, interruptions are managed on the server side, which kicks in as soon as the server recognizes that the user has started talking.
If you’d like, I’m happy to walk you through the detailed implementation steps for any of these methods, just let me know!
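To make the automatic option more concrete, here’s a minimal sketch, assuming the Python livekit-agents 1.x API with the Silero VAD plugin; the STT/LLM/TTS plugins shown are just placeholder choices, so swap in whatever you actually use:

```python
from livekit import agents
from livekit.agents import Agent, AgentSession
from livekit.plugins import deepgram, openai, silero  # placeholder plugins, installed separately


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    session = AgentSession(
        vad=silero.VAD.load(),     # VAD detects when the user starts speaking
        stt=deepgram.STT(),        # placeholder STT plugin
        llm=openai.LLM(),          # placeholder LLM plugin
        tts=openai.TTS(),          # placeholder TTS plugin
        allow_interruptions=True,  # stop agent speech as soon as user speech is detected
    )

    await session.start(
        agent=Agent(instructions="You are a helpful voice assistant."),
        room=ctx.room,
    )

    # Manual control (push-to-talk style), if you prefer to drive interruptions yourself:
    # session.interrupt()                     # stop the current agent speech
    # session.input.set_audio_enabled(False)  # stop listening to the user


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
```

With `allow_interruptions=True`, you can still call `session.interrupt()` manually on top of the automatic behavior.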
s
This is for the avatar worker, and I am using `DataStreamAudioOutput` to publish audio from the avatar to the backend worker. The problem is that we don’t know when the audio stream should stop passing audio if the user starts speaking.
t
To gracefully stop `DataStreamAudioOutput` in an avatar worker when a user starts speaking, here’s a helpful approach you can try:
• Use `session.interrupt()`: This will pause any ongoing agent speech and stop the audio stream being sent through `DataStreamAudioOutput`. It also clears the output buffer, which keeps things tidy.
• Detect User Speech: You can use Voice Activity Detection (VAD) or events like `user_started_speaking` to know exactly when the user begins talking and then trigger `session.interrupt()`.
• Monitor Interruption: Keep an eye on the `playback_finished` event reported by the avatar worker to confirm the stream was successfully interrupted. This event includes an `interrupted` flag for your reference. There’s a sketch of the avatar-worker (receiving) side below.
Just a quick note: `DataStreamAudioOutput` doesn’t support manual pause/resume, so using `session.interrupt()` is the best way to handle this. If you’d like, I’m here to help with detailed implementation steps or any other questions you might have!
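Since your setup is avatar-worker specific, here’s also a minimal sketch of what the receiving side might look like, assuming your worker uses the `DataStreamAudioReceiver` counterpart from the livekit-agents avatar module. The import path, the `clear_buffer` event name, and the `notify_playback_finished()` call are assumptions based on recent SDK versions, so please double-check them against the version you have installed:

```python
import inspect

from livekit import rtc
# Assumption: import path per recent livekit-agents releases; adjust for your version.
from livekit.agents.voice.avatar import AudioSegmentEnd, DataStreamAudioReceiver


async def run_avatar_audio(room: rtc.Room) -> None:
    # Receives the audio frames that DataStreamAudioOutput publishes from the agent side.
    audio_receiver = DataStreamAudioReceiver(room)
    await audio_receiver.start()

    interrupted = False

    @audio_receiver.on("clear_buffer")
    def _on_clear_buffer() -> None:
        # Fired when session.interrupt() clears the output buffer on the agent side;
        # this is the signal to stop playing audio the worker has already received.
        nonlocal interrupted
        interrupted = True

    playback_position = 0.0
    async for frame in audio_receiver:
        if isinstance(frame, AudioSegmentEnd):
            # End of one agent utterance: report how much audio was actually played and
            # whether playback was cut short, so the agent side can emit playback_finished.
            result = audio_receiver.notify_playback_finished(
                playback_position=playback_position, interrupted=interrupted
            )
            if inspect.isawaitable(result):
                await result  # some versions implement this as a coroutine
            playback_position = 0.0
            interrupted = False
            continue
        if interrupted:
            continue  # drop the remainder of the interrupted segment
        playback_position += frame.samples_per_channel / frame.sample_rate
        # ...feed `frame` to your avatar renderer / playout here (hypothetical)...
```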
s
Yes, please give me the implementation.
t
Here’s a helpful example to guide you through stopping `DataStreamAudioOutput` in an avatar worker when a user begins speaking:

1. Initialize the Audio Output to the Avatar Worker

```python
# Note: the import path below is an assumption for recent livekit-agents versions;
# adjust it to match your installed SDK.
from livekit.agents.voice.avatar import DataStreamAudioOutput

# Initialize DataStreamAudioOutput
avatar_audio_output = DataStreamAudioOutput(
    room=room,  # your LiveKit room instance
    destination_identity="avatar_worker",  # identity of the avatar worker participant
    sample_rate=16000,  # sample rate of the published audio
)

# Assign the audio output to the session
session.output.audio = avatar_audio_output
```
2. Detect User Speech

Here, you can use Voice Activity Detection (VAD) or an event listener to catch when the user starts speaking:

```python
@agent.on("user_started_speaking")
def handle_user_speaking():
    session.interrupt()  # Stop the audio stream immediately
    session.input.set_audio_enabled(True)  # Enable user input
```
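One caveat: depending on your livekit-agents version, `user_started_speaking` / `user_stopped_speaking` may not be exposed as events on the agent object. In the agents 1.x API, to the best of my knowledge, the equivalent signal is the `user_state_changed` event emitted by `AgentSession`, so a fallback sketch (reusing the `session` from step 1) could look like this:

```python
@session.on("user_state_changed")
def on_user_state_changed(ev):
    # ev.new_state is expected to be "speaking", "listening", or "away" in agents 1.x
    if ev.new_state == "speaking":
        # The user started talking: stop the agent's speech, which also clears
        # the DataStreamAudioOutput buffer feeding the avatar worker.
        session.interrupt()
```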
3. Handle Playback Interruption

It's good to listen for the `playback_finished` event to confirm the stream was indeed interrupted:

```python
@session.output.audio.on("playback_finished")
def on_playback_finished(event):
    if event.interrupted:
        print("Audio stream was interrupted by user speech.")
```
4. Stop and Resume the Audio Stream

Once the user stops speaking, you can disable input and process the audio they provided:

```python
@agent.on("user_stopped_speaking")
def handle_user_stopped():
    session.input.set_audio_enabled(False)  # Stop listening to the user
    session.commit_user_turn()  # Process the user’s input
```
This approach helps ensure the audio stream to the avatar worker pauses right when the user starts talking and then smoothly resumes once they're done. If anything isn’t clear or you’d like more examples, just let me know—I’m here to help!