# ask-ai
Thanks for reaching out, Rishabh. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
You can detect user speech events on a Python server using LiveKit's built-in voice activity detection (VAD) and event system with the Realtime API. The primary workflow involves listening for speech-related events emitted from your `AgentSession` instance:

- **Speech started/stopped events:** The session emits `input_speech_started` and `input_speech_stopped` events as users begin and finish speaking. These events can be handled with event listeners in your Python code.
- **User input transcription events:** Use the `user_input_transcribed` event to get transcriptions of what the user said, whether interim or final:
```python
from livekit.agents import UserInputTranscribedEvent

# Handlers registered with session.on() must be synchronous;
# schedule coroutines with asyncio.create_task() if you need async work.
@session.on("user_input_transcribed")
def on_user_input(event: UserInputTranscribedEvent):
    if event.is_final:
        print(f"User said: {event.transcript}")
    else:
        print(f"User is saying: {event.transcript}")
```
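The speech started/stopped events follow the same decorator pattern. Since this sketch can't assume a live LiveKit session, it uses a hypothetical stand-in emitter (`FakeSession` is not a LiveKit class) purely to illustrate how the listeners fire in order during a user turn:

```python
# Minimal stand-in for the session's event API, to show the listener
# pattern without a live connection. Event names mirror those above;
# the emitter itself is hypothetical.
from collections import defaultdict
from typing import Callable

class FakeSession:
    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_name: str) -> Callable:
        # Returns a decorator that registers the handler for event_name.
        def register(fn):
            self._handlers[event_name].append(fn)
            return fn
        return register

    def emit(self, event_name: str, payload=None):
        for fn in self._handlers[event_name]:
            fn(payload)

session = FakeSession()
events_seen = []

@session.on("input_speech_started")
def on_started(_):
    events_seen.append("started")

@session.on("input_speech_stopped")
def on_stopped(_):
    events_seen.append("stopped")

# Simulate one user turn: speech begins, then ends.
session.emit("input_speech_started")
session.emit("input_speech_stopped")
print(events_seen)  # → ['started', 'stopped']
```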
- **VAD configuration:** Server-side VAD (enabled by default) determines when a user starts and stops speaking. You can configure its parameters (such as sensitivity and silence duration) when setting up your session if you need finer control.

You do not need external webhooks for this purpose; LiveKit handles speech event detection and raises the relevant events directly in your Python server process via the realtime session object. Let me know if you need specific guidance for your session setup or for customizing VAD parameters.

Sources:
- Events and error handling | LiveKit Docs
- livekit-agents/livekit/agents/vad.py
- OpenAI Realtime API integration guide | LiveKit Docs
- Turn detection | LiveKit Docs
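To give a feel for what a silence-duration parameter controls, here is a toy end-of-speech detector over per-frame speech probabilities. This is not LiveKit's VAD implementation, and the threshold, frame length, and silence window are made-up illustrative numbers, not LiveKit defaults:

```python
def end_of_speech_index(speech_probs, frame_ms=30,
                        activation_threshold=0.5,
                        min_silence_ms=550):
    """Return the frame index where speech is considered finished:
    the first frame of a silent run that lasts at least min_silence_ms
    after speech was detected. Returns None if that never happens."""
    needed = min_silence_ms // frame_ms  # silent frames required
    silent_run = 0
    speaking = False
    for i, p in enumerate(speech_probs):
        if p >= activation_threshold:
            speaking = True
            silent_run = 0
        elif speaking:
            silent_run += 1
            if silent_run >= needed:
                return i - needed + 1  # first frame of the silence run
    return None

# 10 speech frames (300 ms) followed by a long silence:
probs = [0.9] * 10 + [0.1] * 30
print(end_of_speech_index(probs))  # → 10
```

Raising `min_silence_ms` makes the detector wait longer before declaring the turn over (fewer false end-of-turn cuts mid-sentence, but slower responses), which is the trade-off the session-level VAD settings tune.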