what is server side and client side in audio processing LiveKit Community #ask-ai

# ask-ai

big-controller-60015

07/14/2025, 10:14 AM

what is server side and client side in audio processing

tall-airline-16756

07/14/2025, 10:15 AM

Thanks for reaching out, Mohammed. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.

tall-airline-16756

07/14/2025, 10:15 AM

Server-side audio processing means that audio is processed on LiveKit servers or agents after it is sent from the client. This typically includes voice activity detection (VAD), speech-to-text, language model processing, and noise cancellation. Server-side VAD is enabled by default, and you can fine-tune its settings to match your needs. If noise cancellation runs on the server, disable it on the client to avoid unexpected issues.

Client-side audio processing happens directly on the user's device before audio is transmitted. This lets you manage features like VAD manually, for example by implementing a push-to-talk interface where you control when the user is speaking. Client-side processing gives you more manual control, but server-side is generally preferred for comprehensive and consistent processing, especially for AI-powered features.

The main difference is where the processing occurs: on the user's device before sending, or on the server after receiving. This affects control, quality, and which features are easiest to implement. Let me know if you'd like specific details or configuration guidance for your application.

Sources: Voice Assistants | livekit/livekit_composite | DeepWiki | Turn detection | LiveKit Docs | AI noise cancellation | LiveKit Docs | Noise & echo cancellation | LiveKit Docs | Live Conversations with AI using ChatGPT and WebRTC
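To make the VAD idea concrete, here is a minimal sketch of an energy-threshold voice activity detector in plain Python. It is not LiveKit's VAD, and it runs the same way whether you place it on the client or the server. The function names, the 160-sample frame size, and the 0.02 threshold are illustrative assumptions, not LiveKit settings.

```python
import math


def frame_rms(frame):
    """Root-mean-square level of one frame of float samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(samples, frame_size=160, threshold=0.02):
    """Return one boolean per full frame: True where the RMS exceeds the threshold.

    frame_size and threshold are hypothetical defaults chosen for this sketch.
    """
    flags = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        flags.append(frame_rms(frame) > threshold)
    return flags


# Synthetic input at 8 kHz: two silent frames, two frames of a 440 Hz tone,
# then one more silent frame.
SAMPLE_RATE = 8000
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(320)]
signal = silence * 2 + tone + silence

flags = detect_speech(signal)
print(flags)
```

A real detector would usually add smoothing so that short pauses inside a sentence do not flip the state, which is one reason a tuned server-side VAD is often the simpler choice.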

big-controller-60015

07/14/2025, 10:19 AM

can client side processing be done for telephony calls

tall-airline-16756

07/14/2025, 10:20 AM

Hey there! Here are the client-side audio processing options LiveKit offers that you can consider for telephony calls. They apply on the side of the call where a LiveKit client SDK captures the audio:
• Processing delegates (like the AudioCustomProcessingDelegate in the Swift SDK) let you modify audio in real time before it's sent, for example to add effects or custom filtering.
• Client-side voice activity detection (VAD) lets you manage when audio gets transmitted, which makes push-to-talk features straightforward.
• Noise cancellation can run on the client, but if you're also using LiveKit's server-side noise cancellation (through Voice Pipeline Agents), disable client-side processing so the two don't conflict.
You choose what happens on the client, on the server, or a mix of both. If you'd like advice for your specific setup, let me know and I'll help you work out which platform and approach fits best.

Sources: Sources/LiveKit/Protocols/AudioCustomProcessingDelegate.swift | Speech & Audio Processing | livekit/livekit_composite | DeepWiki | Turn detection | LiveKit Docs | Voice Assistants | livekit/livekit_composite | DeepWiki
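As a rough illustration of the options above, the sketch below shows a push-to-talk gate combined with a simple processing step that runs before audio leaves the device. It is a language-neutral idea written in plain Python, not the Swift delegate API: `PushToTalkGate` and `apply_gain` are hypothetical names, and frames are assumed to be lists of float samples in [-1.0, 1.0].

```python
class PushToTalkGate:
    """Hypothetical client-side gate: audio passes only while the button is held."""

    def __init__(self):
        self.talking = False

    def press(self):
        self.talking = True

    def release(self):
        self.talking = False

    def process(self, frame):
        """Return a copy of the frame while pressed, otherwise silence of equal length."""
        if self.talking:
            return list(frame)
        return [0.0] * len(frame)


def apply_gain(frame, gain):
    """Example custom processing step: scale samples and clamp to [-1.0, 1.0]."""
    return [max(-1.0, min(1.0, s * gain)) for s in frame]


gate = PushToTalkGate()
frame = [0.1, -0.2, 0.3]

# Button not held: the frame is replaced by silence.
muted = gate.process(apply_gain(frame, 2.0))

# Button held: the processed frame is passed through for transmission.
gate.press()
sent = gate.process(apply_gain(frame, 2.0))
gate.release()

print(muted, sent)
```

Sending silence rather than dropping frames keeps the frame timing constant in this sketch. An actual client might instead mute or unpublish the track, depending on the SDK in use.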
