# ask-ai
g
I am using AWS for stt, llm and tts and get very choppy audio output. Agent is running locally. If I swap the stt to another provider audio isn't choppy anymore
t
Thanks for reaching out, Oliver. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Could you share a bit more about your setup? Just to get a clearer picture:
• Are you using AWS services for all three parts (STT, LLM, and TTS), or just some of them?
• Which AWS service are you using for STT?
• Have you noticed any error messages or logs when the audio gets choppy?
• Do you know what the network latency is between your local agent and the AWS services?
Having these details will really help us figure out what’s going on and find the best solution for you.
g
All three. The stt is AWS transcribe. No errors. Don't know the latency, feel free to help find that out
t
Hey there! If you’re noticing choppy audio when using AWS Transcribe for STT, it can be due to a few different things. Let’s walk through some steps that might help smooth things out:
• Check Your Network Quality: Make sure you’re using WebRTC for your client connections. Sometimes, if the network isn’t great, packets can get lost, which leads to that choppy sound.
• Turn On Noise and Echo Cancellation: If you’re on LiveKit Cloud, try using their enhanced noise cancellation feature. It can really help clean up the input audio and might even make the output feel more in sync.
• Look at How Audio is Streamed: It’s best if your TTS system streams audio bit by bit (for example, sentence by sentence) instead of sending big chunks all at once. That usually makes things smoother.
• Keep an Eye on Session Stability: AWS Transcribe closes sessions that are idle for 15 seconds, so make sure your sessions restart properly without interrupting the audio flow.
• Double-Check Your Audio Settings: Verify that your sample rate and encoding match what’s expected (usually 24,000 Hz and PCM encoding).
If you want, I’m happy to dig in deeper and review your agent code or configuration with you. Just let me know how you’re handling TTS audio output (streaming or bulk), whether you’re using LiveKit Cloud, and a bit about your client’s network setup (mobile, web, desktop, etc.). I’m here to help!
Sources: OpenAI and LiveKit partner to turn Advanced Voice into an API | SIP troubleshooting guide | LiveKit Docs | Noise & echo cancellation | LiveKit Docs | livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/experimental/realti... | Live Conversations with AI using ChatGPT and WebRTC
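For the latency question, a rough first measurement is the TCP connect time from the machine running the agent to the regional AWS endpoint. The sketch below uses only the Python standard library. The helper names `tcp_connect_ms` and `summarize` are hypothetical, and it measures connection setup only, not end-to-end transcription latency.

```python
import socket
import statistics
import time


def tcp_connect_ms(host: str, port: int = 443, attempts: int = 5,
                   timeout: float = 3.0) -> list[float]:
    """Return the connect time in milliseconds for each attempt."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        # create_connection covers DNS resolution plus the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples: list[float]) -> dict[str, float]:
    """Median approximates typical latency; min/max spread hints at jitter."""
    return {
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
    }
```

For example, `summarize(tcp_connect_ms("transcribestreaming.us-east-1.amazonaws.com"))`, where the hostname is an assumption about the region in use and should be replaced with your own. A wide gap between `min_ms` and `max_ms` would suggest jitter on the local network rather than a problem in the agent itself.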
g
The client is Android. The tts audio output is streamed as far as I know (I'm using defaults). Not sure if I'm using LiveKit Cloud?
t
Hey there! To see if you’re using LiveKit Cloud, just take a quick look at your LiveKit server URL’s hostname. If it ends with .livekit.cloud or .livekit.run, then you’re all set with LiveKit Cloud. Also, when you connect to the server, the response might mention the edition as "Cloud" and show a region. If you want, I’m here to help you figure this out in your setup or answer any other questions you might have about the configuration. Just let me know! Sources: livekit-android-sdk/src/main/java/io/livekit/android/room/RegionUrlProvider... | src/room/utils.ts | lib/src/support/region_url_provider.dart | livekit-api/src/signal_client/region.rs
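The hostname check described above can be sketched in a few lines of Python. The function name `is_livekit_cloud` and the example URLs are hypothetical; only the two suffixes come from the explanation above.

```python
from urllib.parse import urlparse

# Hostname suffixes that indicate LiveKit Cloud, per the explanation above.
CLOUD_SUFFIXES = (".livekit.cloud", ".livekit.run")


def is_livekit_cloud(url: str) -> bool:
    """Return True if the server URL's hostname ends with a Cloud suffix."""
    host = urlparse(url).hostname or ""
    return host.endswith(CLOUD_SUFFIXES)
```

Calling it with the server URL from your agent configuration (for instance a hypothetical `wss://my-project.livekit.cloud`) returns `True` for Cloud, while a self-hosted address such as `ws://localhost:7880` returns `False`.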
g
Ok yeah. I am using it
t
Hey there! I totally get how frustrating choppy audio can be when using your Android client with LiveKit Cloud and AWS Transcribe. Here are some friendly tips to help smooth things out:
• Enable Audio RED: This helps by sending extra audio packets, so if some get lost, your audio stays clear.
• Enable WebRTC Audio Processing: Make sure echo cancellation and noise suppression are turned on in the Android SDK using `AudioCaptureOptions`. It really makes a difference.
• Use LiveKit Cloud’s Enhanced Noise Cancellation: This handy feature is free and can boost your audio quality quite a bit.
• Ensure Audio Sample Rate Consistency: AWS Transcribe defaults to 24,000 Hz. Matching your Android audio track to this helps avoid any weird resampling glitches.
• Review Audio Routing: Using `AudioOptions` in the Android SDK to manage audio focus and routing can help prevent any unexpected interruptions.
If you’re still running into issues, it might help to gather some network metrics and audio logs from your Android client so we can dig deeper. Just let me know if you want a hand with any of these steps. I’m here to help!
Sources: SIP troubleshooting guide | LiveKit Docs | OpenAI and LiveKit partner to turn Advanced Voice into an API | Codecs and more | LiveKit Docs | livekit-android-sdk/src/main/java/io/livekit/android/LiveKitOverrides.kt | Noise & echo cancellation | LiveKit Docs
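To make the sample-rate point concrete, here is a small Python sketch of the arithmetic involved. The 24,000 Hz figure is the one quoted above; the 48,000 Hz source rate and the 20 ms frame length are assumptions chosen for illustration, and both function names are hypothetical.

```python
def samples_per_frame(sample_rate_hz: int, frame_ms: int = 20) -> int:
    """Number of samples per channel in one audio frame of frame_ms milliseconds."""
    return sample_rate_hz * frame_ms // 1000


def needs_resampling(source_rate_hz: int, target_rate_hz: int) -> bool:
    """A rate mismatch means some stage must resample before the STT sees the audio."""
    return source_rate_hz != target_rate_hz
```

With these assumptions, a 20 ms frame holds 960 samples at 48,000 Hz but 480 samples at 24,000 Hz, so a track captured at one rate and consumed at the other passes through a resampler somewhere in the pipeline. Logging both rates in the agent is a cheap way to confirm whether that is happening.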