I am facing an issue with Deepgram text to speech ...
# ask-ai
e
I am facing an issue with the Deepgram text-to-speech model: it produces a distorted voice at the end of sentences, whether I use the .say() method or normal LLM chunks. How can I solve this problem?
t
Thanks for reaching out, Muhammad. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
To address the issue of distorted voice at the end of sentences with the Deepgram text-to-speech model, here are some steps you can take:
• Check TTS Configuration: Ensure the model, encoding, and sample rate are properly configured. For example: `tts = deepgram.TTS( model="aura-2-andromeda-en", encoding="linear16", sample_rate=24000 )`
• Use SSML Tags: Add pauses or adjust prosody at sentence boundaries using SSML. The text is then spoken as usual, for example: `text = "Hello, how are you?"` followed by `await session.say(text)`
• Handle Text Chunking: Ensure that text is processed correctly to avoid issues at sentence boundaries. Avoid unnecessary splitting of text into chunks.
• Test with Custom Pronunciation: If specific words cause distortion, use SSML or phoneme tags to define pronunciation.
• Verify Audio Handling: Ensure the audio stream is properly buffered and handled on the client side.
If the issue persists, could you provide more details such as:
• Are you using a specific Deepgram TTS model?
• Does the distortion occur consistently with certain sentences?
• Are you applying any custom text processing or SSML tags?
• Have you tried different audio encodings or sample rates?
This information will help narrow down the cause and suggest more targeted solutions.
Sources: livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/tts.py | Agent speech and audio | LiveKit Docs | Turn detection and interruptions | LiveKit Docs
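As a rough illustration of the "Verify Audio Handling" step, the sketch below checks one general cause of clicks at the end of an utterance: a linear16 buffer that ends on a partial sample or is cut off at a non-zero amplitude. It is not part of the Deepgram or LiveKit APIs and is not a confirmed fix for this thread. The helper names `align_linear16` and `fade_out_tail` are hypothetical, and the code assumes mono, little-endian, 16-bit PCM that you have captured yourself.

```python
import array
import sys

BYTES_PER_SAMPLE = 2  # linear16 = signed 16-bit PCM (assumed mono)


def align_linear16(pcm: bytes) -> bytes:
    """Drop a trailing odd byte so the buffer holds only whole 16-bit samples."""
    return pcm[: len(pcm) - (len(pcm) % BYTES_PER_SAMPLE)]


def fade_out_tail(pcm: bytes, sample_rate: int = 24000, fade_ms: int = 10) -> bytes:
    """Linearly fade the last `fade_ms` milliseconds of the buffer to silence.

    Hypothetical helper for experiments: it shows whether an abrupt cut-off
    at the end of the audio is what you are hearing.
    """
    samples = array.array("h")
    samples.frombytes(align_linear16(pcm))
    if sys.byteorder == "big":
        samples.byteswap()  # assumption: the PCM data is little-endian

    n = min(len(samples), sample_rate * fade_ms // 1000)
    start = len(samples) - n
    for i in range(n):
        gain = (n - 1 - i) / n  # falls from just under 1.0 to exactly 0.0
        samples[start + i] = int(samples[start + i] * gain)

    if sys.byteorder == "big":
        samples.byteswap()
    return samples.tobytes()
```

If the artifact disappears after running captured audio through `fade_out_tail`, the problem is more likely in how the tail of the stream is buffered or truncated than in the model or voice settings.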
e
I am using the default TTS model, Andromeda. Yes, the distortion occurs every 2-3 turns, not on every reply, which is why it's confusing. I am not applying custom SSML, since Deepgram suggests it is not really needed. I have tried different audio encodings and sample rates, but the issue persists.
t
I don't have the answer you're looking for. You could also try asking your question:
• in one of the other Slack channels or
• to https://deepwiki.com/livekit/livekit_composite which is trained on all LiveKit source code
If you find the answer, please post it here to help others!
j
I am having this problem also.