how can i provide SSML in TTS in livekit. can you ...
# ask-ai
e
How can I provide SSML in TTS in LiveKit? Can you give me some examples? I am using Google for TTS.
t
Thanks for reaching out, Vivek. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
I'm happy to help you get started with using SSML (Speech Synthesis Markup Language) in Google TTS through LiveKit! To do this, you'll want to configure your TTS instance with enable_ssml=True. Just a quick note: SSML works only with non-streaming (batch) synthesis, so make sure use_streaming is set to False. Here's a simple example to show you how to enable SSML:
from livekit.plugins import google

# Create TTS with SSML enabled (streaming must be disabled)
tts = google.TTS(
    enable_ssml=True,
    use_streaming=False,  # Required when using SSML
    voice_name="en-US-Standard-A",
    language="en-US"
)
Once you have SSML enabled, you can include SSML tags in your input text to add expressive speech effects. For instance:
ssml_text = """
<speak>
  Hello, welcome to <emphasis level="strong">LiveKit</emphasis>.
  This text is <prosody rate="slow" pitch="+5%">spoken slowly with higher pitch</prosody>.
  Say <break time="500ms"/> to add a pause.
  This word is <say-as interpret-as="spell-out">livekit</say-as>.
</speak>
"""

# Use the TTS to synthesize SSML
audio_stream = tts.synthesize(ssml_text)
Here are some common SSML tags you might find useful:
• <prosody>: controls rate, pitch, and volume
• <break>: inserts pauses
• <emphasis>: emphasizes words
• <say-as>: controls how text is pronounced
• <sub>: substitutes a pronunciation
For a complete list of supported SSML tags, you can check out the Google Cloud TTS SSML reference. It's a great resource! If you have any questions or need more help along the way, please don't hesitate to reach out. I'm here to support you!
Sources: Google Cloud TTS integration guide | LiveKit Docs | livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/tts.py
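If you build SSML from dynamic text, one thing to watch for is that characters such as & and < must be escaped, or the markup becomes invalid. Here is a minimal sketch of how the tags above could be assembled safely. The helper names (text, element, pause, emphasis, spell_out, speak) are hypothetical and are not part of LiveKit or Google's API; only Python's standard library is used.

```python
from xml.sax.saxutils import escape, quoteattr


def text(value: str) -> str:
    """Escape plain text so characters like & and < cannot break the markup."""
    return escape(value)


def element(tag: str, body: str = "", **attrs: str) -> str:
    """Render one SSML element; `body` must already be escaped text or markup."""
    rendered = "".join(
        f" {name.replace('_', '-')}={quoteattr(val)}" for name, val in attrs.items()
    )
    if not body:
        return f"<{tag}{rendered}/>"
    return f"<{tag}{rendered}>{body}</{tag}>"


# Hypothetical convenience wrappers for the tags listed above.
def pause(ms: int) -> str:
    return element("break", time=f"{ms}ms")


def emphasis(words: str, level: str = "strong") -> str:
    return element("emphasis", text(words), level=level)


def spell_out(words: str) -> str:
    return element("say-as", text(words), interpret_as="spell-out")


def speak(*parts: str) -> str:
    """Wrap already-rendered parts in the required <speak> root element."""
    return element("speak", "".join(parts))


ssml_text = speak(
    text("Q&A with "),
    emphasis("LiveKit"),
    pause(500),
    spell_out("tts"),
)
print(ssml_text)
```

The resulting ssml_text string is what you would hand to the TTS instance in place of a hand-written SSML string.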
e
If streaming is off, wouldn't batch synthesis be slow compared to streaming?
t
I'm glad you asked about streaming and batch synthesis! Streaming synthesis tends to feel quicker for real-time use because it processes and delivers audio incrementally, so playback can begin before the full response has been synthesized. Batch synthesis takes longer to start because it processes the whole text before any audio plays, but it is the mode that supports SSML here. If your goal is quick, low-latency interaction, streaming synthesis is the better fit. If you need to customize your speech output with SSML, batch synthesis is the way to go, even if it is a little slower. Please feel free to reach out if you want more personalized tips or help figuring out what works best for your needs. I'm here to help!
Sources: livekit-plugins/livekit-plugins-playai/livekit/plugins/playai/tts.py | Voice Assistants | livekit/livekit_composite | DeepWiki | Live Conversations with AI using ChatGPT and WebRTC
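Since batch synthesis waits for the whole text, one possible way to soften the delay is to send shorter pieces, for example one sentence per request, so the first audio can start sooner. The sketch below only shows the text-splitting side, under the assumption that each chunk would then be synthesized separately. The function names are hypothetical, the regex is a naive sentence splitter, and nothing here is a LiveKit API.

```python
import re
from xml.sax.saxutils import escape


def split_sentences(text: str) -> list[str]:
    """Naively split on whitespace that follows ., ! or ? (an assumption;
    abbreviations such as "e.g." will be split incorrectly)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def to_ssml_chunks(text: str) -> list[str]:
    """Wrap each sentence in its own <speak> document, escaping special characters."""
    return [f"<speak>{escape(sentence)}</speak>" for sentence in split_sentences(text)]


chunks = to_ssml_chunks("Hello there. R&D is ready! Shall we start?")
for chunk in chunks:
    # Each chunk would be passed to the batch TTS call on its own.
    print(chunk)
```

The trade-off is more requests and possibly less natural intonation across sentence boundaries, so it is worth testing with your own voice and content.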
e
I tried the options you suggested for enabling SSML, but when it speaks, it reads out the SSML tags as well.
t
It sounds like Google TTS is reading SSML tags aloud, which usually means SSML support isn't properly enabled. To fix this, please make sure you've set enable_ssml=True when you initialize your TTS instance. Here's a quick example to guide you:
from livekit.plugins import google

tts = google.TTS(
    enable_ssml=True,
    use_streaming=False,  # Required for SSML
    voice_name="en-US-Standard-A",
    language="en-US"
)
Also, just a heads-up: SSML works only in non-streaming mode. So if you have use_streaming=True, the SSML tags won't be processed and will be read out loud as plain text instead. Please give this a try, and feel free to reach out if you have any more questions or need a hand with anything else. I'm here to help!
Sources: livekit-plugins/livekit-plugins-google/livekit/plugins/google/beta/gemini_t... | ElevenLabs TTS integration guide | LiveKit Docs
It sounds like the SSML tags are being read aloud, which usually means SSML support isn't fully enabled in your TTS setup. To get this working smoothly, please make sure you've set enable_ssml=True when you initialize the Google TTS object. Here's a quick example to guide you:
from livekit.plugins import google

tts = google.TTS(
    enable_ssml=True,
    use_streaming=False,  # Required for SSML support
    voice_name="en-US-Standard-A",
    language="en-US"
)
Also, it’s important to check that your input text is valid SSML without any malformed tags. It should look something like this:
ssml_text = """
<speak>
  Hello, welcome to <emphasis level="strong">LiveKit</emphasis>.
</speak>
"""

# Use the TTS to synthesize SSML
audio_stream = tts.synthesize(ssml_text)
If you've already enabled SSML and you're still seeing this issue, please double-check that use_streaming is set to False, since SSML isn't supported in streaming mode. Feel free to reach out if you need any more help. I'm here for you!
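To rule out malformed markup before sending anything to the TTS, a quick local check can help. This is a minimal sketch using only Python's standard XML parser. The function name check_ssml is hypothetical, and the check only confirms that the text is well-formed XML with a <speak> root; it does not verify that Google supports every tag or attribute used.

```python
import xml.etree.ElementTree as ET


def check_ssml(text: str) -> list[str]:
    """Return a list of problems found; an empty list means the basic checks passed."""
    try:
        root = ET.fromstring(text.strip())
    except ET.ParseError as exc:
        # Unclosed tags, unescaped & or <, or plain text without a root element.
        return [f"not well-formed XML: {exc}"]

    # Drop an XML namespace prefix such as "{http://...}speak" if one is present.
    tag = root.tag.rsplit("}", 1)[-1]
    if tag != "speak":
        return [f"root element is <{tag}>, expected <speak>"]
    return []


good = '<speak>Hello <break time="500ms"/> world.</speak>'
bad = "<speak>Hello <emphasis>world</speak>"  # <emphasis> is never closed

print(check_ssml(good))
print(check_ssml(bad))
```

If the check passes and the tags are still read aloud, the cause is more likely the TTS configuration (enable_ssml and use_streaming) than the SSML text itself.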