LiveKit Community

Can I use the Gemini Live API with text as output and use ElevenLabs as TTS?

Thanks for reaching out, Avadhootha. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.


You can use the Gemini Live API to generate text output and then use ElevenLabs for text-to-speech (TTS) conversion. To achieve this:
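• Configure Gemini Live to output text (not audio), e.g. by restricting the realtime model's response modality to text (in the LiveKit Google plugin this is the `modalities` option on `RealtimeModel`).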
• Capture the text output from Gemini Live.
• Send the captured text to ElevenLabs TTS for speech synthesis.
This approach is supported within the LiveKit Agents framework. While Gemini Live can generate audio directly, using the text output allows you to route it through ElevenLabs for custom TTS voices or features.
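To make the capture-and-forward steps above concrete, here is a rough, framework-independent sketch. It assumes the model's text arrives as a stream of small fragments and that you want to hand complete sentences to the TTS engine. The names `iter_sentences`, `speak_text_stream`, and the `synthesize` callback are hypothetical and for illustration only; they are not LiveKit or ElevenLabs APIs, and inside LiveKit Agents you would normally let the session do this routing rather than write it yourself.

```python
import re
from typing import Callable, Iterable, Iterator, List

# A sentence ends at ., ! or ? followed by whitespace (a deliberately simple rule).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def iter_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Group streamed text fragments into complete sentences."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last one is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        # The last part may still be incomplete, so keep buffering it.
        buffer = parts[-1]
    # Flush whatever remains once the stream ends.
    if buffer.strip():
        yield buffer.strip()


def speak_text_stream(
    chunks: Iterable[str],
    synthesize: Callable[[str], None],  # hypothetical stand-in for a TTS call
) -> List[str]:
    """Send each complete sentence to the TTS callback and return what was sent."""
    spoken: List[str] = []
    for sentence in iter_sentences(chunks):
        synthesize(sentence)
        spoken.append(sentence)
    return spoken
```

Calling `speak_text_stream(fragments, synthesize=print)` with a list of text fragments prints one sentence per line. Buffering to sentence boundaries is a design choice: TTS engines generally produce more natural prosody from whole sentences than from arbitrary fragments, at the cost of a little added latency.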
Example code for ElevenLabs TTS integration:
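```python
from livekit.plugins import elevenlabs

# voice_id is the 1.x plugin parameter; 0.x releases took voice=elevenlabs.Voice(...)
eleven_tts = elevenlabs.TTS(model="eleven_turbo_v2_5", voice_id="EXAVITQu4vr4xnSDxMaL")  # "Bella"
```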
This setup gives you flexibility to use Gemini Live for text generation and ElevenLabs for advanced TTS capabilities.
 Sources: <https://docs.livekit.io/agents/build/speech/|Agent speech | LiveKit Docs> | <https://docs.livekit.io/agents/integrations/plugins/|Working with plugins | LiveKit Docs> | <https://docs.livekit.io/agents/integrations/realtime/gemini/|Gemini Multimodal Live integration guide | LiveKit Docs>