# ask-ai
Is there any way to add a custom STT implementation to LiveKit? Especially for the new Scribe model from ElevenLabs? (Cf. https://elevenlabs.io/docs/api-reference/speech-to-text/convert)
Thanks for reaching out, Denis. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
I'm glad you're interested in integrating a custom speech-to-text (STT) implementation for the ElevenLabs Scribe model in LiveKit! To help you get started smoothly, here's a step-by-step guide using the `livekit-plugins-elevenlabs` package:

1. **Install the plugin**: Install the ElevenLabs plugin for LiveKit Agents with pip:

   ```shell
   pip install "livekit-agents[elevenlabs]~=1.2"
   ```

2. **Authentication**: You'll need an ElevenLabs API key, which you can get from your ElevenLabs account. Once you have it, set it as an environment variable:

   ```shell
   ELEVEN_API_KEY=your_api_key_here
   ```

   Alternatively, you can pass the API key directly when you initialize the STT instance, whichever works best for you.

3. **Initialize the STT instance**: Use the `STT` class from the `livekit.plugins.elevenlabs` module. Here's an example to guide you:

   ```python
   from livekit.plugins import elevenlabs

   stt_instance = elevenlabs.STT(
       language_code="en",     # Optional: specify language
       tag_audio_events=True,  # Optional: include audio event tags like (laughter)
   )
   ```

4. **Integrate into AgentSession**: Finally, incorporate the STT instance into an `AgentSession` or `VoicePipelineAgent`. For example:

   ```python
   session = AgentSession(
       stt=stt_instance,
       tts=elevenlabs.TTS(),  # Optional TTS
       # ... other components
   )
   ```

This integration supports non-streaming transcription with interim results and conveniently uses the `scribe_v1` model by default. For more details and additional tips, feel free to check out the STT integrations overview in the LiveKit documentation. If you have any questions along the way, I'm here to help!

Sources: ElevenLabs TTS integration guide | LiveKit Docs | livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/tts.p...
How exactly can I add a custom vocabulary for this STT model?
Thanks so much for reaching out! Right now, the ElevenLabs Scribe STT model within LiveKit doesn't support adding a custom vocabulary directly. The ElevenLabs STT plugin exposes settings like `api_key`, `language_code`, and `tag_audio_events`, but unfortunately, custom vocabulary isn't part of its features yet. If having custom vocabulary is really important for your project, I'd be happy to help you explore other STT plugins available in LiveKit, like Gladia or Amazon Transcribe, which do offer that functionality. Please feel free to ask if you want more details or any assistance, I'm here to help!

Sources: livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/tts.p... | Amazon Transcribe STT integration guide | LiveKit Docs | ElevenLabs TTS integration guide | LiveKit Docs