able-jackal-84140 — 07/10/2025, 6:07 PM

tall-airline-16756 — 07/10/2025, 6:07 PM

tall-airline-16756 — 07/10/2025, 6:08 PM
You can use the `session.say()` method with the `add_to_chat_ctx` parameter set to `False`:
```python
await session.say(
    "Your text here",
    add_to_chat_ctx=False,
    allow_interruptions=True,
)
```
This ensures the spoken text isn't added to the chat or application context.
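Conceptually, `add_to_chat_ctx` gates whether the utterance is appended to the conversation history while the audio is always played. A minimal toy sketch of that behavior (plain Python; `ToySession` is a simplified stand-in, not the real LiveKit API):

```python
import asyncio

class ToySession:
    """Simplified stand-in for an agent session (not the LiveKit API)."""
    def __init__(self):
        self.chat_ctx = []   # conversation history the LLM would see
        self.spoken = []     # everything actually voiced

    async def say(self, text, add_to_chat_ctx=True, allow_interruptions=True):
        self.spoken.append(text)  # the text is always synthesized and played
        if add_to_chat_ctx:
            self.chat_ctx.append({"role": "assistant", "content": text})

async def main():
    session = ToySession()
    await session.say("Visible reply")                        # recorded in history
    await session.say("Hidden aside", add_to_chat_ctx=False)  # spoken only
    return session

session = asyncio.run(main())
print(session.spoken)    # ['Visible reply', 'Hidden aside']
print(session.chat_ctx)  # only 'Visible reply' appears in the history
```

Both utterances are spoken, but only the first one lands in the chat context.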
Alternatively, you can use a standalone TTS instance directly for even more control, bypassing session context entirely. This approach involves creating a TTS plugin instance and managing audio frames yourself.
If you need details about configuring a standalone TTS or require sample code, let me know.
Sources: Agent speech | LiveKit Docs | Text-to-speech (TTS) integrations | LiveKit Docs

able-jackal-84140 — 07/10/2025, 6:09 PM

tall-airline-16756 — 07/10/2025, 6:09 PM

```python
from livekit import rtc
from livekit.agents.tts import SynthesizedAudio
from livekit.plugins import elevenlabs

# Create a TTS instance using the ElevenLabs plugin (adjust model as needed)
tts = elevenlabs.TTS(model_id="eleven_turbo_v2")
tts_stream = tts.stream()

# Create and publish an audio track for the synthesized audio
audio_source = rtc.AudioSource(44100, 1)
track = rtc.LocalAudioTrack.create_audio_track("agent-audio", audio_source)
await ctx.room.local_participant.publish_track(track)

# Push the text, mark the input as finished, then forward each frame
tts_stream.push_text("Your text here")
tts_stream.end_input()
async for synthesized_audio in tts_stream:
    await audio_source.capture_frame(synthesized_audio.frame)
```
• `tts = elevenlabs.TTS(...)`: swap in any supported TTS provider and model.
• `tts_stream.push_text(...)`: pass the text you want synthesized.
• `audio_source.capture_frame(...)`: writes each synthesized frame directly to the published audio track.
• The agent session's chat context is never touched.
You can customize the plugin, model, and audio source to fit your exact needs.
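The `push_text` / `end_input` / `async for` flow above is a standard push-then-drain streaming pattern. A self-contained toy illustrating it (plain `asyncio`, no LiveKit; `ToyTTSStream` is a stand-in, not the real plugin class):

```python
import asyncio

class ToyTTSStream:
    """Stand-in for a streaming TTS interface (not the real LiveKit plugin)."""
    _END = object()  # sentinel marking the end of input

    def __init__(self):
        self._queue = asyncio.Queue()

    def push_text(self, text):
        # Pretend each word becomes one synthesized "frame"
        for word in text.split():
            self._queue.put_nowait(word)

    def end_input(self):
        self._queue.put_nowait(self._END)

    def __aiter__(self):
        return self

    async def __anext__(self):
        item = await self._queue.get()
        if item is self._END:
            raise StopAsyncIteration
        return item

async def main():
    stream = ToyTTSStream()
    stream.push_text("Your text here")
    stream.end_input()
    # Drain the stream exactly like the async for loop in the real example
    return [frame async for frame in stream]

frames = asyncio.run(main())
print(frames)  # ['Your', 'text', 'here']
```

The real TTS stream works the same way at this level: you push input, signal that you're done, and consume synthesized frames as they arrive.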