# ask-ai
b
I’m building an interactive storytelling tool and I was wondering what the best option would be to support different voices. Currently I’m thinking of processing the LLM output with `<character id="xyz">This is character speech</character>` inside the `tts_node` and assigning another voice, but I’m not sure it’s the right way forward. I could also do a tool call and hand off between different agents, but that looks too complicated (although maybe that’s the right way).
t
Thanks for reaching out, Bartosz. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
For supporting different voices in your interactive storytelling tool, there are two main approaches you can consider:

1. Using different TTS voices within an agent: You can configure the TTS component to specify a voice for each agent, for example `tts=openai.TTS(voice="ash")`. This allows you to assign specific voices to your characters.
2. Agent handoff with different voice configurations: In a multi-agent setup, you can create different agents with unique voice configurations, for instance `llm=openai.realtime.RealtimeModel(voice="echo")`. This approach is particularly useful for creating distinct characters with unique voices, personalities, and instructions, then handing the conversation off between them as the story progresses (see the sketch below).

The multi-agent approach might seem more complex, but it offers greater flexibility and depth for interactive storytelling. Let me know if you’d like further details or assistance with implementation!
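To make the handoff option concrete, here is a minimal sketch assuming the LiveKit Agents 1.x Python API; the `Narrator`/`Wizard` characters and the `enter_wizard_scene` tool are made up for illustration. Returning another agent from a tool hands the session off to it, and each agent carries its own TTS voice:

```python
from livekit.agents import Agent, function_tool
from livekit.plugins import openai

class Wizard(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="You are the wizard character. Speak in riddles.",
            tts=openai.TTS(voice="echo"),  # this character's distinct voice
        )

class Narrator(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="You narrate the story in the third person.",
            tts=openai.TTS(voice="ash"),
        )

    @function_tool
    async def enter_wizard_scene(self):
        """Called when the wizard character should take over the dialogue."""
        # Returning an Agent from a tool hands the conversation off to it,
        # so subsequent speech is synthesized with the wizard's voice.
        return Wizard()
```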
b
Do you have an example of the first one? Where can I assign different voices based on narration?
t
Here are examples of how you can assign different TTS voices for narration in LiveKit using various providers:

• Using Rime TTS:
```python
from livekit.plugins import rime
session = AgentSession(
    tts=rime.TTS(
        model="mist",
        speaker="rainforest",  # Voice selection
        speed_alpha=0.9,
        reduce_latency=True,
    ),
    # ... llm, stt, etc.
)
```
• Using Speechify TTS:
```python
from livekit.plugins import speechify
session = AgentSession(
    tts=speechify.TTS(
        model="simba-english",
        voice_id="jack",  # Voice selection
    ),
    # ... llm, stt, etc.
)
```
• Using Amazon Polly TTS:
```python
from livekit.plugins import aws
session = AgentSession(
    tts=aws.TTS(
        voice="Ruth",  # Voice selection
        speech_engine="generative",
        language="en-US",
    ),
    # ... llm, stt, etc.
)
```
Each example demonstrates how to specify a particular voice for TTS narration. Let me know if you need further assistance or additional examples!

Sources: Rime integration guide | Speechify TTS integration guide | Amazon Polly TTS integration guide (all LiveKit Docs)
b
Right, but where can I switch between them?
t
The documentation I have access to doesn’t cover dynamically switching TTS voices at runtime in LiveKit. However, you can configure a different TTS voice per agent or session, which gives you switching by handing off between agents. If you’re looking for a more dynamic solution, I recommend creating a separate TTS configuration for each voice and switching between them programmatically based on your narration logic, as sketched below. Let me know if you’d like further assistance or clarification on this!
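To illustrate the tag-based approach from your first message, here is a minimal sketch of a `tts_node` override, assuming the LiveKit Agents 1.x Python API. It simplifies by buffering the whole LLM turn instead of streaming it, and it assumes the agent was created with its own `tts=` (so `self.tts` is set) and that the plugin supports `update_options(voice=...)` (many LiveKit TTS plugins do, but check yours); the `VOICE_MAP` contents are made up for illustration:

```python
import re

from livekit.agents import Agent

# Hypothetical mapping from character ids to provider voice names.
VOICE_MAP = {"narrator": "ash", "xyz": "echo"}

TAG_RE = re.compile(r'<character id="([^"]+)">(.*?)</character>', re.DOTALL)

class StoryAgent(Agent):
    async def tts_node(self, text, model_settings):
        # Simplification: buffer the whole LLM turn rather than streaming it,
        # then synthesize each segment with the right character's voice.
        full_text = "".join([chunk async for chunk in text])

        async def one(segment: str):
            yield segment

        async def speak(segment: str, voice: str):
            if not segment.strip():
                return
            # Assumption: the configured TTS plugin exposes update_options().
            self.tts.update_options(voice=voice)
            # Delegate the actual synthesis to the default pipeline behavior.
            async for frame in Agent.default.tts_node(self, one(segment), model_settings):
                yield frame

        pos = 0
        for match in TAG_RE.finditer(full_text):
            # Untagged text before the tag is plain narration.
            async for frame in speak(full_text[pos:match.start()], VOICE_MAP["narrator"]):
                yield frame
            # Tagged text uses that character's voice, falling back to the narrator.
            voice = VOICE_MAP.get(match.group(1), VOICE_MAP["narrator"])
            async for frame in speak(match.group(2), voice):
                yield frame
            pos = match.end()
        # Anything after the last tag is narration too.
        async for frame in speak(full_text[pos:], VOICE_MAP["narrator"]):
            yield frame
```

An agent-handoff setup (as in the earlier sketch) avoids mid-turn voice switching entirely, so it may be the more robust choice if your characters speak in whole turns.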