How to modify tts_node and stt_node functions...
# ask-ai
m
How to modify the tts_node and stt_node functions in entrypoint
t
Thanks for reaching out, Rajadurai. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
I'd be happy to help you modify the `tts_node` and `stt_node` functions! To do this, you'll want to override these methods in a custom `Agent` class. Then, simply pass an instance of your custom agent to the `AgentSession.start()` method inside the `entrypoint` function. The great thing is, the `entrypoint` itself stays mostly the same, except for initializing your custom agent. Here's an example to guide you on how to override these methods:
Customizing `tts_node`:
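from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings

class MyCustomAgent(Agent):
    async def tts_node(self, text: AsyncIterable[str], model_settings: ModelSettings):
        # Rewrite each text chunk, then hand the stream to the default TTS node
        async def modify_text():
            async for chunk in text:
                yield chunk.replace("lol", "<laugh>")

        return Agent.default.tts_node(self, modify_text(), model_settings)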
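One caveat with the per-chunk `replace` above: the text arrives as a stream, so a word such as "lol" can be split across two chunks ("lo" + "l") and slip through unchanged. Below is a minimal, library-independent sketch of one way to handle that by holding back any chunk tail that could be the start of a match. `replace_streaming` and `fake_llm_stream` are hypothetical names for illustration, not LiveKit APIs; inside `tts_node` you would wrap `text` with a generator like this before passing it on.

```python
import asyncio
from typing import AsyncIterable, AsyncIterator


def _pending_len(buffer: str, old: str) -> int:
    """Length of the longest buffer suffix that is a proper prefix of `old`."""
    for n in range(min(len(old) - 1, len(buffer)), 0, -1):
        if buffer.endswith(old[:n]):
            return n
    return 0


async def replace_streaming(
    text: AsyncIterable[str], old: str, new: str
) -> AsyncIterator[str]:
    """Replace `old` with `new`, even when `old` spans chunk boundaries."""
    buffer = ""
    async for chunk in text:
        buffer = (buffer + chunk).replace(old, new)
        # Hold back a tail that might be the beginning of a split match.
        n = _pending_len(buffer, old)
        ready, buffer = buffer[: len(buffer) - n], buffer[len(buffer) - n:]
        if ready:
            yield ready
    if buffer:
        yield buffer  # flush whatever is left when the stream ends


async def fake_llm_stream(chunks):
    """Hypothetical stand-in for the streamed text passed to tts_node."""
    for chunk in chunks:
        yield chunk


async def collect(chunks, old="lol", new="<laugh>") -> str:
    stream = replace_streaming(fake_llm_stream(chunks), old, new)
    return "".join([piece async for piece in stream])


if __name__ == "__main__":
    # "lol" is split across the two chunks here
    print(asyncio.run(collect(["that was funny lo", "l, right?"])))
    # prints: that was funny <laugh>, right?
```

The trade-off is a small amount of added latency, since at most `len(old) - 1` characters are delayed until the next chunk arrives. The sketch assumes the replacement text does not itself end with the beginning of `old`.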
Customizing `stt_node`:
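    # Add this method to the same custom Agent subclass (needs `from livekit import rtc`)
    async def stt_node(self, audio: AsyncIterable[rtc.AudioFrame], model_settings: ModelSettings):
        async for event in Agent.default.stt_node(self, audio, model_settings):
            # Inspect or modify each speech event here before passing it on
            yield event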
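The pass-through loop above is the natural place to clean up transcripts before they reach the LLM. As a minimal sketch of what such a cleanup step could look like, the helper below strips filler words with a word-boundary regex so that words like "umbrella" or "number" are left intact. `strip_fillers` is a hypothetical helper, not part of LiveKit, and which field of the speech event holds the transcript text is something to confirm against your installed version.

```python
import re

# Match "um"/"uh" (and stretched forms like "umm", "uhh") as whole words,
# together with an optional trailing comma and the surrounding whitespace.
_FILLERS = re.compile(r"\s*\b(?:um+|uh+)\b,?\s*", flags=re.IGNORECASE)


def strip_fillers(text: str) -> str:
    """Remove standalone filler words and tidy the remaining whitespace."""
    cleaned = _FILLERS.sub(" ", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


if __name__ == "__main__":
    print(strip_fillers("Um, I need an umbrella, uh, today"))
    # prints: I need an umbrella, today
```

A plain `str.replace("um", "")` would also delete the letters inside ordinary words, which is why the sketch anchors on `\b` word boundaries.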
And finally, here’s how you can integrate your custom agent into the `entrypoint`:
from livekit.agents import AgentSession, JobContext

async def entrypoint(ctx: JobContext):
    session = AgentSession()
    await session.start(
        agent=MyCustomAgent(),  # Your Agent subclass with the overridden nodes
        room=ctx.room,
    )
If you have any questions or need a hand with anything else, please don’t hesitate to reach out. I’m here to help!
m
I have multiple agents, but I need the same modifications for all of them
t
To help you apply the same `tts_node` and `stt_node` modifications across multiple agents, a good approach is to create a base agent class that holds the shared logic. Then, your individual agents can inherit from this base class. This way, you get consistent behavior while each agent can still add its own customizations if needed. Here's an example in Python:
import re
from typing import AsyncIterable

from livekit import rtc
from livekit.agents import Agent, ModelSettings
from livekit.plugins import deepgram, openai, rime, silero

class SharedNodeAgent(Agent):
    async def tts_node(self, text: AsyncIterable[str], model_settings: ModelSettings):
        # Rewrite each text chunk before it is synthesized
        async def process_text():
            async for chunk in text:
                yield chunk.replace("lol", "<laugh>")
        return Agent.default.tts_node(self, process_text(), model_settings)

    async def stt_node(self, audio: AsyncIterable[rtc.AudioFrame], model_settings: ModelSettings):
        async for event in Agent.default.stt_node(self, audio, model_settings):
            # The transcript text sits on each alternative, not on the event itself
            for alt in event.alternatives:
                # Drop standalone "um"/"uh"; \b keeps words like "umbrella" intact
                without_fillers = re.sub(r"\b(?:um|uh)\b", "", alt.text, flags=re.IGNORECASE)
                alt.text = " ".join(without_fillers.split())
            yield event

class AgentA(SharedNodeAgent):
    def __init__(self):
        super().__init__(instructions="You are Agent A.", stt=deepgram.STT(), llm=openai.LLM(), tts=rime.TTS(), vad=silero.VAD.load())

class AgentB(SharedNodeAgent):
    def __init__(self):
        super().__init__(instructions="You are Agent B.", stt=deepgram.STT(), llm=openai.LLM(), tts=rime.TTS(), vad=silero.VAD.load())
Here's what's happening:
• The `SharedNodeAgent` class holds the shared logic for both `tts_node` and `stt_node`.
• Each specific agent, like `AgentA` and `AgentB`, inherits from `SharedNodeAgent` and can customize further if they want.
If you'd like, I'm here to help you walk through implementing this or answer any questions you have!
Sources: Pipeline nodes and hooks | LiveKit Docs | pipeline-tts/tts_node.py
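To illustrate how an individual agent can "customize further" on top of the shared logic, here is a small stand-alone sketch of the inheritance pattern. It uses plain Python stand-in classes so that it runs without any dependencies: `BaseAgent` and the simplified one-argument `tts_node` signature are hypothetical and only mimic the shape of the real classes. In an actual project the base would be the LiveKit `Agent`, and the shared class would delegate to `Agent.default.tts_node(...)` as shown earlier.

```python
import asyncio
from typing import AsyncIterable, AsyncIterator


class BaseAgent:
    """Stand-in for the framework's Agent; passes text through unchanged."""

    async def tts_node(self, text: AsyncIterable[str]) -> AsyncIterator[str]:
        async for chunk in text:
            yield chunk


class SharedNodeAgent(BaseAgent):
    """Holds the modification that every agent should share."""

    async def tts_node(self, text: AsyncIterable[str]) -> AsyncIterator[str]:
        async for chunk in super().tts_node(text):
            yield chunk.replace("lol", "<laugh>")


class AgentA(SharedNodeAgent):
    """Inherits the shared behavior as is."""


class AgentB(SharedNodeAgent):
    """Adds its own step on top of the shared behavior."""

    async def tts_node(self, text: AsyncIterable[str]) -> AsyncIterator[str]:
        async for chunk in super().tts_node(text):
            yield chunk.upper()


async def _stream(*chunks: str) -> AsyncIterator[str]:
    for chunk in chunks:
        yield chunk


async def run(agent: BaseAgent, *chunks: str) -> str:
    """Feed chunks through the agent's tts_node and join the output."""
    return "".join([piece async for piece in agent.tts_node(_stream(*chunks))])


if __name__ == "__main__":
    print(asyncio.run(run(AgentA(), "lol ", "ok")))  # prints: <laugh> ok
    print(asyncio.run(run(AgentB(), "lol ", "ok")))  # prints: <LAUGH> OK
```

Because `AgentB` calls `super().tts_node(...)`, it keeps the shared replacement and only layers its own change on top, so a later fix in `SharedNodeAgent` reaches every agent automatically.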