busy-monitor-72069
09/14/2025, 2:49 AM tall-airline-16756
1. Run Both LLMs in Parallel: Call each model's llm.chat() method concurrently to get responses from both LLMs at the same time.
2. Implement Selection Logic: Then, apply your own criteria—like which response feels more relevant, higher quality, or quicker—to pick the best one.
3. Process Through TTS: Finally, send the chosen response to the TTS engine using the performTTSInference function or the session.say() method so it plays back in real time.
Here’s a simple example to illustrate how you might set this up:
import asyncio

from livekit.agents import Agent
from livekit.agents.llm import ChatContext, ChatMessage

class MultiLLMAgent(Agent):
    def choose_response(self, resp1: str, resp2: str) -> str:
        # Example criterion: prefer the shorter response
        return resp1 if len(resp1) < len(resp2) else resp2

    async def on_user_turn_completed(self, chat_ctx: ChatContext, user_msg: ChatMessage) -> None:
        # Query both LLMs concurrently
        resp1_task = asyncio.create_task(self.llm.chat(chat_ctx=chat_ctx))
        resp2_task = asyncio.create_task(self.secondary_llm.chat(chat_ctx=chat_ctx))
        resp1, resp2 = await asyncio.gather(resp1_task, resp2_task)
        # Pick one response and speak it via TTS (simplified: assumes the full
        # text is available on the result; with streaming LLMs you would
        # collect the chunks into a string first)
        selected = self.choose_response(resp1.response, resp2.response)
        self.session.say(selected)
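If it helps to see the core pattern on its own, here's a minimal, framework-free sketch of the same "query in parallel, then select" idea using plain asyncio. The two query_model_* coroutines are hypothetical stand-ins for real LLM calls, not part of any actual API:

```python
import asyncio

async def query_model_a(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call
    await asyncio.sleep(0.01)
    return f"Model A answer to: {prompt}"

async def query_model_b(prompt: str) -> str:
    # Hypothetical stand-in for a second LLM call
    await asyncio.sleep(0.01)
    return f"B says: {prompt}"

def choose_response(resp1: str, resp2: str) -> str:
    # Same example criterion as above: prefer the shorter response
    return resp1 if len(resp1) < len(resp2) else resp2

async def main() -> str:
    # Run both model calls concurrently and gather both results
    resp1, resp2 = await asyncio.gather(
        query_model_a("hello"), query_model_b("hello")
    )
    return choose_response(resp1, resp2)

selected = asyncio.run(main())
print(selected)  # "B says: hello" (the shorter of the two)
```

In the agent version, the final step would hand `selected` to TTS instead of printing it.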
This way, you can tailor the selection process just how you like and smoothly bring in TTS for a natural voice output. If you have any questions or want to dive deeper into any part, I’m here to help—just let me know!
Sources: examples/voice_agents/structured_output.py | Tool definition and use | LiveKit Docs | agents/src/voice/agent_activity.ts