wooden-beard-26644
08/12/2025, 8:11 PM

tall-airline-16756
08/12/2025, 8:11 PM
generate_reply runs with tool_choice set to "none" so no extra tools run, making sure we don't end up talking over ourselves. This way, everything happens in order, keeping the conversation clear and easy to follow without any repeated or overlapping speech.
wooden-beard-26644
08/12/2025, 8:13 PM

tall-airline-16756
08/12/2025, 8:13 PM
1. The generate_reply method kicks in with the user's message.
2. LLM Reply Generation: The generate_reply method creates a SpeechHandle and begins the LLM inference. The LLM then crafts a response, which might include both text and tool calls.
3. Reply to TTS: The text part of the LLM's response is sent over to TTS to generate the audio.
4. Tool Execution: If there are any tool calls in the LLM's response, they get executed. The results from these tools are added back into the chat context.
5. Tool Call Response Handling: When the tool execution needs a reply, the system uses those tool results to create a fresh LLM response.
6. Final Reply: This new LLM response is turned into speech and played back to the user.
This flow keeps everything smooth and in order, so you won't run into issues like talking over yourself (a rough runnable sketch of the same sequence follows below).
Sources: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py
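To make the six steps concrete, here's a rough, runnable simulation of the same sequence. This is not the actual LiveKit Agents implementation; every class and function below is a made-up stand-in, and the SpeechHandle bookkeeping is omitted:

```python
# Illustrative simulation of the flow above, not the real library code.
import asyncio
from dataclasses import dataclass, field


@dataclass
class LLMReply:
    text: str
    tool_calls: list[str] = field(default_factory=list)


async def run_llm(chat_ctx: list[dict], tool_choice: str = "auto") -> LLMReply:
    """Stand-in for LLM inference."""
    if tool_choice != "none" and chat_ctx[-1]["role"] == "user":
        # First pass: the model answers with text plus a tool call.
        return LLMReply(text="Let me check that for you.", tool_calls=["lookup_weather"])
    # Follow-up pass driven by tool results: text only, no further tools.
    return LLMReply(text="It's sunny and 22 degrees.")


async def synthesize_and_play(text: str) -> None:
    """Stand-in for TTS synthesis and audio playout."""
    print(f"[speaking] {text}")


async def execute_tool(name: str) -> str:
    """Stand-in for a function tool."""
    return "sunny, 22C"


async def handle_user_turn(chat_ctx: list[dict], user_message: str) -> None:
    chat_ctx.append({"role": "user", "content": user_message})   # 1. user message arrives

    reply = await run_llm(chat_ctx)                               # 2. LLM inference begins
    await synthesize_and_play(reply.text)                         # 3. text portion goes to TTS

    if reply.tool_calls:
        for call in reply.tool_calls:                             # 4. execute tool calls and feed
            result = await execute_tool(call)                     #    results back into the context
            chat_ctx.append({"role": "tool", "content": result})

        follow_up = await run_llm(chat_ctx, tool_choice="none")   # 5. fresh LLM response from tool results
        await synthesize_and_play(follow_up.text)                 # 6. final reply is spoken to the user


asyncio.run(handle_user_turn([], "What's the weather like?"))
```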
tall-airline-16756
08/12/2025, 9:25 PM