# ask-ai
c
Why does the agent take so much time to call the OpenAI API when doing speech-to-speech, after joining the room?
g
It seems likely that the delay in calling the OpenAI API for speech-to-speech interactions after joining a room is due to several factors. According to the LiveKit documentation, the OpenAI Realtime API is used for low-latency audio streaming, but the initial connection setup and the first API call can introduce some latency. Network latency, server location, and the processing time of the models involved (STT, LLM, TTS) can also contribute to the delay.

My understanding is that hosting your LiveKit server closer to the OpenAI servers and keeping your network connection stable and fast can help reduce this latency. Using the `MultimodalAgent` class, which is designed for real-time interactions, may also help manage conversation state more efficiently and reduce delays.

Would you like more detailed information on optimizing your setup to reduce latency?

Sources:
- LiveKit documentation
- MultimodalAgent
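If you want to see where the time is actually going, a minimal timing sketch like the one below can help. It assumes an `agent` started the way the snippets later in this thread are, and that an `agent_speech_committed` event is emitted; the exact object and event names may differ by SDK version.

```javascript
// Minimal latency instrumentation sketch. `agent`, `ctx.room`, and
// `participant` are assumed to come from your agent entrypoint; the event
// name below is illustrative and should be checked against your SDK version.
const t0 = Date.now();

agent.start(ctx.room, participant).then((session) => {
  // Time spent joining the room and establishing the realtime session.
  console.log(`session ready after ${Date.now() - t0} ms`);

  // Log how long after startup each agent reply is committed; this interval
  // includes the round trip to the OpenAI API.
  session.on('agent_speech_committed', () => {
    console.log(`agent reply committed at +${Date.now() - t0} ms`);
  });
});
```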
c
@millions-hair-24956, I need to load or maintain the chat history of my speech-to-speech interaction from the agent side in Node.js. How do I send previous messages to the AI, or maintain a chat session?
g
To maintain chat history for speech-to-speech interactions and send previous messages to the AI in Node.js, you can use the `chat_ctx` property of the `VoicePipelineAgent` or `MultimodalAgent`. This property holds the chat context, which you can manipulate to include previous messages. Here's a general approach:

1. Listen to events: Use the `user_speech_committed` and `agent_speech_committed` events to capture and store messages as they are committed to the chat context. This lets you maintain a history of the conversation.
2. Store chat history: Save the chat history to a database or a file whenever a message is committed (a sketch of the stored shape follows this list), so you can retrieve it when needed.
3. Load previous messages: When starting a new session, load the stored chat history and append it to the `chat_ctx` by iterating over the stored messages.
4. Use the `chat_ctx`: When sending a new message to the AI, make sure the `chat_ctx`
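For step 2, the persisted history can be as simple as an ordered list of role/content pairs. The example below is purely illustrative content in that shape:

```javascript
// Hypothetical example of the persisted history shape: an ordered array of
// role/content pairs that can later be replayed into the chat context.
const storedHistory = [
  { role: 'user', content: 'What are your opening hours?' },
  { role: 'assistant', content: 'We are open 9am to 5pm, Monday to Friday.' },
];
```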
includes the previous messages so the AI has context for its response.

Would you like more detailed guidance on implementing this in Node.js?

Sources:
- VoicePipelineAgent
c
Can you give me a code snippet in Node.js to achieve this? @millions-hair-24956
g
To maintain and load chat history in a Node.js application using LiveKit's `VoicePipelineAgent`, you can follow this example. It demonstrates how to capture chat messages, store them, and load them into the chat context for future sessions:
```javascript
// NOTE: the LiveKit agents framework for Node.js is published as
// `@livekit/agents`; adjust the import paths and namespaces to match the
// version you have installed.
const { pipeline, llm } = require('@livekit/agents');
const fs = require('fs');

// Create the agent with your VAD/STT/LLM/TTS plugin instances (assumed to be
// constructed elsewhere; check the exact constructor signature for your
// SDK version).
const agent = new pipeline.VoicePipelineAgent(vad, stt, llmModel, tts);

// Save the chat history to a file as an array of { role, content } pairs.
function saveChatHistory(chatCtx) {
  const history = chatCtx.messages.map((msg) => ({
    role: msg.role,
    content: msg.content,
  }));
  fs.writeFileSync('chat_history.json', JSON.stringify(history));
}

// Load the chat history (if any) from the file into a fresh ChatContext.
function loadChatHistory() {
  const chatCtx = new llm.ChatContext();
  if (fs.existsSync('chat_history.json')) {
    const history = JSON.parse(fs.readFileSync('chat_history.json', 'utf8'));
    // Field names mirror what saveChatHistory() wrote; check the
    // ChatContext.append() signature for your SDK version.
    history.forEach((msg) => chatCtx.append(msg));
  }
  return chatCtx;
}

// Restore the previous chat history before the session starts. If your SDK
// version expects the chat context at construction time, pass it there instead.
agent.chatCtx = loadChatHistory();

// Persist each user message as it is committed to the conversation.
// (Event names follow the agents API; verify them against your SDK version.)
agent.on('user_speech_committed', (msg) => {
  agent.chatCtx.append({ role: 'user', content: msg.content });
  saveChatHistory(agent.chatCtx);
});

// Persist each agent reply as it is committed to the conversation.
agent.on('agent_speech_committed', (msg) => {
  agent.chatCtx.append({ role: 'assistant', content: msg.content });
  saveChatHistory(agent.chatCtx);
});

// Start the agent in the room (room and participant come from your job context).
agent.start(room, participant);
```
This code snippet demonstrates how to handle chat history by saving it to a file and loading it when needed. You can replace the file storage with a database or other storage solution as required (one option is sketched below). Would you like more details on any specific part of this implementation?

Sources:
- VoicePipelineAgent
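For example, if you would rather keep the history in Redis than on disk, here is a minimal sketch using the `redis` npm package. The key naming and client setup are illustrative and not part of LiveKit; the `llm.ChatContext` usage mirrors the snippet above.

```javascript
const { createClient } = require('redis');
const { llm } = require('@livekit/agents');

const redis = createClient({ url: process.env.REDIS_URL });

// Call once during startup, e.g. inside your agent's async entrypoint.
async function initStorage() {
  await redis.connect();
}

// Same data shape as the file-based helpers above, but keyed per room.
async function saveChatHistory(chatCtx, roomName) {
  const history = chatCtx.messages.map((msg) => ({
    role: msg.role,
    content: msg.content,
  }));
  await redis.set(`chat_history:${roomName}`, JSON.stringify(history));
}

async function loadChatHistory(roomName) {
  const chatCtx = new llm.ChatContext();
  const raw = await redis.get(`chat_history:${roomName}`);
  if (raw) {
    JSON.parse(raw).forEach((msg) => chatCtx.append(msg));
  }
  return chatCtx;
}
```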
c
My code is like this, @millions-hair-24956:

```typescript
const agent = new multimodal.MultimodalAgent({ model, fncCtx });
const session = await agent
  .start(ctx.room, participant)
  .then((session) => session as openai.realtime.RealtimeSession);
```
g
To maintain chat history with the `MultimodalAgent` in Node.js, you can follow a similar approach to the one described earlier. Here's how you can modify your code to include chat history management:
```javascript
// NOTE: the agents framework is published as `@livekit/agents`. `model` and
// `fncCtx` are assumed to be defined elsewhere (e.g. an OpenAI Realtime model
// from `@livekit/agents-plugin-openai` and your function context).
const { multimodal, llm } = require('@livekit/agents');
const fs = require('fs');

// Initialize the MultimodalAgent
const agent = new multimodal.MultimodalAgent({ model, fncCtx });

// Save the chat history to a file as an array of { role, content } pairs.
function saveChatHistory(chatCtx) {
  const history = chatCtx.messages.map((msg) => ({
    role: msg.role,
    content: msg.content,
  }));
  fs.writeFileSync('chat_history.json', JSON.stringify(history));
}

// Load the chat history (if any) from the file into a fresh ChatContext.
function loadChatHistory() {
  const chatCtx = new llm.ChatContext();
  if (fs.existsSync('chat_history.json')) {
    const history = JSON.parse(fs.readFileSync('chat_history.json', 'utf8'));
    // Field names mirror what saveChatHistory() wrote; check the
    // ChatContext.append() signature for your SDK version.
    history.forEach((msg) => chatCtx.append(msg));
  }
  return chatCtx;
}

// Load the previous chat history before starting the session.
const chatCtx = loadChatHistory();

// Start the agent and handle the realtime session.
agent.start(ctx.room, participant).then((session) => {
  // Depending on your SDK version these events may be emitted by the agent
  // rather than the session; verify the event names you actually receive.
  session.on('user_speech_committed', (msg) => {
    chatCtx.append({ role: 'user', content: msg.content });
    saveChatHistory(chatCtx);
  });

  session.on('agent_speech_committed', (msg) => {
    chatCtx.append({ role: 'assistant', content: msg.content });
    saveChatHistory(chatCtx);
  });

  // Seed the realtime session with the restored context (the method name may
  // differ by plugin version; check your RealtimeSession API).
  session.setChatCtx(chatCtx);
});
```
This code snippet demonstrates how to manage chat history by saving it to a file and loading it when starting a new session. You can replace file storage with a database or other storage solutions as needed. Would you like more details on any specific part of this implementation?

Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1733317598.969739
- https://docs.livekit.io/recipes/react-chat/
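One practical refinement: if conversations get long, you may want to cap how much history you replay into the context so the prompt sent to the model stays small. A sketch, reusing the `fs` and `llm` imports from the snippet above; `MAX_HISTORY_MESSAGES` is an illustrative constant, not a LiveKit setting:

```javascript
// Keep only the most recent messages when rebuilding the chat context, so the
// prompt sent to the model stays bounded.
const MAX_HISTORY_MESSAGES = 20;

function loadTrimmedChatHistory() {
  const chatCtx = new llm.ChatContext();
  if (fs.existsSync('chat_history.json')) {
    const history = JSON.parse(fs.readFileSync('chat_history.json', 'utf8'));
    history.slice(-MAX_HISTORY_MESSAGES).forEach((msg) => chatCtx.append(msg));
  }
  return chatCtx;
}
```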