# ask-ai
a
No module named 'llama_index'
t
Thanks for reaching out, Harshita. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
The error "No module named 'llama_index'" occurs when the required
llama_index
package is not installed. To resolve this, you can install the LiveKit Plugins Llama Index package by running the following command:
```
pip install livekit-plugins-llama-index
```
This should resolve the issue. Let me know if you need further assistance!
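Once it's installed, a quick sanity check (a minimal sketch; the `livekit.plugins.llama_index` import path follows the usual LiveKit plugin convention and is an assumption here):

```python
# Verify the install: the plugin pulls in llama-index as a dependency,
# so both imports should succeed after `pip install livekit-plugins-llama-index`.
from llama_index.core import VectorStoreIndex   # core llama_index API
from livekit.plugins import llama_index         # LiveKit plugin (assumed path)

print("llama_index and the LiveKit plugin imported OK")
```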
a
```
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
livekit-plugins-llama-index 0.2.4 requires livekit-agents<1.0.0,>=0.12.20, but you have livekit-agents 1.0.22 which is incompatible.
```
t
The error indicates a dependency conflict between `livekit-plugins-llama-index` and `livekit-agents`. Specifically, the version of `livekit-agents` you have installed (1.0.22) is incompatible with the version required by `livekit-plugins-llama-index` (less than 1.0.0). To resolve this, you can downgrade `livekit-agents` to a compatible version by running:
```
pip install "livekit-agents<1.0.0"
```

(Note the quotes: without them, the shell treats `<` as input redirection.)
Alternatively, you may check whether there is an updated version of `livekit-plugins-llama-index` that supports `livekit-agents` 1.0.22. Let me know if you need further assistance!
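To see exactly which version and dependency pins are installed locally, the standard library can report this (a minimal sketch using only `importlib.metadata`; the package name is taken from the error above):

```python
# Inspect the installed plugin version and its declared dependency pins.
from importlib.metadata import version, requires

print(version("livekit-plugins-llama-index"))   # e.g. 0.2.4
print(requires("livekit-plugins-llama-index"))  # includes the livekit-agents pin
```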
a
I want it to work with the greater (newer) version.
t
Currently, the `livekit-plugins-llama-index` package is not compatible with `livekit-agents` version 1.0.22. The plugin is designed to work with `livekit-agents` versions less than 1.0.0. If you require compatibility with `livekit-agents` 1.0.22, you may need to wait for an updated version of `livekit-plugins-llama-index` that supports this newer version. I recommend keeping an eye on the plugin's changelog or repository for updates. Let me know if you need further assistance!
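In the meantime, if you need a working environment today, one option is to pin a mutually compatible pair explicitly (the version numbers below are taken from the error earlier in this thread; adjust them to whatever your other dependencies allow):

```
pip install "livekit-agents<1.0.0" "livekit-plugins-llama-index==0.2.4"
```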
a
how do I add filler words during the RAG search for a better experience?
t
I couldn't find specific information about adding filler words during RAG (Retrieval-Augmented Generation) searches. However, if you are referring to improving the conversational experience by including filler words, this might involve customizing the language model or post-processing the generated responses to include such words. Could you clarify your use case or provide more details? For example, are you working with a specific tool or framework? This will help me provide more targeted assistance.
a
livekit
t
I couldn't find specific information about adding filler words in the context of LiveKit and RAG (Retrieval-Augmented Generation) searches. If you could provide more details about your use case or the specific functionality you're looking for, I'd be happy to assist further!
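That said, a common pattern is to speak a short, fixed filler line inside the RAG tool itself before the slow query runs. A minimal sketch, assuming a LiveKit `Agent` method with access to `self.session` and a llama_index `index` built elsewhere (the phrases are illustrative):

```python
import random

# Illustrative filler phrases played while the RAG lookup runs.
THINKING_MESSAGES = [
    "Let me check that for you...",
    "One moment, looking that up...",
    "Hang on, I'm on it...",
]

async def answer_customer_query(self, query: str) -> str:
    # say() plays fixed text immediately, so the user hears something
    # while the (potentially slow) vector-store query is in flight.
    await self.session.say(random.choice(THINKING_MESSAGES))
    query_engine = index.as_query_engine(use_async=True)
    res = await query_engine.aquery(query)
    return str(res)
```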
a
When I ask multiple questions back to back, the agent makes multiple tool calls and ends up with this error:

```
tool_call_ids did not have response messages: call_5k1mtF6Hr26NfTtXcyJIxAmB", 'type': 'invalid_request_error', 'param': 'messages.[21].role', 'code': None}} (status_code=400, request_id=req_8bd59e4373ca7f9d68be70676abac8f6, body={'message': "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_5k1mtF6Hr26NfTtXcyJIxAmB", 'type': 'invalid_request_error', 'param': 'messages.[21].role', 'code': None})
{"llm": "livekit.plugins.openai.llm.LLM", "attempt": 2, "pid": 14423, "job_id": "AJ_WH8syR8L83cZ"}
2025-06-07 14:51:20,317 - WARNING livekit.agents - failed to generate LLM completion, retrying in 2.0s
```
t
The error you encountered, "tool_call_ids did not have response messages," occurs when tool messages are missing their corresponding `tool_call_id` responses. This can happen if multiple tool calls are made back-to-back without proper handling of their responses. To address this issue, ensure that each tool call is followed by a response message that includes the `tool_call_id`. If you're using custom code, you may need to implement error handling to manage tool call outputs properly. For example, iterate through tool outputs and submit them before continuing, as shown in the following snippet:
```python
# Iterate tool outputs and submit them
for msg in self._chat_ctx.messages:
    if msg.role == "tool":
        if not msg.tool_call_id:
            logger.error("tool message without tool_call_id")
            continue
        await self._llm._submit_tool_call_result(msg.tool_call_id, msg.content)
```
Let me know if you need further clarification or assistance!
a
```python
# basic modular code
import asyncio
from collections import defaultdict
from decimal import Decimal
import logging
import os
from datetime import datetime
from bson import Decimal128, ObjectId
from dotenv import load_dotenv
from typing import Optional
from livekit.agents import (
    Agent,
    AgentSession,
    ChatContext,
    JobContext,
    JobProcess,
    RoomInputOptions,
    RoomOutputOptions,
    RunContext,
    WorkerOptions,
    cli,
    metrics,
)
import pandas as pd
from email_script import summaryofthecall
from livekit.agents.voice import MetricsCollectedEvent
from livekit.plugins import silero
import json
from pymongo import MongoClient
import pytz
from services.loader import load_service, calculate_cost
from pathlib import Path
from utils import analyse_call_transcript
from livekit.agents.llm import function_tool
from livekit.agents.job import get_current_job_context
from livekit import api
import random
from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)
# from livekit.plugins import noise_cancellation

logger = logging.getLogger("multi-agent")
logging.getLogger("pymongo").setLevel(logging.WARNING)

# load_dotenv(dotenv_path=".env")
load_dotenv(dotenv_path=".env.local")

# ----------------- rag -------------------------
# check if storage already exists
THIS_DIR = Path(__file__).parent
PERSIST_DIR = THIS_DIR / "query-engine-storage"
if not PERSIST_DIR.exists():
    # load the documents and create the index
    documents = SimpleDirectoryReader(THIS_DIR / "data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    # store it for later
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # load the existing index
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)

metrics_store = defaultdict(dict)
DEFAULT_LANGUAGE = os.getenv("DEFAULT_LANGUAGE")
DEFAULT_CONTEXT = os.getenv("DEFAULT_CONTEXT")
mongo_uri = os.getenv("MONGODB_URI")

groq_llm_model = load_service("llm", "groq")
polly_tts_model = load_service("tts", "aws")
# elevenlabs_tts_model = load_service("tts", "elevenlabs")
# cartesia_tts_model = load_service("tts", "cartesia")
google_tts_model = load_service("tts", "google")
deepgram_stt_model = load_service("stt", "deepgram")

from livekit.agents import get_job_context


class Assistant(Agent):
    def __init__(
        self, instructions: str, chat_ctx: Optional[ChatContext] = None
    ) -> None:
        super().__init__(
            instructions=instructions,
            chat_ctx=chat_ctx,
            # tools=[query_info],
        )

    @function_tool
    async def on_enter(self):
        self.session.generate_reply(
            instructions="initial greetings mostly 6 words.",
            allow_interruptions=False,
        )

    # async def hangup_call(self):
    #     ctx = get_current_job_context()
    #     if ctx is None:
    #         # Not running in a job context
    #         return
    #     await ctx.api.room.delete_room(
    #         api.DeleteRoomRequest(
    #             room=ctx.room.name,
    #         )
    #     )

    # @function_tool
    # async def query_info(self, query: str) -> str:
    #     """Get more information about a specific topic"""
    #     print(index)
    #     await self.session.generate_reply(
    #         instructions="you are processing the query", allow_interruptions=False
    #     )
    #     query_engine = index.as_query_engine(use_async=True)
    #     res = await query_engine.aquery(query)
    #     print("Query result:", res)
    #     return str(res)

    @function_tool
    async def answer_customer_query(self, query: str) -> str:
        """Call this tool to fetch accurate answers from a contextual knowledge base.
        Use when the user asks something that requires specific or factual
        information related to the organization."""
        print("Querying for----------------->:", query)
        # self.session.say("please wait a moment", allow_interruptions=False)
        thinking_messages = [
            "Hmm!, checking that real quick...",
            "Okay, give me a second while I look that up...",
            "Let me find that for you..",
            "Uh-huh, let me take a quick look into that...",
            "Okay Hang on, I’m on it...",
            "Alright, let me see",
            "One moment, please hold while I check...",
            "I understand, let me find that information for you...",
            "",
        ]
        await self.session.say(random.choice(thinking_messages))
        # self.session.generate_reply(
        #     user_input="query",
        #     instructions="say a conversational filler words such as {filler}. "
        #     "You must not exceed 6 words. You must never answer the user query. "
        #     "Dont use the same filler words consecutively",
        #     allow_interruptions=False,
        # )
        logger.info(f"Querying after the filler word -------- for: {query}")
        query_engine = index.as_query_engine(use_async=True)
        res = await query_engine.aquery(query)
        # self.session.generate_reply(
        #     instructions=f"answer the user query: {query} using this knowledge {res}",
        #     allow_interruptions=False,
        # )
        return str(res)

    @function_tool
    async def conversation_finished(self):
        """(when the user confirms they don't want you anymore), call this
        function to end the conversation."""
        # interrupt any existing generation
        self.session.interrupt()
        # generate a goodbye message and hang up
        # awaiting it will ensure the message is played out before returning
        await self.session.generate_reply(
            instructions="say goodbye", allow_interruptions=False
        )
        job_ctx = get_current_job_context()
        # await self.test_aggregation()
        await job_ctx.api.room.delete_room(
            api.DeleteRoomRequest(room=job_ctx.room.name)
        )

    # @function_tool
    # async def end_call(self, ctx: RunContext):
    #     """Called when the user wants to end the call"""
    #     # let the agent finish speaking
    #     current_speech = ctx.session.current_speech
    #     if current_speech:
    #         await current_speech.wait_for_playout()
    #     await asyncio.sleep(1)
    #     await self.hangup_call()


async def entrypoint(ctx: JobContext):
    await ctx.connect()
    call_started_at = datetime.utcnow()
    ist = pytz.timezone("Asia/Kolkata")
    now = datetime.now(ist).strftime("%Y-%m-%d %H:%M")
    room_name = ctx.room.name
    participant = await ctx.wait_for_participant()
    agent_metadata = json.loads(participant.metadata)

    client = ctx.proc.userdata.get("mongo_client") or MongoClient(mongo_uri)
    db = client["nextjsdb"]
    metrics_collection = db["metrics"]
    aggregated_collection = db["aggregated_metrics"]
    users_collection = db["outbound_call_data"]
    user_db = db["users"]

    # Get STT related data
    # stt_selected = os.getenv("DEFAULT_STT_MODEL")
    # stt_model_selected = os.getenv("DEEPGRAM_STT_MODEL")
    stt_selected = agent_metadata.get("stt", "").lower()
    stt_model_selected = agent_metadata.get("sttModel", "")
    languageselected = agent_metadata.get(
        "inputLanguage", os.getenv("DEFAULT_LANGUAGE")
    )

    # Get TTS related data
    tts_selected = agent_metadata.get("tts", "").lower()
    tts_model_selected = agent_metadata.get("ttsModel", "").lower()
    tts_language_code = agent_metadata.get("ttsLanguage")
    tts_gender = agent_metadata.get("gender").lower()
    voicename = agent_metadata.get("ttsVoiceName", os.getenv("DEFAULT_VOICE"))

    # Get LLM related data
    selected_llm = agent_metadata.get("llm", os.getenv("DEFAULT_LLM_MODEL")).lower()
    selected_llm_model = agent_metadata.get("llmModel", "").lower()

    # Get other metadata
    prompt = agent_metadata.get("prompt", os.getenv("DEFAULT_CONTEXT"))
    user_id = agent_metadata.get("userId")
    agent_id = agent_metadata.get("agentId", "")

    # Debug prints to verify data
    print("Agent Metadata:", agent_metadata)
    print("User ID:", user_id)
    print("\n--------------------------- STT Config ----------------------------")
    print(f"Language: {languageselected}")
    print(f"STT Model: {stt_model_selected}")
    print("\n--------------------------- TTS Config ----------------------------")
    print(f"TTS Provider: {tts_selected}")
    print(f"TTS Model: {tts_model_selected}")
    print(f"Voice: {voicename}")
    print(f"Language Code: {tts_language_code}")
    print(f"Gender: {tts_gender}")
    print("\n--------------------------- LLM Config ----------------------------")
    print(f"LLM Provider: {selected_llm}")
    print(f"LLM Model: {selected_llm_model}")

    call_doc = {
        "room_name": ctx.room.name,
        "started_at": now,
        "user_id": ObjectId(user_id) if user_id else None,
        "call_type": "Web",
        "call_direction": "inbound",
        "post_call_info": {},  # blank at start
        "metadata": {"agentid": agent_id},
        "agent_config": agent_metadata,
    }
    users_collection.insert_one(call_doc)
    print("inserted the metadata in mongo")

    instructions = (
        "You interact with users via spoken audio. and you gender is {tts_gender}\n"
        "Follow these core behavior rules at all times:\n"
        "- Use short, natural sentences (under 20 words).\n"
        f"- Do not call {Assistant.conversation_finished} during rag search if user ask multiple question"
        "- invoke the tool answer_customer_query, when the user asks something that requires specific or factual information.\n"
        "- Avoid unpronounceable punctuation and overly complex phrases.\n"
        "- Speak clearly and concisely.\n"
        "- Maintain a friendly and helpful tone.\n"
        f"- Respond in the same language the user speaks: {languageselected}.\n"
        f"- Current date and time: {now}.\n\n"
        "=== Instructions ===\n"
        "You must follow these instructions carefully in addition to the above voice rules:\n\n"
        f"{prompt}\n"
        "- **Never say or output structured JSON like {'name': 'conversation_finished', 'parameters': {}}. or any code related language**\n"
        f"- only call the tool {Assistant.conversation_finished} when it seems like user wants to end the conversation"
        "- Only respond with natural spoken language intended for the user.\n"
        "=== End of Custom Instructions ==="
    )
```
t
Thank you for sharing the detailed code snippet. It seems like you're implementing a modular assistant using LiveKit with various functionalities, including RAG search, STT, TTS, and LLM integrations. If you have a specific question or issue you'd like assistance with, please let me know, and I'll do my best to help!
a
I'm getting this error when I ask multiple questions back to back: the agent makes multiple tool calls and ends up with this error:

```
tool_call_ids did not have response messages: call_5k1mtF6Hr26NfTtXcyJIxAmB", 'type': 'invalid_request_error', 'param': 'messages.[21].role', 'code': None}} (status_code=400, request_id=req_8bd59e4373ca7f9d68be70676abac8f6, body={'message': "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_5k1mtF6Hr26NfTtXcyJIxAmB", 'type': 'invalid_request_error', 'param': 'messages.[21].role', 'code': None})
{"llm": "livekit.plugins.openai.llm.LLM", "attempt": 2, "pid": 14423, "job_id": "AJ_WH8syR8L83cZ"}
2025-06-07 14:51:20,317 - WARNING livekit.agents - failed to generate LLM completion, retrying in 2.0s
```
t
The error you're encountering, where multiple back-to-back tool calls leave some `tool_call_id`s without responses, suggests that the system is not awaiting the response for each tool call before initiating the next one, which produces the invalid request error. To resolve this, ensure that each tool call completes and its response is processed before another tool call is made. This could involve:

1. Sequential execution: modify your code to await the completion of each tool call before initiating the next one (a minimal sketch follows below).
2. Error handling: implement error handling to catch and log any issues with tool call responses.
3. Debugging: add logging to track the flow of tool calls and responses so you can identify where a response goes missing.

If you can share more details about how the tool calls are being made, I can help troubleshoot further.
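A minimal sketch of the sequential approach, reusing the `answer_customer_query` tool and `index` from your code above (the lock-based serialization is an illustrative technique, not a LiveKit-specific API):

```python
import asyncio

# Serialize back-to-back RAG tool calls: each call finishes (and its tool
# response message is recorded against its tool_call_id) before the next
# call is allowed to start.
_rag_lock = asyncio.Lock()

async def answer_customer_query(self, query: str) -> str:
    async with _rag_lock:  # one query at a time
        query_engine = index.as_query_engine(use_async=True)
        try:
            res = await query_engine.aquery(query)
        except Exception:
            # Return *something* so the tool_call_id still gets a response
            # message instead of being left dangling in the chat context.
            logger.exception("RAG query failed for query: %s", query)
            return "Sorry, I couldn't retrieve that right now."
        return str(res)
```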