helpful-sugar-58738
02/25/2025, 6:20 PM

gray-monkey-83525
02/25/2025, 6:20 PM
You can configure the VoicePipelineAgent to use this custom LLM by passing it as the llm parameter.
Here's a basic example of how you might structure a custom LLM class in Python:
import aiohttp
from livekit.agents import llm

class CustomLLM(llm.LLM):
    def __init__(self, api_url, api_key):
        super().__init__()
        self.api_url = api_url
        self.api_key = api_key
        # Note: create the aiohttp session while an event loop is running
        self.client = aiohttp.ClientSession()

    async def chat(self, chat_ctx, fnc_ctx=None, temperature=None, n=1, parallel_tool_calls=None):
        # Implement your API call here
        response = await self.client.post(self.api_url, json={"key": self.api_key, "context": chat_ctx})
        return await response.json()

    async def aclose(self):
        await self.client.close()
Would you like more detailed guidance on integrating this with LiveKit's VoicePipelineAgent?
Sources:
- LiveKit documentation

helpful-sugar-58738
02/25/2025, 6:24 PM

helpful-sugar-58738
02/25/2025, 6:25 PM

gray-monkey-83525
02/25/2025, 6:25 PM
To use a custom LLM server with the VoicePipelineAgent, follow these steps:
1. Create a Custom LLM Class: Define a class that inherits from llm.LLM and implements the chat method to interact with your LLM server. This method should handle sending requests to your server and processing the responses.
2. Initialize the Custom LLM: When setting up the VoicePipelineAgent, pass an instance of your custom LLM class as the llm parameter.
3. Configure the Agent: Set up the VoicePipelineAgent with the necessary components like VAD, STT, and TTS, along with your custom LLM.
Here's a step-by-step breakdown with code:
Step 1: Create a Custom LLM Class
import aiohttp
from livekit.agents import llm

class CustomLLM(llm.LLM):
    def __init__(self, api_url, api_key):
        super().__init__()
        self.api_url = api_url
        self.api_key = api_key
        self.client = aiohttp.ClientSession()

    async def chat(self, chat_ctx, fnc_ctx=None, temperature=None, n=1, parallel_tool_calls=None):
        # Prepare the request payload
        payload = {
            "api_key": self.api_key,
            "context": chat_ctx.messages
        }
        # Send the request to your LLM server
        async with self.client.post(self.api_url, json=payload) as response:
            if response.status == 200:
                data = await response.json()
                # Process and return the response as an LLMStream
                return llm.LLMStream(data['response'])
            else:
                raise Exception(f"Failed to get response: {response.status}")

    async def aclose(self):
        await self.client.close()
Step 2: Initialize the Custom LLM
custom_llm = CustomLLM(api_url="https://your-llm-server.com/api", api_key="your_api_key")
Step 3: Configure the VoicePipelineAgent
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import silero, deepgram, cartesia

# Initialize other components
vad = silero.VAD.load()
stt = deepgram.STT(model="nova-2-general")
tts = cartesia.TTS()

# Create the agent
agent = VoicePipelineAgent(
    vad=vad,
    stt=stt,
    llm=custom_llm,
    tts=tts,
    chat_ctx=llm.ChatContext().append(role="system", text="Your initial prompt"),
    allow_interruptions=True,
    interrupt_speech_duration=0.5,
    interrupt_min_words=0,
    min_endpointing_delay=0.5
)

# Start the agent in a room with a participant
# agent.start(room, participant)
gray-monkey-83525
02/25/2025, 6:25 PM
Explanation
- Custom LLM Class: The chat method sends the chat context to your server and processes the response.
- VoicePipelineAgent: This agent is configured with your custom LLM and other necessary components like VAD, STT, and TTS.
- Agent Initialization: The agent is initialized with the custom LLM and other components, and can be started in a LiveKit room.
This setup allows you to use your custom LLM server within the LiveKit framework, leveraging the VoicePipelineAgent for voice interactions.
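For reference, here is a hedged sketch of how such an agent is typically started inside a LiveKit worker entrypoint; it assumes the standard livekit.agents worker pattern (JobContext, WorkerOptions, cli.run_app) and mirrors the components from Step 3, so adjust names and options to your SDK version.

from livekit.agents import JobContext, WorkerOptions, cli, llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import silero, deepgram, cartesia

async def entrypoint(ctx: JobContext):
    # Connect to the room assigned to this job and wait for a participant
    await ctx.connect()
    participant = await ctx.wait_for_participant()

    # Assemble the pipeline with the custom LLM from Step 1
    agent = VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=deepgram.STT(model="nova-2-general"),
        llm=CustomLLM(api_url="https://your-llm-server.com/api", api_key="your_api_key"),
        tts=cartesia.TTS(),
        chat_ctx=llm.ChatContext().append(role="system", text="Your initial prompt"),
    )
    agent.start(ctx.room, participant)

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))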
Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1732128458.518759
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1728788979.238089

helpful-sugar-58738
02/25/2025, 7:01 PM

gray-monkey-83525
02/25/2025, 7:01 PM
It sounds like you're encountering a TypeError related to an object that cannot be serialized, possibly when trying to convert it to JSON. This error often occurs when using Python's json module to serialize objects that are not natively serializable, such as custom classes or complex data types.
To resolve this, you can implement a custom JSON encoder by subclassing json.JSONEncoder and overriding the default method to handle the serialization of your custom objects. Here's a basic example:
import json

class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        # Check if the object is an instance of a custom class
        if isinstance(obj, YourCustomClass):
            return obj.to_dict()  # Convert the object to a dictionary
        # Call the base class method for other types
        return super().default(obj)

# Usage
data = {
    'key': YourCustomClass()  # Replace with your actual object
}
json_data = json.dumps(data, cls=CustomEncoder)
Would you like more detailed guidance on implementing this in your specific context?
Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1732128458.518759
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1731158203.654409

helpful-sugar-58738
02/25/2025, 7:07 PM

gray-monkey-83525
02/25/2025, 7:07 PM
The error indicates that a ChatMessage object is not JSON serializable. This typically happens when you try to serialize an object that Python's json module doesn't know how to handle. To resolve this, you need to convert the ChatMessage object into a format that can be serialized, such as a dictionary.
Here's how you can modify your code to handle this:
1. Convert ChatMessage to a Dictionary: Ensure that the ChatMessage class has a method to convert its instances to a dictionary. If it doesn't, you can add a method like as_dict() to do this.
2. Use the Conversion Method: Before sending the ChatMessage object in your API request, convert it to a dictionary.
Here's an example of how you might implement this:
# Assuming ChatMessage is a class you have defined
class ChatMessage:
    def __init__(self, message, role):
        self.message = message
        self.role = role

    def as_dict(self):
        return {
            "message": self.message,
            "role": self.role
        }

# When preparing the payload
chat_message = ChatMessage("Hello, how can I help you?", "user")
payload = {
    "api_key": "your_api_key",
    "context": [chat_message.as_dict()]  # Convert to dict
}

# Sending the request
async with self.client.post(self.api_url, json=payload) as response:
    data = await response.json()  # Handle the response
This approach ensures that the ChatMessage object is converted into a JSON-serializable format before being included in the request payload.
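If the messages come from LiveKit's own ChatContext rather than a class you define yourself, a small helper can perform the same conversion. A minimal sketch, assuming each message in chat_ctx.messages exposes role and content attributes (check the field names in your version of livekit-agents):

def serialize_chat_ctx(chat_ctx):
    # Assumes each message has `role` and `content` attributes
    return [
        {"role": msg.role, "content": msg.content}
        for msg in chat_ctx.messages
    ]

# Inside CustomLLM.chat() from Step 1, the payload could then become:
# payload = {"api_key": self.api_key, "context": serialize_chat_ctx(chat_ctx)}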
Would you like more detailed guidance on implementing this in your specific context?
Sources:
- https://docs.livekit.io/python/livekit/rtc/chat.html#livekit.rtc.chat.ChatMessage

helpful-sugar-58738
02/25/2025, 8:53 PM

gray-monkey-83525
02/25/2025, 8:53 PM
The error occurs because you are instantiating the abstract class LLMStream without providing an implementation for its abstract method _run. In Python, abstract classes are meant to be subclassed, and any abstract methods must be implemented in the subclass.
Here's how you can resolve this:
1. Create a Subclass: Define a subclass of LLMStream and implement the _run method. This method should contain the logic for processing the LLM stream.
2. Implement the _run Method: Ensure that your implementation of _run handles the specific tasks you need, such as processing data from your LLM server.
Here's an example of how you might implement this:
from livekit.agents.llm import LLMStream

class CustomLLMStream(LLMStream):
    def __init__(self, llm, chat_ctx, fnc_ctx=None):
        super().__init__(llm=llm, chat_ctx=chat_ctx, fnc_ctx=fnc_ctx)

    async def _run(self):
        # Implement your logic here
        # For example, process the LLM response and yield results
        async for chunk in self._llm_response():
            # Process each chunk
            yield chunk

    async def _llm_response(self):
        # Simulate an LLM response stream
        # Replace this with actual logic to get data from your LLM server
        yield "response_chunk_1"
        yield "response_chunk_2"

# Usage
custom_stream = CustomLLMStream(llm=your_llm_instance, chat_ctx=your_chat_context)
Explanation
- CustomLLMStream: This class inherits from LLMStream and provides an implementation for the _run method.
- _run Method: This method should contain the logic for handling the LLM stream, such as processing responses from your LLM server.
By implementing the _run method, you ensure that the LLMStream subclass can be instantiated without errors.
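To tie this back to the CustomLLM class from Step 1, the chat method would return this subclass instead of instantiating the abstract LLMStream directly. A hedged sketch reusing the names defined above; whether chat is declared async depends on the base-class signature in your version of livekit-agents:

class CustomLLM(llm.LLM):
    # ... __init__ and aclose as in Step 1 ...

    def chat(self, chat_ctx, fnc_ctx=None, temperature=None, n=1, parallel_tool_calls=None):
        # The request itself is issued inside CustomLLMStream._run,
        # so chat only constructs and returns the stream.
        return CustomLLMStream(llm=self, chat_ctx=chat_ctx, fnc_ctx=fnc_ctx)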
Would you like more detailed guidance on implementing this in your specific context?
Sources:
- LLMStream Documentation

helpful-sugar-58738
02/25/2025, 8:59 PM

gray-monkey-83525
02/25/2025, 8:59 PM
Your LLM response needs to be formatted as a ChatChunk. According to the LiveKit documentation, a ChatChunk consists of a request_id, a list of Choice objects, and optional CompletionUsage data.
Here's how you can format your LLM response:
1. Structure the Response: Your LLM response should be structured as a ChatChunk object. This includes a request_id to uniquely identify the request, a list of Choice objects that contain the response content, and optionally, usage statistics.
2. Create Choice Objects: Each Choice object should include a ChoiceDelta with the role (e.g., "assistant") and the content of the response.
3. Include Usage Data: If applicable, include CompletionUsage data to provide information about token usage.
Here's an example of how you might format the response:
from livekit.agents.llm import ChatChunk, Choice, ChoiceDelta, CompletionUsage

def format_llm_response(request_id, response_content):
    # Create a ChoiceDelta with the response content
    choice_delta = ChoiceDelta(role="assistant", content=response_content)
    # Create a Choice object
    choice = Choice(delta=choice_delta)
    # Optionally, include usage data
    usage = CompletionUsage(completion_tokens=10, prompt_tokens=5, total_tokens=15)
    # Create a ChatChunk with the request_id, choices, and usage
    chat_chunk = ChatChunk(request_id=request_id, choices=[choice], usage=usage)
    return chat_chunk

# Example usage
formatted_response = format_llm_response("unique_request_id", "This is the response content.")
Explanation
- ChatChunk: This is the main object that encapsulates the response. It includes a request_id, a list of Choice objects, and optional CompletionUsage.
- Choice and ChoiceDelta: These objects represent individual response options. The ChoiceDelta contains the role and content of the response.
- CompletionUsage: This optional data provides information about the number of tokens used in the response.
By formatting your LLM response in this way, you ensure compatibility with LiveKit's processing requirements.
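To connect this formatting with the streaming side of the thread, here is a hedged sketch that wraps each raw server chunk with format_llm_response inside the yield-based _run pattern from the CustomLLMStream example above; the real LLMStream base class may deliver chunks differently in your version, so treat this as illustrative only.

import uuid

async def _run(self):
    # One request_id shared by all chunks of this response
    request_id = str(uuid.uuid4())
    async for raw_chunk in self._llm_response():
        # Wrap each raw text chunk in a ChatChunk before handing it to the agent
        yield format_llm_response(request_id, raw_chunk)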
Would you like more detailed guidance on implementing this in your specific context?
Sources:
- LiveKit LLM Documentation

helpful-sugar-58738
02/25/2025, 9:06 PM

gray-monkey-83525
02/25/2025, 9:06 PM
To create a ChatChunk in LiveKit, you need to structure the response to include a request_id, a list of Choice objects, and optionally, CompletionUsage data. Here's a complete example:
from livekit.agents.llm import ChatChunk, Choice, ChoiceDelta, CompletionUsage

# Example function to create a ChatChunk response
def create_chat_chunk_example():
    # Unique identifier for the request
    request_id = "example_request_id"
    # Content of the response
    response_content = "This is an example response from the assistant."
    # Create a ChoiceDelta with the response content
    choice_delta = ChoiceDelta(role="assistant", content=response_content)
    # Create a Choice object
    choice = Choice(delta=choice_delta)
    # Optionally, include usage data
    usage = CompletionUsage(completion_tokens=10, prompt_tokens=5, total_tokens=15)
    # Create a ChatChunk with the request_id, choices, and usage
    chat_chunk = ChatChunk(request_id=request_id, choices=[choice], usage=usage)
    return chat_chunk

# Generate the example ChatChunk
example_chat_chunk = create_chat_chunk_example()

# Print the example ChatChunk
print(example_chat_chunk)
Explanation
- ChatChunk: This object encapsulates the entire response. It includes a request_id to uniquely identify the request, a list of Choice objects, and optional CompletionUsage data.
- Choice and ChoiceDelta: These objects represent the response options. The ChoiceDelta contains the role (e.g., "assistant") and the content of the response.
- CompletionUsage: This optional data provides information about the number of tokens used in the response, which can be useful for tracking and billing purposes.
This example demonstrates how to structure a response in a way that is compatible with LiveKit's processing requirements. If you need further customization or have specific requirements, feel free to ask for more details.
Sources:
- https://docs.livekit.io/python/livekit/plugins/anthropic/index.html#livekit.plugins.anthropic.LLMStream

helpful-sugar-58738
02/25/2025, 9:10 PM

gray-monkey-83525
02/25/2025, 9:10 PM
To serialize a ChatChunk response, you need to convert the ChatChunk and its components (Choice, ChoiceDelta, and CompletionUsage) into a JSON-serializable format. Here's how you can do it:
import json
from livekit.agents.llm import ChatChunk, Choice, ChoiceDelta, CompletionUsage

# Example function to create a ChatChunk response
def create_chat_chunk_example():
    # Unique identifier for the request
    request_id = "example_request_id"
    # Content of the response
    response_content = "This is an example response from the assistant."
    # Create a ChoiceDelta with the response content
    choice_delta = ChoiceDelta(role="assistant", content=response_content)
    # Create a Choice object
    choice = Choice(delta=choice_delta)
    # Optionally, include usage data
    usage = CompletionUsage(completion_tokens=10, prompt_tokens=5, total_tokens=15)
    # Create a ChatChunk with the request_id, choices, and usage
    chat_chunk = ChatChunk(request_id=request_id, choices=[choice], usage=usage)
    return chat_chunk

# Convert the ChatChunk to a JSON-serializable dictionary
def chat_chunk_to_dict(chat_chunk):
    return {
        "request_id": chat_chunk.request_id,
        "choices": [
            {
                "delta": {
                    "role": choice.delta.role,
                    "content": choice.delta.content
                }
            } for choice in chat_chunk.choices
        ],
        "usage": {
            "completion_tokens": chat_chunk.usage.completion_tokens,
            "prompt_tokens": chat_chunk.usage.prompt_tokens,
            "total_tokens": chat_chunk.usage.total_tokens
        } if chat_chunk.usage else None
    }

# Generate the example ChatChunk
example_chat_chunk = create_chat_chunk_example()

# Convert to JSON
example_chat_chunk_json = json.dumps(chat_chunk_to_dict(example_chat_chunk), indent=4)

# Print the JSON representation
print(example_chat_chunk_json)
Explanation
- chat_chunk_to_dict: This function converts a ChatChunk object into a dictionary that can be serialized to JSON. It extracts the request_id, choices, and usage data.
- JSON Serialization: The json.dumps function is used to convert the dictionary into a JSON string, with indent=4 for pretty-printing.
This JSON representation can be used to send the response over a network or store it in a database. If you need further customization, feel free to ask for more details.
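For the reverse direction, when your custom LLM client receives this JSON back from the server, it can rebuild a ChatChunk from the dictionary. A hedged sketch using the same field names as chat_chunk_to_dict above and the constructors shown in the earlier examples:

def dict_to_chat_chunk(data):
    # Rebuild Choice/ChoiceDelta objects from the serialized "choices" list
    choices = [
        Choice(delta=ChoiceDelta(role=c["delta"]["role"], content=c["delta"]["content"]))
        for c in data["choices"]
    ]
    # Usage is optional and may be null in the JSON
    usage = None
    if data.get("usage"):
        usage = CompletionUsage(
            completion_tokens=data["usage"]["completion_tokens"],
            prompt_tokens=data["usage"]["prompt_tokens"],
            total_tokens=data["usage"]["total_tokens"],
        )
    return ChatChunk(request_id=data["request_id"], choices=choices, usage=usage)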
Sources:
- LiveKit LLM Documentation