helpful-sugar-58738
02/25/2025, 9:30 PM

gray-monkey-83525
02/25/2025, 9:30 PM
To use a custom LLM server with LiveKit, you can follow these steps:
1. Create a custom LLM class: Subclass llm.LLM and implement the necessary methods to interact with your custom LLM server. This includes setting up the API client and handling requests and responses.
2. Integrate with LiveKit: Use this custom LLM class in your LiveKit agent setup. You can pass it as the llm parameter to VoicePipelineAgent or any other agent that requires LLM integration.
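In outline, those two steps amount to something like the following minimal sketch (the method body is a placeholder; the full examples later in this thread fill it in):

from livekit.agents import llm

class CustomLLM(llm.LLM):
    """Minimal skeleton of a wrapper around a self-hosted LLM HTTP API."""

    def __init__(self, api_url: str, api_key: str):
        super().__init__()
        self.api_url = api_url
        self.api_key = api_key

    async def chat(self, *, chat_ctx: llm.ChatContext, fnc_ctx: llm.FunctionContext = None,
                   temperature: float = None, n: int = 1,
                   parallel_tool_calls: bool = None) -> llm.LLMStream:
        # Send the latest message from chat_ctx to your server here and wrap
        # the reply in an LLMStream subclass (see the full examples below).
        raise NotImplementedError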
For more detailed guidance, you can refer to the LiveKit documentation on creating custom LLM integrations and the example code provided in the LiveKit community discussions.
Would you like more detailed steps or a code example?
Sources:
- LiveKit documentation
- example code

helpful-sugar-58738
02/25/2025, 9:31 PM

gray-monkey-83525
02/25/2025, 9:31 PM
1. Create a Custom LLM Class:
- Define a new class that inherits from llm.LLM.
- Implement the chat method to handle communication with your custom LLM server.
2. Set Up API Client:
- Use an HTTP client like aiohttp to send requests to your LLM server.
- Ensure you handle authentication and any required headers.
3. Override the chat Method:
- In the chat method, send the input text to your LLM server and receive the response.
- Wrap the response in a ChatChunk object and return it.
4. Integrate with LiveKit:
- Use your custom LLM class in the VoicePipelineAgent or any other agent setup.
Code Example
Here's a basic example of how you might implement this:
import aiohttp
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, deepgram, silero  # requires the livekit-plugins-* packages

class CustomLLM(llm.LLM):
    def __init__(self, api_url, api_key):
        super().__init__()
        self.api_url = api_url
        self.api_key = api_key
        self.client = aiohttp.ClientSession()

    async def chat(self, *, chat_ctx: llm.ChatContext, fnc_ctx: llm.FunctionContext = None, temperature: float = None, n: int = 1, parallel_tool_calls: bool = None) -> llm.LLMStream:
        # Prepare the request payload
        payload = {
            "input": chat_ctx.messages[-1].text,
            "temperature": temperature or 0.5
        }
        headers = {
            "Authorization": f"Bearer {self.api_key}"
        }
        # Send request to the custom LLM server
        async with self.client.post(self.api_url, json=payload, headers=headers) as response:
            if response.status == 200:
                data = await response.json()
                # Create a ChatChunk from the response
                chat_chunk = llm.ChatChunk(
                    request_id="custom_llm",
                    choices=[llm.Choice(delta=llm.ChoiceDelta(role="assistant", content=data['output']))]
                )
                return llm.LLMStream(chat_chunk)
            else:
                raise Exception(f"Failed to get response from LLM server: {response.status}")

    async def aclose(self):
        await self.client.close()

# Usage in a VoicePipelineAgent
custom_llm = CustomLLM(api_url="https://your-llm-server.com/api", api_key="your_api_key")
agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=custom_llm,
    tts=cartesia.TTS(),
    chat_ctx=llm.ChatContext().append(role="system", text="Your initial prompt"),
)

gray-monkey-83525
02/25/2025, 9:31 PM
Explanation:
- API Client: Uses aiohttp to send requests to the LLM server and receive responses.
- chat Method: This method sends the latest message from the ChatContext to the LLM server and processes the response.
- Integration: The CustomLLM instance is used in the VoicePipelineAgent, allowing it to process conversations using your custom LLM.
This setup allows you to integrate any LLM server that provides an API, giving you flexibility in how you handle language processing.
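One caveat about the example above: it creates the aiohttp.ClientSession eagerly in __init__, which can happen before an event loop is running and may trigger warnings on some aiohttp and Python versions. A possible variation (not from the thread) is to create the session lazily on first use:

from typing import Optional

import aiohttp
from livekit.agents import llm

class CustomLLM(llm.LLM):
    def __init__(self, api_url: str, api_key: str):
        super().__init__()
        self.api_url = api_url
        self.api_key = api_key
        self._client: Optional[aiohttp.ClientSession] = None

    def _ensure_client(self) -> aiohttp.ClientSession:
        # Create the session on first use, inside the running event loop,
        # and recreate it if it was closed earlier.
        if self._client is None or self._client.closed:
            self._client = aiohttp.ClientSession()
        return self._client

    async def aclose(self):
        # Close the session only if it was ever created.
        if self._client is not None and not self._client.closed:
            await self._client.close()

    # chat() stays the same as in the example above, except it calls
    # self._ensure_client() instead of using self.client directly.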
Sources:
- Example code

helpful-sugar-58738
02/25/2025, 10:50 PM

gray-monkey-83525
02/25/2025, 10:50 PM
The error occurs because the LLMStream class is abstract and requires an implementation of its abstract methods before it can be instantiated. You need to create a subclass of LLMStream and implement the required abstract methods.
Step-by-Step Solution
1. Create a Subclass of LLMStream:
- Define a new class that inherits from llm.LLMStream.
- Implement the abstract methods, such as _main_task.
2. Implement the _main_task Method:
- This method should handle the logic for processing the LLM response stream.
Code Example
Here's how you can modify the code to include a custom LLMStream subclass:
import aiohttp
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, deepgram, silero  # requires the livekit-plugins-* packages

class CustomLLMStream(llm.LLMStream):
    def __init__(self, llm, chat_ctx, fnc_ctx, response_content):
        super().__init__(llm=llm, chat_ctx=chat_ctx, fnc_ctx=fnc_ctx)
        self.response_content = response_content

    async def _main_task(self):
        # Send the response content as a ChatChunk
        chat_chunk = llm.ChatChunk(
            request_id="custom_llm",
            choices=[llm.Choice(delta=llm.ChoiceDelta(role="assistant", content=self.response_content))]
        )
        self._event_ch.send_nowait(chat_chunk)

class CustomLLM(llm.LLM):
    def __init__(self, api_url, api_key):
        super().__init__()
        self.api_url = api_url
        self.api_key = api_key
        self.client = aiohttp.ClientSession()

    async def chat(self, *, chat_ctx: llm.ChatContext, fnc_ctx: llm.FunctionContext = None, temperature: float = None, n: int = 1, parallel_tool_calls: bool = None) -> llm.LLMStream:
        # Prepare the request payload
        payload = {
            "input": chat_ctx.messages[-1].text,
            "temperature": temperature or 0.5
        }
        headers = {
            "Authorization": f"Bearer {self.api_key}"
        }
        # Send request to the custom LLM server
        async with self.client.post(self.api_url, json=payload, headers=headers) as response:
            if response.status == 200:
                data = await response.json()
                # Return a CustomLLMStream instance wrapping the response text
                return CustomLLMStream(self, chat_ctx, fnc_ctx, data['output'])
            else:
                raise Exception(f"Failed to get response from LLM server: {response.status}")

    async def aclose(self):
        await self.client.close()

# Usage in a VoicePipelineAgent
custom_llm = CustomLLM(api_url="https://your-llm-server.com/api", api_key="your_api_key")
agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(model="nova-2-general"),
    llm=custom_llm,
    tts=cartesia.TTS(),
    chat_ctx=llm.ChatContext().append(role="system", text="Your initial prompt"),
)

gray-monkey-83525
02/25/2025, 10:51 PM
Explanation:
- CustomLLMStream Class: The custom subclass of LLMStream implements the _main_task method, which sends the response content as a ChatChunk.
- chat Method: The chat method in CustomLLM now returns an instance of CustomLLMStream, ensuring that the abstract methods are implemented.
This approach should resolve the error and allow you to use your custom LLM server with LiveKit.
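As a possible refinement (a sketch that reuses only the ChatChunk, Choice, and ChoiceDelta types from the example above; it is not from the thread), _main_task could emit the reply as several smaller chunks, for example one per sentence, so the TTS stage can start speaking before the whole response has been forwarded:

import re

from livekit.agents import llm

class ChunkedCustomLLMStream(llm.LLMStream):
    def __init__(self, llm_, chat_ctx, fnc_ctx, response_content):
        super().__init__(llm=llm_, chat_ctx=chat_ctx, fnc_ctx=fnc_ctx)
        self.response_content = response_content

    async def _main_task(self):
        # Split the full reply into sentences and forward each one as its own
        # ChatChunk so downstream stages receive the text incrementally.
        for sentence in re.split(r"(?<=[.!?])\s+", self.response_content):
            if sentence:
                self._event_ch.send_nowait(
                    llm.ChatChunk(
                        request_id="custom_llm",
                        choices=[llm.Choice(delta=llm.ChoiceDelta(role="assistant", content=sentence))],
                    )
                )

CustomLLM.chat would then return ChunkedCustomLLMStream(self, chat_ctx, fnc_ctx, data['output']) in place of CustomLLMStream.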
Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1732699668.482419