I am using LiveKit and trying to integrate a LangGraph agent...
# ask-ai
p
I am using LiveKit and trying to integrate a LangGraph agent instead of the LLM, but it is giving me an issue.
t
Thanks for reaching out, arpan. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Hey there! Could you share a bit more about the trouble you're having with integrating the LangGraph agent with LiveKit? For instance:
• Are you seeing any error messages or something acting up unexpectedly?
• What have you tried so far to get the integration going?
• Are you working with any particular SDKs or APIs for this setup?
Once I have a clearer picture, I'll be able to help you out much better!
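In the meantime, here's a rough sketch of how that wiring often looks. It assumes a recent livekit-agents version that ships the livekit-plugins-langchain package, whose `LLMAdapter` wraps a compiled LangGraph graph so it can stand in for a plain LLM; if your setup differs, treat these names as placeholders rather than the exact API:

```python
from livekit.agents import AgentSession
from livekit.plugins import langchain
from langgraph.graph import StateGraph, MessagesState, START, END

def create_graph():
    # Any LangGraph workflow that consumes and produces chat messages
    def chatbot(state: MessagesState):
        # placeholder: return the new messages produced by your graph here
        return {"messages": []}

    builder = StateGraph(MessagesState)
    builder.add_node("chatbot", chatbot)
    builder.add_edge(START, "chatbot")
    builder.add_edge("chatbot", END)
    return builder.compile()

# Hand the compiled graph to the session in place of a plain LLM
session = AgentSession(
    llm=langchain.LLMAdapter(create_graph()),
    # stt=..., tts=..., vad=... as usual
)
```

If the error you're seeing doesn't match this shape, sharing the full traceback will help pin it down.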
p
how can I reduce memory usage in VAD?
t
Hey there! If you're looking to reduce memory usage in Voice Activity Detection (VAD) within LiveKit, here are some tips you can try out:
• Reduce `max_buffered_speech`: lower the maximum duration of speech that VAD buffers. For instance, going from 60 seconds down to 10 seconds can really help save memory.
• Use an 8 kHz sample rate: switching from 16 kHz to 8 kHz cuts the number of samples in half, which means less memory and CPU usage.
• Adjust `prefix_padding_duration`: bringing this down a bit (like from 0.5 seconds to 0.25 seconds) can slightly shrink the buffer size.
• Enable `force_cpu`: make sure the VAD runs on the CPU to avoid using up GPU memory.
• Reuse the VAD instance: sharing a single VAD instance across sessions can help prevent extra memory use.
• Use a turn detection model: you might also consider swapping in or adding a turn detection model, which usually takes up less memory (there's a quick sketch right after this list).
If you'd like, I'm happy to walk you through any of these steps in more detail. Just let me know!
Sources: Silero VAD plugin | LiveKit Docs | livekit-plugins/livekit-plugins-baseten/livekit/plugins/baseten/stt.py | plugins/silero/src/vad.ts | Turn detection and interruptions | LiveKit Docs
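On that last point, here's a minimal sketch of plugging a turn detection model into a session. It assumes the livekit-plugins-turn-detector package and its multilingual model; swap in whichever model your version provides:

```python
from livekit.agents import AgentSession
from livekit.plugins import silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

# The turn detector works alongside VAD rather than replacing it,
# so a lightweight VAD configuration still applies here.
session = AgentSession(
    vad=silero.VAD.load(sample_rate=8000, max_buffered_speech=10.0),
    turn_detection=MultilingualModel(),
    # stt=..., llm=..., tts=... as usual
)
```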
p
example code
t
Here’s a handy example code snippet to help you reduce memory usage in VAD by tweaking a few key settings:
```python
from livekit.agents import Agent, AgentSession, JobContext, JobProcess
from livekit.plugins import silero

def prewarm(proc: JobProcess):
    # Load the VAD once per worker process with memory-friendly settings,
    # so every job can reuse the same instance
    proc.userdata["vad"] = silero.VAD.load(
        max_buffered_speech=10.0,       # cut buffered speech down to 10 seconds
        sample_rate=8000,               # 8 kHz halves the samples vs. 16 kHz
        prefix_padding_duration=0.25,   # shorten padding to 0.25 seconds
        force_cpu=True,                 # keep the VAD off the GPU
    )

async def entrypoint(ctx: JobContext):
    # Reuse the prewarmed VAD instance instead of loading a new one per job
    session = AgentSession(
        vad=ctx.proc.userdata["vad"],
        # stt=..., llm=..., tts=... as usual
    )
    await session.start(
        agent=Agent(instructions="You are a helpful voice assistant."),
        room=ctx.room,
    )
```
This setup helps keep memory use low without losing any functionality. If you want me to walk you through it or share more examples, just let me know—I’m here to help!
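For completeness, here's how those two functions are typically registered with the worker; this is a minimal sketch assuming the standard `cli.run_app` pattern from livekit-agents:

```python
from livekit.agents import WorkerOptions, cli

if __name__ == "__main__":
    cli.run_app(
        WorkerOptions(
            entrypoint_fnc=entrypoint,  # runs once per job
            prewarm_fnc=prewarm,        # runs once per process, loads the shared VAD
        )
    )
```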
p
can you explain tool calling and parallel tool calling in OpenAI?
t
Hey there! Just wanted to share a bit about how tool calling works in OpenAI integrations. It lets the model reach out and use external functions based on what you ask. You control this with the `tool_choice` parameter, which you can set to:
• `auto`: the model decides if it should use tools.
• `required`: the model has to use tools.
• `none`: no tool usage at all.
There's also parallel tool calling, which means the model can make several tool calls at the same time, which is super handy for tackling complex tasks more efficiently. This is controlled by the `parallel_tool_calls` parameter (or `parallelToolCalls` if you're using Node.js). Good news: it's turned on by default in OpenAI integrations for LiveKit. To set this up, you just adjust these parameters when initializing the LLM. Here's a quick example:
```python
from livekit.plugins import openai

llm = openai.LLM(
    api_key="your_api_key",
    tool_choice="auto",
    parallel_tool_calls=True,
)
```
If you have any questions or want me to walk you through specific examples, just let me know—I’m here to help! Sources: Fireworks AI LLM integration guide | LiveKit Docs | Cerebras LLM integration guide | LiveKit Docs | Telnyx LLM integration guide | LiveKit Docs | DeepSeek LLM integration guide | LiveKit Docs
p
can you provide an example in Python?
t
Hey there! Here’s a friendly example to show you how to set up tool calling and parallel tool calling in Python using the LiveKit Agents SDK:
```python
import asyncio

from livekit.agents import ChatContext
from livekit.plugins import openai

# Initialize the LLM with the OpenAI integration
llm = openai.LLM(
    api_key="your_openai_api_key",  # swap this out for your OpenAI API key
    tool_choice="auto",             # the model decides if it wants to use tools
    parallel_tool_calls=True,       # turn on parallel tool calling for efficiency
)

async def main():
    # Build the conversation and stream a chat completion
    chat_ctx = ChatContext()
    chat_ctx.add_message(role="user", content="What is the weather and time in New York?")

    async with llm.chat(chat_ctx=chat_ctx) as stream:
        async for chunk in stream:
            if chunk.delta and chunk.delta.content:
                print(chunk.delta.content, end="")

asyncio.run(main())
```
This little example shows you how to get tool calling and parallel tool calling up and running with OpenAI and LiveKit. Just remember to replace `your_openai_api_key` with your actual API key. If you want me to walk you through anything else or need more examples, just give me a shout!
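To actually see parallel tool calls in action, the model also needs tools it can call. Here's a hedged sketch using the `@function_tool` decorator with the `Agent`/`AgentSession` API; `get_weather` and `get_time` are hypothetical placeholder tools, not real integrations:

```python
from livekit.agents import Agent, AgentSession, RunContext, function_tool
from livekit.plugins import openai

@function_tool
async def get_weather(context: RunContext, city: str) -> dict:
    """Look up the current weather for a city."""
    return {"city": city, "weather": "sunny", "temperature_c": 22}  # placeholder data

@function_tool
async def get_time(context: RunContext, city: str) -> dict:
    """Look up the current local time for a city."""
    return {"city": city, "time": "14:05"}  # placeholder data

# A question like "What is the weather and time in New York?" can now
# trigger both tools in a single model turn when parallel calls are enabled.
agent = Agent(
    instructions="Answer questions about weather and time using your tools.",
    tools=[get_weather, get_time],
)

session = AgentSession(
    llm=openai.LLM(tool_choice="auto", parallel_tool_calls=True),
)
# Inside an entrypoint: await session.start(agent=agent, room=ctx.room)
```

With both tools registered, a single user turn can produce two tool calls that run concurrently.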
p
how can I add web search to an OpenAI LLM?
t
Hey there! If you want to add web search functionality to an OpenAI LLM using LiveKit, you’ve got a couple of great options. You can either create a custom search tool (for example with DuckDuckGo) or use a search-optimized model such as Perplexity. Here’s a simple example of how to set up a custom web search tool:
```python
from duckduckgo_search import DDGS
from livekit.agents import Agent, AgentSession, RunContext, function_tool
from livekit.plugins import openai

# Define a custom web search tool
@function_tool
async def search_web(context: RunContext, query: str) -> list[dict]:
    """Perform a web search using DuckDuckGo."""
    return DDGS().text(query, max_results=3)  # return the top 3 results

# Give the agent access to the tool
agent = Agent(
    instructions="You are a helpful assistant that can search the web.",
    tools=[search_web],
)

# Initialize the LLM with the OpenAI integration
llm = openai.LLM(
    api_key="your_openai_api_key",
    tool_choice="auto",
)

# Create an agent session with that LLM
session = AgentSession(llm=llm)

# Inside your entrypoint, hand the agent to the session:
# await session.start(agent=agent, room=ctx.room)
```
Or, if you prefer, you can tap into Perplexity models that come with built-in search capabilities, like this:
```python
from livekit.plugins import openai

llm = openai.LLM.with_perplexity(
    api_key="your_perplexity_api_key",
    model="llama-3.1-sonar-small-128k-chat",
)
```
If you want me to walk you through more examples or have any questions, just let me know—I’m here to help! Sources: Ollama integration guide | LiveKit Docs | OpenAI LLM integration guide | LiveKit Docs | Perplexity LLM integration guide | LiveKit Docs | livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py