I am using LiveKit and trying to integrate a LangGraph agent...
# ask-ai
p
I am using LiveKit and trying to integrate a LangGraph agent instead of the LLM, but it is giving me an issue.
t
Thanks for reaching out, arpan. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Hey there! Could you share a bit more about the trouble you're having with integrating the LangGraph agent with LiveKit? For instance:
• Are you seeing any error messages or something acting up unexpectedly?
• What have you tried so far to get the integration going?
• Are you working with any particular SDKs or APIs for this setup?
Once I have a clearer picture, I'll be able to help you out much better!
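In the meantime, here's a rough sketch of how that wiring often looks. It assumes a recent livekit-agents version that ships the livekit-plugins-langchain package, whose `LLMAdapter` wraps a compiled LangGraph graph so it can stand in for a plain LLM; if your setup differs, treat these names as placeholders rather than the exact API:

```python
from livekit.agents import AgentSession
from livekit.plugins import langchain
from langgraph.graph import StateGraph, MessagesState, START, END

def create_graph():
    # Any LangGraph workflow that consumes and produces chat messages
    def chatbot(state: MessagesState):
        # placeholder: return the new messages produced by your graph here
        return {"messages": []}

    builder = StateGraph(MessagesState)
    builder.add_node("chatbot", chatbot)
    builder.add_edge(START, "chatbot")
    builder.add_edge("chatbot", END)
    return builder.compile()

# Hand the compiled graph to the session in place of a plain LLM
session = AgentSession(
    llm=langchain.LLMAdapter(create_graph()),
    # stt=..., tts=..., vad=... as usual
)
```

If the error you're seeing doesn't match this shape, sharing the full traceback will help pin it down.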
p
how can I reduce memory usage in VAD?
t
Hey there! If you're looking to reduce memory usage in Voice Activity Detection (VAD) within LiveKit, here are some tips you can try out:
• Reduce `max_buffered_speech`: lower the maximum duration of speech that VAD buffers. For instance, going from 60 seconds down to 10 seconds can really help save memory.
• Use an 8 kHz sample rate: switching from 16 kHz to 8 kHz cuts the number of samples in half, which means less memory and CPU usage.
• Adjust `prefix_padding_duration`: bringing this down a bit (like from 0.5 seconds to 0.25 seconds) can slightly shrink the buffer size.
• Enable `force_cpu`: make sure the VAD runs on the CPU to avoid using up GPU memory.
• Reuse the VAD instance: sharing a single VAD instance across sessions can help prevent extra memory use.
• Use a turn detection model: you might also consider swapping in or adding a turn detection model, which usually takes up less memory (there's a quick sketch right after this list).
If you'd like, I'm happy to walk you through any of these steps in more detail. Just let me know!
Sources: Silero VAD plugin | LiveKit Docs | livekit-plugins/livekit-plugins-baseten/livekit/plugins/baseten/stt.py | plugins/silero/src/vad.ts | Turn detection and interruptions | LiveKit Docs
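On that last point, here's a minimal sketch of plugging a turn detection model into a session. It assumes the livekit-plugins-turn-detector package and its multilingual model; swap in whichever model your version provides:

```python
from livekit.agents import AgentSession
from livekit.plugins import silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

# The turn detector works alongside VAD rather than replacing it,
# so a lightweight VAD configuration still applies here.
session = AgentSession(
    vad=silero.VAD.load(sample_rate=8000, max_buffered_speech=10.0),
    turn_detection=MultilingualModel(),
    # stt=..., llm=..., tts=... as usual
)
```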
p
example code
t
Here’s a handy example code snippet to help you reduce memory usage in VAD by tweaking a few key settings:
```python
from livekit.agents import Agent, AgentSession, JobContext, JobProcess
from livekit.plugins import silero

def prewarm(proc: JobProcess):
    # Load the VAD once per worker process with memory-friendly settings,
    # so every job can reuse the same instance
    proc.userdata["vad"] = silero.VAD.load(
        max_buffered_speech=10.0,       # cut buffered speech down to 10 seconds
        sample_rate=8000,               # 8 kHz halves the samples vs. 16 kHz
        prefix_padding_duration=0.25,   # shorten padding to 0.25 seconds
        force_cpu=True,                 # keep the VAD off the GPU
    )

async def entrypoint(ctx: JobContext):
    # Reuse the prewarmed VAD instance instead of loading a new one per job
    session = AgentSession(
        vad=ctx.proc.userdata["vad"],
        # stt=..., llm=..., tts=... as usual
    )
    await session.start(
        agent=Agent(instructions="You are a helpful voice assistant."),
        room=ctx.room,
    )
```
This setup helps keep memory use low without losing any functionality. If you want me to walk you through it or share more examples, just let me know—I’m here to help!
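For completeness, here's how those two functions are typically registered with the worker; this is a minimal sketch assuming the standard `cli.run_app` pattern from livekit-agents:

```python
from livekit.agents import WorkerOptions, cli

if __name__ == "__main__":
    cli.run_app(
        WorkerOptions(
            entrypoint_fnc=entrypoint,  # runs once per job
            prewarm_fnc=prewarm,        # runs once per process, loads the shared VAD
        )
    )
```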
p
can you explain tool calling and parallel tool calling in OpenAI?
t
Hey there! Just wanted to share a bit about how tool calling works in OpenAI integrations. It lets the model reach out and use external functions based on what you ask. You control this with the `tool_choice` parameter, which you can set to:
• `auto`: the model decides if it should use tools.
• `required`: the model has to use tools.
• `none`: no tool usage at all.
There's also parallel tool calling, which means the model can make several tool calls at the same time, which is super handy for tackling complex tasks more efficiently. This is controlled by the `parallel_tool_calls` parameter (or `parallelToolCalls` if you're using Node.js). Good news: it's turned on by default in OpenAI integrations for LiveKit. To set this up, you just adjust these parameters when initializing the LLM. Here's a quick example:
```python
from livekit.plugins import openai

llm = openai.LLM(
    api_key="your_api_key",
    tool_choice="auto",
    parallel_tool_calls=True,
)
```
If you have any questions or want me to walk you through specific examples, just let me know—I’m here to help! Sources: Fireworks AI LLM integration guide | LiveKit Docs | Cerebras LLM integration guide | LiveKit Docs | Telnyx LLM integration guide | LiveKit Docs | DeepSeek LLM integration guide | LiveKit Docs
p
can you provide an example in Python?
t
Hey there! Here’s a friendly example to show you how to set up tool calling and parallel tool calling in Python using the LiveKit Agents SDK:
```python
import asyncio

from livekit.agents import ChatContext
from livekit.plugins import openai

# Initialize the LLM with the OpenAI integration
llm = openai.LLM(
    api_key="your_openai_api_key",  # swap this out for your OpenAI API key
    tool_choice="auto",             # the model decides if it wants to use tools
    parallel_tool_calls=True,       # turn on parallel tool calling for efficiency
)

async def main():
    # Build the conversation and stream a chat completion
    chat_ctx = ChatContext()
    chat_ctx.add_message(role="user", content="What is the weather and time in New York?")

    async with llm.chat(chat_ctx=chat_ctx) as stream:
        async for chunk in stream:
            if chunk.delta and chunk.delta.content:
                print(chunk.delta.content, end="")

asyncio.run(main())
```
This little example shows you how to get tool calling and parallel tool calling up and running with OpenAI and LiveKit. Just remember to replace `your_openai_api_key` with your actual API key. If you want me to walk you through anything else or need more examples, just give me a shout!
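To actually see parallel tool calls in action, the model also needs tools it can call. Here's a hedged sketch using the `@function_tool` decorator with the `Agent`/`AgentSession` API; `get_weather` and `get_time` are hypothetical placeholder tools, not real integrations:

```python
from livekit.agents import Agent, AgentSession, RunContext, function_tool
from livekit.plugins import openai

@function_tool
async def get_weather(context: RunContext, city: str) -> dict:
    """Look up the current weather for a city."""
    return {"city": city, "weather": "sunny", "temperature_c": 22}  # placeholder data

@function_tool
async def get_time(context: RunContext, city: str) -> dict:
    """Look up the current local time for a city."""
    return {"city": city, "time": "14:05"}  # placeholder data

# A question like "What is the weather and time in New York?" can now
# trigger both tools in a single model turn when parallel calls are enabled.
agent = Agent(
    instructions="Answer questions about weather and time using your tools.",
    tools=[get_weather, get_time],
)

session = AgentSession(
    llm=openai.LLM(tool_choice="auto", parallel_tool_calls=True),
)
# Inside an entrypoint: await session.start(agent=agent, room=ctx.room)
```

With both tools registered, a single user turn can produce two tool calls that run concurrently.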
p
how can I add web search to an OpenAI LLM?
t
Hey there! If you want to add web search functionality to an OpenAI LLM using LiveKit, you’ve got a couple of great options. You can either create a custom search tool (for example with DuckDuckGo) or use a search-optimized model such as Perplexity. Here’s a simple example of how to set up a custom web search tool:
```python
from duckduckgo_search import DDGS
from livekit.agents import Agent, AgentSession, RunContext, function_tool
from livekit.plugins import openai

# Define a custom web search tool
@function_tool
async def search_web(context: RunContext, query: str) -> list[dict]:
    """Perform a web search using DuckDuckGo."""
    return DDGS().text(query, max_results=3)  # return the top 3 results

# Give the agent access to the tool
agent = Agent(
    instructions="You are a helpful assistant that can search the web.",
    tools=[search_web],
)

# Initialize the LLM with the OpenAI integration
llm = openai.LLM(
    api_key="your_openai_api_key",
    tool_choice="auto",
)

# Create an agent session with that LLM
session = AgentSession(llm=llm)

# Inside your entrypoint, hand the agent to the session:
# await session.start(agent=agent, room=ctx.room)
```
Or, if you prefer, you can tap into Perplexity models that come with built-in search capabilities, like this:
```python
from livekit.plugins import openai

llm = openai.LLM.with_perplexity(
    api_key="your_perplexity_api_key",
    model="llama-3.1-sonar-small-128k-chat",
)
```
If you want me to walk you through more examples or have any questions, just let me know—I’m here to help! Sources: Ollama integration guide | LiveKit Docs | OpenAI LLM integration guide | LiveKit Docs | Perplexity LLM integration guide | LiveKit Docs | livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py