# ask-ai
a
How does this work? How are the `_run` methods in the LLM modules called?
t
Thanks for reaching out, Aditya. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
The `_run` methods in the LLM modules act as the core execution engine for handling LLM interactions within LiveKit. They are called automatically by the `LLMStream` class lifecycle. When an `LLMStream` is initialized, it starts a main task which repeatedly calls `_run()` via the `_main_task()` method. This method manages retries and error handling if API errors occur. Each provider-specific subclass implements its own `_run()` method, which is responsible for:

- Sending requests to the specific LLM provider's API
- Processing streaming responses and emitting standardized `ChatChunk` events
- Handling error recovery and connection issues

The `_run` method processes the LLM's streamed response, converts it to LiveKit's format, and communicates through event channels for further action. Error handling and retries are coordinated by the higher-level `_main_task()` wrapper. The entire system is designed for robust streaming, fallback, and recovery across multiple LLM providers. This ensures provider independence and consistent behavior across different LLM integrations. For further technical specifics (e.g., code samples of invocation and provider communication), please request more targeted details.

Sources: livekit-agents/livekit/agents/llm/fallback_adapter.py
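
To make that lifecycle concrete, here's a minimal, self-contained sketch of the pattern. All names here (`BaseLLMStream`, `FakeProviderStream`, `APIError`, `ChatChunk`) are illustrative stand-ins, not LiveKit's actual classes:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ChatChunk:
    """Illustrative stand-in for a standardized streaming chunk."""
    delta: str


class APIError(Exception):
    pass


class BaseLLMStream:
    """Toy model of the lifecycle described above: the stream owns a
    main task that wraps the provider-specific _run() with retries."""

    def __init__(self, max_retries: int = 3) -> None:
        self._max_retries = max_retries
        self._event_ch: asyncio.Queue = asyncio.Queue()
        # The main task starts automatically when the stream is created.
        self._task = asyncio.create_task(self._main_task())

    async def _main_task(self) -> None:
        # Coordinates retries and error handling around _run().
        for attempt in range(self._max_retries):
            try:
                await self._run()
                break
            except APIError:
                if attempt == self._max_retries - 1:
                    raise
                await asyncio.sleep(2 ** attempt)  # simple backoff
        await self._event_ch.put(None)  # sentinel: end of stream

    async def _run(self) -> None:
        # Each provider-specific subclass implements this.
        raise NotImplementedError


class FakeProviderStream(BaseLLMStream):
    async def _run(self) -> None:
        # Pretend these tokens came from a provider's streaming API and
        # were converted into standardized ChatChunk events.
        for token in ("Hello", ", ", "world"):
            await self._event_ch.put(ChatChunk(delta=token))


async def main() -> None:
    stream = FakeProviderStream()
    while (chunk := await stream._event_ch.get()) is not None:
        print(chunk.delta, end="")
    print()


asyncio.run(main())
```

The key idea is the split of responsibilities: the base class owns the lifecycle (`_main_task()` with retries and the end-of-stream signal), while each subclass only supplies the provider-specific `_run()`.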
a
How can I skip this method for some specific messages?
t
Hey there! I totally get what you're looking for. From what I can see, the `LLMStream` classes don't have a built-in way to skip processing certain messages out of the box. But don't worry, I've got a couple of friendly solutions for you! Here are two approaches that might help:

- **Agent/pre-LLM hooks:** The `WakeWordHandler` comes with a super handy `before_llm_callback` that agents like `VoicePipelineAgent` use. Think of it like a gatekeeper: if this callback says "nope" (returns `False`), the whole LLM process gets skipped for that message. It's like catching things before they even start!
- **Subclass override:** If you're looking for something more tailored, you can create your own custom `LLMStream` and tweak the `_run` method. Add some checks at the beginning to decide whether you want to process the message or do something different (see the sketch after the sources below).

The best part? Most developers handle this by filtering messages before they even start processing, which keeps things clean and efficient. Want me to walk you through a specific example? Just let me know the details, and I'll help you craft the perfect solution!

Sources: pipeline-llm/llm_powered_content_filter.py | agent_extensions/agent_extensions/handlers/wake_word_handler.py | livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/experimental/realti...
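
To illustrate the subclass-override approach, here's a minimal sketch built on the same toy `BaseLLMStream` model as the earlier example. Again, these classes are illustrative stand-ins (not LiveKit's real API), and the `/mute`/`/ignore` skip rule is purely hypothetical:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ChatChunk:
    delta: str


class BaseLLMStream:
    """Same toy model as the earlier sketch (not LiveKit's real class)."""

    def __init__(self, chat_ctx: list[dict]) -> None:
        self._chat_ctx = chat_ctx
        self._event_ch: asyncio.Queue = asyncio.Queue()

    async def _run(self) -> None:
        # Stand-in for the real provider call that streams a reply.
        await self._event_ch.put(ChatChunk(delta="(provider reply)"))
        await self._event_ch.put(None)


class FilteringLLMStream(BaseLLMStream):
    # Hypothetical rule: silently drop "slash commands".
    SKIP_PREFIXES = ("/mute", "/ignore")

    async def _run(self) -> None:
        last = self._chat_ctx[-1]["content"] if self._chat_ctx else ""
        if last.startswith(self.SKIP_PREFIXES):
            # Short-circuit: end the stream without calling the provider.
            await self._event_ch.put(None)
            return
        await super()._run()


async def main() -> None:
    for text in ("/mute for a bit", "hello there"):
        stream = FilteringLLMStream([{"role": "user", "content": text}])
        await stream._run()
        chunks = []
        while (c := await stream._event_ch.get()) is not None:
            chunks.append(c.delta)
        print(f"{text!r} -> {''.join(chunks) or '(skipped)'}")


asyncio.run(main())
```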
a
Give an example of this.
t
Hey there! Let me walk you through a super cool way to use the `before_llm_callback` in a `WakeWordHandler` that helps you skip LLM inference for specific messages in LiveKit. Trust me, it's easier than it sounds!
```python
from enum import Enum
from typing import Optional


class ListeningState(Enum):
    # Minimal state enum assumed by the handler (values illustrative).
    IDLE = "idle"
    LISTENING = "listening"


class WakeWordHandler:
    def __init__(self, wake_word: str = "sam"):
        self._current_state = ListeningState.IDLE
        self._wake_word = wake_word.lower()

    async def before_llm_callback(self, agent, chat_ctx) -> Optional[bool]:
        last_message = chat_ctx.messages[-1] if chat_ctx.messages else None

        if last_message and last_message.role == "user":
            cleaned_message = last_message.content.lower().strip()
            # If the message is only the wake word (with optional trailing
            # punctuation), return False to skip LLM inference.
            if cleaned_message in (self._wake_word, f"{self._wake_word}.", f"{self._wake_word}?"):
                return False
        # None means "no opinion": let the default LLM step run.
        return None
```
Connecting it with `VoicePipelineAgent`:

```python
wake_word_handler = WakeWordHandler(wake_word="sam")

agent = VoicePipelineAgent(
    vad=...,  # Voice activity detector
    stt=...,  # Speech-to-text
    llm=...,  # LLM instance
    tts=...,  # Text-to-speech
    chat_ctx=initial_ctx,
    before_llm_cb=wake_word_handler.before_llm_callback,  # Pass the callback
)

agent.start(ctx.room, participant)
```
Here's the magic: when someone just says the wake word (like "sam", "sam.", or "sam?"), the `before_llm_callback` returns `False`. This means no LLM processing happens, and the `_run` method takes a break. Pretty neat, right? This approach gives you the flexibility to control exactly which messages trigger LLM inference, all based on the conversation's context. It's like having a smart assistant who knows exactly when to listen and when to stay quiet!

Sources: BeforeLLMCallback | LiveKit Agents | chain-of-thought-tts/README.mdoc | VoicePipelineAgent | LiveKit Docs