able-jackal-84140
07/13/2025, 4:17 PM
async def llm_node(
    self,
    chat_ctx: llm.ChatContext,
    tools: list[FunctionTool],
    model_settings: ModelSettings
) -> AsyncIterable[llm.ChatChunk]:
    # Insert custom preprocessing here
    async for chunk in Agent.default.llm_node(self, chat_ctx, tools, model_settings):
        # Insert custom postprocessing here
        if "कृपया" in chunk.delta.content:
            logger.info("\n\n\n TOOL IS HAPPENING \n\n\n")
            # await self.session.say("Playing audio", audio=audio_generator("waiting_message_hindi.wav"))
            # continue
        logger.info(f"This is the current chunks coming: {chunk}")
        yield chunk
tall-airline-16756
07/13/2025, 4:17 PM
You can use the session.say() method along with a custom audio parameter. This approach allows you to bypass TTS and directly send a pre-synthesized audio file (such as a WAV file) to the user. Here's a step-by-step approach suitable for your setup:
async def play_audio_file(self):
    audio_path = Path(__file__).parent / "audio.wav"
    with wave.open(str(audio_path), 'rb') as wav_file:
        num_channels = wav_file.getnchannels()
        sample_rate = wav_file.getframerate()
        num_frames = wav_file.getnframes()
        frames = wav_file.readframes(num_frames)

    audio_frame = rtc.AudioFrame(
        data=frames,
        sample_rate=sample_rate,
        num_channels=num_channels,
        samples_per_channel=num_frames
    )

    async def audio_generator():
        yield audio_frame

    await self.session.say("Playing audio", audio=audio_generator())
You can trigger this logic from within your llm_node
when conditions are met (such as specific text being detected in a chunk).
Alternatively, for more advanced control (background playback, stop, loop), the BackgroundAudioPlayer
class can be used. Let me know if you want the details for that method.
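If you'd like a head start on that, here's a rough sketch of what it might look like (this assumes the v1.x agents API; start_waiting_audio and the ctx/session arguments are placeholders for your own job context and agent session, so double-check the exact signatures against the current docs):

from livekit.agents import AudioConfig, BackgroundAudioPlayer

async def start_waiting_audio(ctx, session):
    # Create the player; it can also take ambient_sound / thinking_sound defaults
    background_audio = BackgroundAudioPlayer()
    # Attach it to the room and the running agent session
    await background_audio.start(room=ctx.room, agent_session=session)
    # play() returns a handle you can keep around to stop playback later
    handle = background_audio.play(AudioConfig("waiting_message_hindi.wav", volume=0.8))
    return handle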
Sources: basics/playing_audio.py | Agent speech and audio | LiveKit Docs | examples/voice_agents/error_callback.py | meditation_assistant/README.md
able-jackal-84140
07/13/2025, 4:18 PM
tall-airline-16756
07/13/2025, 4:18 PM
Here's how you can play a custom audio file from within your llm_node using session.say() in a LiveKit agent. It's a neat trick for replacing part of your output stream with a custom audio response when a specific phrase pops up.
import logging
import wave
from pathlib import Path

from livekit import rtc
from livekit.agents import Agent, llm

logger = logging.getLogger("agent")

# Defined as a method on your Agent subclass
async def llm_node(self, chat_ctx: llm.ChatContext, tools: list, model_settings):
    async for chunk in Agent.default.llm_node(self, chat_ctx, tools, model_settings):
        # Skip chunks without text content (e.g. tool calls) before matching
        if chunk.delta and chunk.delta.content and "कृपया" in chunk.delta.content:
            # Log for debugging
            logger.info("\n\n\n TOOL IS HAPPENING \n\n\n")

            # Path to your audio file
            audio_path = Path(__file__).parent / "waiting_message_hindi.wav"

            # Read and prepare the audio file
            with wave.open(str(audio_path), 'rb') as wav_file:
                num_channels = wav_file.getnchannels()
                sample_rate = wav_file.getframerate()
                samples_per_channel = wav_file.getnframes()
                frames = wav_file.readframes(samples_per_channel)

            audio_frame = rtc.AudioFrame(
                data=frames,
                sample_rate=sample_rate,
                num_channels=num_channels,
                samples_per_channel=samples_per_channel
            )

            # Create an async generator for the frame
            async def audio_generator():
                yield audio_frame

            # Say the custom audio
            await self.session.say("Playing audio", audio=audio_generator())
            continue  # Optionally continue or not, based on your flow

        logger.info(f"This is the current chunks coming: {chunk}")
        yield chunk
• Here's the deal: this code checks each chunk for the trigger phrase कृपया. When it matches, it reads and plays the custom WAV file using session.say(). The custom audio will replace the default text-to-speech.
• Just swap out waiting_message_hindi.wav with your own audio file path.
• Feel free to tweak the logic to fit your specific needs or add more triggers and conditions.
Want to explore background playback or more advanced audio control? I'm happy to help you out!
able-jackal-84140
07/13/2025, 4:21 PM
tall-airline-16756
07/13/2025, 4:22 PM
Having trouble with session.say(audio=...)? No worries, I've got some friendly troubleshooting tips to help you out!
• TTS Plugin Required: Let's first make sure you've got a Text-to-Speech plugin set up. Think of it like having the right microphone for a podcast: the say() method needs its TTS model to work. If you don't have one, you'll want to use generate_reply() or double-check that your session's TTS is fully ready to go.
• Valid WAV Format: Audio can be a bit picky! Your WAV file needs to match the right specs, typically 48kHz, mono audio. It's like making sure your favorite playlist is set to the right audio settings. (There's a quick validation sketch right after this list.)
• Audio Frame Construction: Here's a quick way to set up your audio frame that should do the trick:
  audio_frame = rtc.AudioFrame(
      data=frames,
      sample_rate=sample_rate,    # ideally 48000
      num_channels=num_channels,  # ideally 1
      samples_per_channel=wav_file.getnframes()
  )
• Session State: Just a quick check: is your agent session up and running smoothly? Make sure it's not stuck or in an error state.
• Queue/Buffers: Seeing some audio hiccups? Try increasing your audio source queue size to 1000ms or larger. It's like giving your audio a bit more breathing room.
• Logging: Take a peek at those agent and backend logs. They're like detective notes that can help us track down any sneaky audio issues.
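By the way, if you want to sanity-check a WAV file before handing it to session.say(), here's a tiny sketch using Python's built-in wave module (check_wav is just an illustrative helper name, not part of the LiveKit API):

import wave

def check_wav(path: str) -> None:
    # Print the specs that matter for session.say(audio=...)
    with wave.open(path, 'rb') as wav_file:
        channels = wav_file.getnchannels()
        rate = wav_file.getframerate()
        width = wav_file.getsampwidth()
        print(f"{path}: {channels} channel(s), {rate} Hz, {width * 8}-bit")
        if channels != 1:
            print("warning: expected mono (1 channel) audio")
        if rate != 48000:
            print("warning: expected a 48kHz sample rate")
        if width != 2:
            print("warning: rtc.AudioFrame expects 16-bit PCM samples")

check_wav("waiting_message_hindi.wav")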
Could you help me confirm a few things?
• Is your session's TTS plugin active and ready to go?
• Is your audio file mono and 48kHz?
• Do you have any logs from when you tried to play the audio?
Don't worry—we'll get to the bottom of this together!
Sources: Speech & Audio Processing | livekit/livekit_composite | DeepWiki | Agent speech and audio | LiveKit Docs | tests/test_tts.py | basics/playing_audio.py | agent_extensions/agent_extensions/utils/wav_player.py
able-jackal-84140
07/13/2025, 4:23 PM
tall-airline-16756
07/13/2025, 4:23 PM
able-jackal-84140
07/13/2025, 4:31 PM
tall-airline-16756
07/13/2025, 4:31 PM