How can I play a custom audio within LLM node?
# ask-ai
a
How can I play a custom audio within LLM node?
```python
async def llm_node(
    self,
    chat_ctx: llm.ChatContext,
    tools: list[FunctionTool],
    model_settings: ModelSettings,
) -> AsyncIterable[llm.ChatChunk]:
    # Insert custom preprocessing here
    async for chunk in Agent.default.llm_node(self, chat_ctx, tools, model_settings):
        # Insert custom postprocessing here
        # Guard against chunks with no text delta before checking the content
        if chunk.delta and chunk.delta.content and "कृपया" in chunk.delta.content:
            logger.info("\n\n\n TOOL IS HAPPENING \n\n\n")
            # await self.session.say("Playing audio", audio=audio_generator("waiting_message_hindi.wav"))
            # continue

        logger.info(f"This is the current chunks coming: {chunk}")
        yield chunk
```
t
Thanks for reaching out, Aditya. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
You can play a custom audio file within an LLM node by using the `session.say()` method along with a custom `audio` parameter. This approach allows you to bypass TTS and directly send a pre-synthesized audio file (such as a WAV file) to the user. Here's a step-by-step approach suitable for your setup:
```python
async def play_audio_file(self):
    audio_path = Path(__file__).parent / "audio.wav"
    with wave.open(str(audio_path), "rb") as wav_file:
        num_channels = wav_file.getnchannels()
        sample_rate = wav_file.getframerate()
        n_frames = wav_file.getnframes()  # capture before the file is closed
        frames = wav_file.readframes(n_frames)

    audio_frame = rtc.AudioFrame(
        data=frames,
        sample_rate=sample_rate,
        num_channels=num_channels,
        samples_per_channel=n_frames,
    )

    async def audio_generator():
        yield audio_frame

    await self.session.say("Playing audio", audio=audio_generator())
```
You can trigger this logic from within your `llm_node` when conditions are met (such as specific text being detected in a chunk). Alternatively, for more advanced control (background playback, stop, loop), the `BackgroundAudioPlayer` class can be used. Let me know if you want the details for that method.

Sources: basics/playing_audio.py | Agent speech and audio | LiveKit Docs | examples/voice_agents/error_callback.py | meditation_assistant/README.mdoc
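As a pointer in the meantime, a minimal `BackgroundAudioPlayer` sketch might look like the following. This assumes livekit-agents 1.x; `ctx`, `session`, and the file path are placeholders from a standard agent entrypoint, so check the current docs for the exact names in your SDK version:

```python
from livekit.agents import AudioConfig, BackgroundAudioPlayer, BuiltinAudioClip

async def entrypoint(ctx):
    session = ...  # your AgentSession, created and started as usual (placeholder)

    background_audio = BackgroundAudioPlayer(
        # Optional ambient sound shipped with the SDK
        ambient_sound=AudioConfig(BuiltinAudioClip.OFFICE_AMBIENCE, volume=0.8),
    )
    await background_audio.start(room=ctx.room, agent_session=session)

    # Play a custom file; the returned handle lets you stop or loop it
    handle = background_audio.play("waiting_message_hindi.wav", loop=True)
    # ... later, when the wait is over:
    handle.stop()
```

This runs the clip alongside (not instead of) the agent's speech, which is what makes it suitable for hold music or waiting messages.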
a
give me the complete example
t
Hey there! I've got a cool example for you on how to play a custom WAV audio file right inside your `llm_node` using `session.say()` in a LiveKit agent. It's a neat trick for replacing part of your output stream with a custom audio response when a specific phrase pops up.
```python
import logging
import wave
from pathlib import Path

from livekit import rtc
from livekit.agents import Agent, llm

logger = logging.getLogger(__name__)

async def llm_node(self, chat_ctx: llm.ChatContext, tools: list, model_settings):
    async for chunk in Agent.default.llm_node(self, chat_ctx, tools, model_settings):
        # Guard against chunks with no text delta before checking the content
        if chunk.delta and chunk.delta.content and "कृपया" in chunk.delta.content:
            # Log for debugging
            logger.info("\n\n\n TOOL IS HAPPENING \n\n\n")

            # Path to your audio file
            audio_path = Path(__file__).parent / "waiting_message_hindi.wav"

            # Read and prepare the audio file (capture the frame count
            # before the file is closed)
            with wave.open(str(audio_path), "rb") as wav_file:
                num_channels = wav_file.getnchannels()
                sample_rate = wav_file.getframerate()
                samples_per_channel = wav_file.getnframes()
                frames = wav_file.readframes(samples_per_channel)

            audio_frame = rtc.AudioFrame(
                data=frames,
                sample_rate=sample_rate,
                num_channels=num_channels,
                samples_per_channel=samples_per_channel,
            )

            # Create an async generator for the frame
            async def audio_generator():
                yield audio_frame

            # Say the custom audio
            await self.session.say("Playing audio", audio=audio_generator())
            continue  # Skip yielding this chunk; adjust to fit your flow

        logger.info(f"This is the current chunks coming: {chunk}")
        yield chunk
```
• Here's the deal: this code checks each chunk for the trigger phrase `कृपया` (Hindi for "please"). When it matches, it reads and plays the custom WAV file using `session.say()`. The custom audio will replace the default text-to-speech.
• Just swap out `waiting_message_hindi.wav` with your own audio file path.
• Feel free to tweak the logic to fit your specific needs or add more triggers and conditions.

Want to explore background playback or more advanced audio control? I'm happy to help you out!
a
This is not working the audio is not being played
t
Hey there! Having trouble getting your audio to play with `session.say(audio=...)`? No worries, I've got some friendly troubleshooting tips to help you out!

• TTS Plugin Required: let's first make sure you've got a Text-to-Speech plugin set up. Think of it like having the right microphone for a podcast: the `say()` method needs its TTS model to work. If you don't have one, you'll want to use `generate_reply()` or double-check that your session's TTS is fully ready to go.
• Valid WAV Format: audio can be a bit picky! Your WAV file needs to match the right specs, typically 48kHz mono. It's like making sure your favorite playlist is set to the right audio settings.
• Audio Frame Construction: here's a quick way to set up your audio frame that should do the trick:
```python
audio_frame = rtc.AudioFrame(
    data=frames,
    sample_rate=sample_rate,    # ideally 48000
    num_channels=num_channels,  # ideally 1
    samples_per_channel=n_frames,  # frame count read while the WAV was still open
)
```
• Session State: just a quick check, is your agent session up and running smoothly? Make sure it's not stuck or in an error state.
• Queue/Buffers: seeing some audio hiccups? Try increasing your audio source queue size to 1000ms or larger. It's like giving your audio a bit more breathing room.
• Logging: take a peek at those agent and backend logs. They're like detective notes that can help us track down any sneaky audio issues.

Could you help me confirm a few things?
• Is your session's TTS plugin active and ready to go?
• Is your audio file mono and 48kHz?
• Do you have any logs from when you tried to play the audio?

Don't worry, we'll get to the bottom of this together!

Sources: Speech & Audio Processing | livekit/livekit_composite | DeepWiki | Agent speech and audio | LiveKit Docs | tests/test_tts.py | basics/playing_audio.py | agent_extensions/agent_extensions/utils/wav_player.py
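To check the channel-count and sample-rate points above, a small helper using only Python's standard `wave` module can report a file's properties (this helper is illustrative, not part of the LiveKit SDK):

```python
import wave

def describe_wav(path: str) -> dict:
    """Report the WAV properties that matter for playback: channels,
    sample rate, sample width, and frame count."""
    with wave.open(path, "rb") as w:
        return {
            "channels": w.getnchannels(),
            "sample_rate": w.getframerate(),
            "sample_width_bytes": w.getsampwidth(),
            "frames": w.getnframes(),
        }
```

Printing `describe_wav("waiting_message_hindi.wav")` before you build the `AudioFrame` quickly confirms whether the file is really mono and 48kHz.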
a
I am passing the TTS through session
t
Hey there! I'd love to help you troubleshoot this. Could you double-check if your WAV audio file is mono (just one channel) and running at 48kHz? And while you're at it, take a quick peek at your logs - are there any warning or error messages popping up when you try to play the custom audio? I'm here to walk you through this!
a
It's mono
t
Hey there! I'm wondering if you've checked the sample rate of your WAV file. Is it set to 48kHz (48000 Hz)? Also, have you taken a look at your logs to see if any error or warning messages pop up when you're trying to play the audio? I'm happy to help you troubleshoot this!
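If the file turns out not to be 48kHz mono, one option is converting it up front. Below is a rough stdlib-only sketch for 16-bit PCM files, downmixing by channel averaging and resampling with naive linear interpolation; for production audio a proper resampler (e.g. ffmpeg) will sound better:

```python
import struct
import wave

def to_mono_48k(src_path: str, dst_path: str, target_rate: int = 48000) -> None:
    """Convert a 16-bit PCM WAV to mono at target_rate (naive resampling)."""
    with wave.open(src_path, "rb") as src:
        assert src.getsampwidth() == 2, "expects 16-bit PCM"
        n_channels = src.getnchannels()
        rate = src.getframerate()
        raw = src.readframes(src.getnframes())

    samples = struct.unpack(f"<{len(raw) // 2}h", raw)
    # Downmix: average the interleaved channel samples of each frame
    mono = [
        sum(samples[i:i + n_channels]) // n_channels
        for i in range(0, len(samples), n_channels)
    ]
    # Resample to the target rate with linear interpolation
    out_len = max(1, round(len(mono) * target_rate / rate))
    resampled = []
    for i in range(out_len):
        pos = i * (len(mono) - 1) / max(1, out_len - 1)
        lo = int(pos)
        hi = min(lo + 1, len(mono) - 1)
        frac = pos - lo
        resampled.append(int(mono[lo] * (1 - frac) + mono[hi] * frac))

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(target_rate)
        dst.writeframes(struct.pack(f"<{len(resampled)}h", *resampled))
```

Run it once on your source file (e.g. `to_mono_48k("input.wav", "waiting_message_hindi.wav")`) and then feed the converted file to the `AudioFrame` code from earlier in this thread.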