How to add thinking audio only when tool calling | LiveKit Community #ask-ai

narrow-answer-89593

05/08/2025, 4:11 PM

how to add thinking audio only when tool calling

tall-airline-16756

05/08/2025, 4:11 PM

Thanks for reaching out, Sebastian. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.

tall-airline-16756

05/08/2025, 4:12 PM

To add "thinking" audio that plays only during tool calls, you can use the BackgroundAudioPlayer with a thinking_sound configuration. Below is an example implementation:

from livekit import agents
from livekit.agents import AgentSession, AudioConfig, BackgroundAudioPlayer, BuiltinAudioClip

async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    session = AgentSession(
        # ... stt, llm, tts, vad, turn_detection, etc.
    )

    await session.start(
        room=ctx.room,
        # ... agent, etc.
    )

    # Play typing sounds while the session reports the "thinking" state.
    background_audio = BackgroundAudioPlayer(
        thinking_sound=[
            AudioConfig(BuiltinAudioClip.KEYBOARD_TYPING, volume=0.8),
            AudioConfig(BuiltinAudioClip.KEYBOARD_TYPING2, volume=0.7),
        ],
    )
    await background_audio.start(room=ctx.room, agent_session=session)

The BackgroundAudioPlayer will automatically play the configured thinking sounds when the agent's state changes to "thinking" during tool calls, and will stop the sounds when the state changes back.

Sources: Handling RAG Delays in Voice Agents | LiveKit Docs | External data and RAG | LiveKit Docs

narrow-answer-89593

05/08/2025, 4:12 PM

but only when tool calling

tall-airline-16756

05/08/2025, 4:12 PM

To ensure the thinking audio plays only during tool calls, you can use the BackgroundAudioPlayer in conjunction with the agent session. The agent session tracks the agent's state, including when it is "thinking" during tool calls. Below is an example implementation:

from livekit import agents
from livekit.agents import AgentSession, AudioConfig, BackgroundAudioPlayer, BuiltinAudioClip

async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    session = AgentSession(
        # ... stt, llm, tts, vad, turn_detection, etc.
    )

    await session.start(
        room=ctx.room,
        # ... agent, etc.
    )

    # Play typing sounds while the session reports the "thinking" state.
    background_audio = BackgroundAudioPlayer(
        thinking_sound=[
            AudioConfig(BuiltinAudioClip.KEYBOARD_TYPING, volume=0.8),
            AudioConfig(BuiltinAudioClip.KEYBOARD_TYPING2, volume=0.7),
        ],
    )
    await background_audio.start(room=ctx.room, agent_session=session)

The BackgroundAudioPlayer will automatically play the thinking sound when the agent's state changes to "thinking" during tool calls, and stop the sound when the state changes to something else.

Sources: External data and RAG | LiveKit Docs | Tree View: | dse-workbench
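One way to check when the session is actually in the "thinking" state is to record its state transitions. The following is a minimal sketch, assuming the session emits an `agent_state_changed` event whose payload has `old_state` and `new_state` attributes (please confirm this against your installed livekit-agents version). `StateRecorder` and `FakeEvent` are hypothetical names introduced here; `FakeEvent` only stands in for the real event so the snippet runs without LiveKit.

```python
from dataclasses import dataclass, field


@dataclass
class FakeEvent:
    """Stand-in for the session's state-change event (hypothetical)."""
    old_state: str
    new_state: str


@dataclass
class StateRecorder:
    """Collects (old_state, new_state) pairs so they can be inspected later."""
    transitions: list = field(default_factory=list)

    def on_state_changed(self, ev) -> None:
        # Store and print every transition the session reports.
        self.transitions.append((ev.old_state, ev.new_state))
        print(f"agent state: {ev.old_state} -> {ev.new_state}")

    def thinking_count(self) -> int:
        """Number of times the agent entered the "thinking" state."""
        return sum(1 for _, new in self.transitions if new == "thinking")


recorder = StateRecorder()
# In an agent, registration would look like this (assumed event name):
#   session.on("agent_state_changed", recorder.on_state_changed)
recorder.on_state_changed(FakeEvent("listening", "thinking"))
recorder.on_state_changed(FakeEvent("thinking", "speaking"))
```

If "thinking" shows up on turns where no tool ran, the thinking_sound will play on those turns as well, because it follows the agent state rather than the tool call itself.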

narrow-answer-89593

05/08/2025, 4:13 PM

yeah but it plays the audio even when not tool calling

tall-airline-16756

05/08/2025, 4:13 PM

To ensure the thinking audio plays only during tool calls, you need to configure the BackgroundAudioPlayer to specifically track the agent's state during tool calls. The thinking_sound should only activate when the agent's state is set to "thinking" during these calls. Below is the implementation:

from livekit import agents
from livekit.agents import AgentSession, AudioConfig, BackgroundAudioPlayer, BuiltinAudioClip

async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    session = AgentSession(
        # ... stt, llm, tts, vad, turn_detection, etc.
    )

    await session.start(
        room=ctx.room,
        # ... agent, etc.
    )

    # Play typing sounds while the session reports the "thinking" state.
    background_audio = BackgroundAudioPlayer(
        thinking_sound=[
            AudioConfig(BuiltinAudioClip.KEYBOARD_TYPING, volume=0.8),
            AudioConfig(BuiltinAudioClip.KEYBOARD_TYPING2, volume=0.7),
        ],
    )
    await background_audio.start(room=ctx.room, agent_session=session)

This setup ties the thinking audio to the "thinking" state, which occurs during tool calls. If the audio still plays outside of tool calls, please verify the agent's state transitions and confirm that the state is "thinking" only during tool calls.

Sources: External data and RAG | LiveKit Docs | Handling RAG Delays in Voice Agents | LiveKit Docs
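If the sound should be limited strictly to tool calls, another option is to leave thinking_sound unset and start and stop the clip inside each tool. The following is a minimal sketch, assuming the player exposes a `play(...)` method that accepts a `loop` flag and returns a handle with `stop()` (this should be checked against the BackgroundAudioPlayer API in your version). `tool_sound`, `FakePlayer`, `FakeHandle`, and `lookup_weather` are hypothetical names; the fake player exists only so the snippet runs without LiveKit.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeHandle:
    """Stand-in for the handle returned by the player (hypothetical)."""

    def __init__(self) -> None:
        self.stopped = False

    def stop(self) -> None:
        self.stopped = True


class FakePlayer:
    """Stand-in for a background audio player (hypothetical)."""

    def __init__(self) -> None:
        self.handles = []

    def play(self, clip, *, loop=False):
        handle = FakeHandle()
        self.handles.append(handle)
        return handle


@asynccontextmanager
async def tool_sound(player, clip):
    """Play `clip` for the duration of the block, then stop it."""
    handle = player.play(clip, loop=True)
    try:
        yield handle
    finally:
        # Runs on success and on error, so the sound never outlives the tool.
        handle.stop()


async def lookup_weather(player, city: str) -> str:
    """Example tool body: the sound plays only while the work is running."""
    async with tool_sound(player, "keyboard-typing"):
        await asyncio.sleep(0.01)  # placeholder for the slow call
        return f"sunny in {city}"


player = FakePlayer()
result = asyncio.run(lookup_weather(player, "Berlin"))
```

In a real agent, the same context manager could wrap the body of each function tool, with the shared player passed in or stored where the tools can reach it. The sound then depends on the tool running, not on the agent state.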
