if i am joining a room in which agent and a teleph...
# ask-ai
b
If I join a room in which an agent and a telephony participant are already present, the agent is not able to listen to the new participant who joined via web. It only listens to the telephony participant who was already present. How can I make the agent able to listen to the new participant?
t
Thanks for reaching out, Aman. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Hey there! To make sure the agent can hear the new participant joining via the web, let's double-check a few things together:
• Audio Rendering for Web Participants: Make sure the web participant’s audio is coming through clearly. You can use LiveKit’s `RoomAudioRenderer` to automatically play all audio tracks in the room, or if you prefer, add audio tracks manually using `AudioTrack` components or custom `<audio>` tags.
• Telephony Integration: It’s important to confirm that the telephony participant’s audio is properly connected and that SIP participants can exchange audio smoothly with everyone else in the room.
• Audio Context Management: Check that audio contexts are set up and managed correctly. LiveKit takes care of suspending and resuming audio contexts automatically and sets them up for both local and remote participants when `webAudioMix` is enabled.
• Audio Output Device Management: Make sure the agent’s audio output device is configured right so they can hear all participants clearly.
If you’re still running into issues, try verifying that the audio tracks from the new web participant are reaching the server and that playback is properly starting for the agent. Feel free to reach out if you want me to walk you through any of these steps. I'm here to help!
Sources: lib/src/core/room.dart | Screen sharing | LiveKit Docs | Screen sharing | LiveKit Docs | src/room/Room.ts | LiveKit Docs
b
The telephony participant can hear the web participant, and the web participant can also hear the telephony participant, but the agent is only responding to the telephony participant.
t
It looks like the issue is coming from how the agent's `RoomInputOptions` is set up. If the `participant_kinds` parameter only includes SIP participants (those using telephony), the agent won’t be able to hear web participants. To fix this, just make sure your `RoomInputOptions` includes both `PARTICIPANT_KIND_SIP` (for telephony users) and `PARTICIPANT_KIND_STANDARD` (for web users). Here’s a quick example to help you out:
```python
await session.start(
    agent=MyAgent(),
    room=ctx.room,
    room_input_options=RoomInputOptions(
        participant_kinds=[
            rtc.ParticipantKind.PARTICIPANT_KIND_SIP,      # Telephony participants
            rtc.ParticipantKind.PARTICIPANT_KIND_STANDARD,  # Web participants
        ]
    ),
)
```
By the way, if you leave out the `participant_kinds` parameter entirely, it defaults to including both types of participants. If you want to chat more about this or need a hand with anything else, just let me know. I’m here to help!
Sources: Using Noise Cancellation for SIP and WebRTC Participants | livekit-agents/livekit/agents/voice/room_io/room_io.py | Room & Participant Management | livekit/livekit_composite | DeepWiki
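To make the filtering idea concrete, here is a minimal sketch of how a list of accepted kinds narrows down which participants are eligible. It is only a conceptual model: `Kind`, `Participant`, and `accepted_identities` are hypothetical stand-ins written for this illustration, not LiveKit classes or LiveKit's actual implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Hypothetical stand-in for the participant kinds discussed above."""
    STANDARD = "standard"  # web participants
    SIP = "sip"            # telephony participants
    AGENT = "agent"        # other agents


@dataclass(frozen=True)
class Participant:
    """Hypothetical stand-in for a room participant."""
    identity: str
    kind: Kind


def accepted_identities(participants, participant_kinds):
    """Return the identities whose kind is in the accepted list, in room order."""
    allowed = set(participant_kinds)
    return [p.identity for p in participants if p.kind in allowed]


room = [
    Participant("phone-user", Kind.SIP),
    Participant("web-user", Kind.STANDARD),
    Participant("other-agent", Kind.AGENT),
]

# A SIP-only filter leaves the web participant out.
sip_only = accepted_identities(room, [Kind.SIP])
# Listing both kinds makes the web participant eligible as well.
both = accepted_identities(room, [Kind.SIP, Kind.STANDARD])
```

In this model the web participant only shows up in `both`, which is the situation the explicit two-kind list is meant to guarantee.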
b
```python
agent = Agent(
    instructions=prompts.prompt2,
    tools=[get_current_time, tools.end_call],
)

# Configure the voice processing pipeline optimized for telephony
session = AgentSession(
    # Voice Activity Detection
    llm=google.beta.realtime.RealtimeModel(
        model="gemini-2.0-flash-exp",
        voice="Aoede",
        temperature=0.8,
        instructions="start with a greeting",
    )
)

# Start the agent session
await session.start(agent=agent, room=ctx.room)
```
Above is my current setting.
t
It looks like `RoomInputOptions` isn’t explicitly set in your code. Leaving out `participant_kinds` should already accept both kinds, but to rule the filter out as the cause, you can explicitly include both web and telephony participants in your session configuration. Here’s a friendly tweak you can try in your code:
```python
agent = Agent(
    instructions=prompts.prompt2,
    tools=[get_current_time, tools.end_call]
)

# Configure the voice processing pipeline optimized for telephony
session = AgentSession(
    llm=google.beta.realtime.RealtimeModel(
        model="gemini-2.0-flash-exp",
        voice="Aoede",
        temperature=0.8,
        instructions="start with a greeting",
    )
)

# Start the agent session with RoomInputOptions
await session.start(
    agent=agent,
    room=ctx.room,
    room_input_options=RoomInputOptions(
        participant_kinds=[
            rtc.ParticipantKind.PARTICIPANT_KIND_SIP,      # Telephony participants
            rtc.ParticipantKind.PARTICIPANT_KIND_STANDARD,  # Web participants
        ]
    )
)
```
This way, the agent will be able to listen to both web and telephony participants without missing a thing. If anything’s unclear or you want to chat more about this, just let me know—I’m here to help!
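If you'd like to confirm that both participants are actually visible from the agent's side, a small logging helper along these lines may help. This is a sketch under assumptions: it expects a mapping of identity to participant objects that expose `identity` and `kind` attributes (in the agent that would presumably be `ctx.room.remote_participants`), and `summarize_participants` is a hypothetical helper name. The demo data below uses plain stand-in objects so the snippet runs on its own.

```python
from types import SimpleNamespace


def summarize_participants(remote_participants):
    """Return sorted (identity, kind) pairs for a mapping of participants.

    Works with any mapping whose values expose `identity` and `kind`.
    """
    return sorted((p.identity, p.kind) for p in remote_participants.values())


# Stand-in data for demonstration only; real participants come from the room.
demo_room = {
    "phone-user": SimpleNamespace(identity="phone-user", kind="SIP"),
    "web-user": SimpleNamespace(identity="web-user", kind="STANDARD"),
}

for identity, kind in summarize_participants(demo_room):
    print(f"{identity}: {kind}")
```

Calling the helper once after the session starts, and again after the web participant joins, shows whether the new participant ever appears in the agent's view of the room.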
b
Also give the import statements for `rtc` and `RoomInputOptions`.
t
Here are the import statements you'll need for `rtc` and `RoomInputOptions`:
```python
from livekit import rtc
from livekit.agents import RoomInputOptions
```
Just make sure to include these at the top of your script so everything sets up smoothly. If you have any questions or want me to walk you through it, feel free to ask!
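If an import fails, a quick check like the one below can show whether the module path resolves in your environment. It is a minimal sketch using only the standard library; `module_available` is a hypothetical helper name, and the module names in the loop assume the agents package is imported as `livekit.agents` (it is published as `livekit-agents`).

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located without importing the target module itself."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example `livekit`) is not installed.
        return False


# Report which of the modules used in the snippets above can be found.
for name in ("livekit.rtc", "livekit.agents"):
    print(name, "found" if module_available(name) else "missing")
```

A "missing" result points to an installation or spelling problem in the import path, not to the session configuration.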
b
It's still not working.
t
I don't have the answer you're looking for. You could also try asking your question: • in one of the other Slack channels or • to https://deepwiki.com/livekit/livekit_composite which is trained on all LiveKit source code If you find the answer, please post it here to help others!