my simple transcriber agent stops working for all ...
# ask-ai
n
My simple transcriber agent stops working for all users after some time and works only for the last one. Very occasionally (roughly once every 10 minutes) it transcribes one word for someone else.
g
It seems likely that your transcriber agent is running into trouble handling multiple users simultaneously. According to LiveKit documentation, LiveKit's architecture is designed around rooms, participants, and tracks, where each participant can publish and subscribe to tracks. Handling audio tracks from several different participants can be challenging if the agent is not configured to manage multiple subscriptions effectively. My understanding is that the agent might be subscribing to only one audio track at a time, which would explain why it only transcribes for the last user. This behavior is consistent with the LiveKit Slack discussion, where it was mentioned that the MultimodalAgent can only subscribe to a single audio track at a time. To address this, you might need to implement a mechanism to switch between active speakers, or ensure that only one participant is speaking at a time.

Would you like more detailed guidance on how to modify your agent to handle multiple users more effectively?

Sources: LiveKit documentation, LiveKit Slack discussion
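In the meantime, a quick way to check whether the agent really is holding only one subscription at a time is to log subscription changes. Here is a minimal diagnostic sketch, assuming your agent uses the `livekit.rtc` Python SDK; `add_subscription_logging` is a hypothetical helper name, and the room connection itself is omitted:

```python
import logging

from livekit import rtc

logger = logging.getLogger("transcriber-debug")

def add_subscription_logging(room: rtc.Room) -> None:
    """Log every change in the agent's audio subscriptions (hypothetical helper)."""

    @room.on("track_subscribed")
    def on_track_subscribed(track: rtc.Track,
                            publication: rtc.RemoteTrackPublication,
                            participant: rtc.RemoteParticipant) -> None:
        # Fired when the agent actually starts receiving a participant's track.
        logger.info("subscribed to %s track from %s", track.kind, participant.identity)

    @room.on("track_unsubscribed")
    def on_track_unsubscribed(track: rtc.Track,
                              publication: rtc.RemoteTrackPublication,
                              participant: rtc.RemoteParticipant) -> None:
        # Fired when a subscription is dropped.
        logger.info("unsubscribed from %s track from %s", track.kind, participant.identity)
```

If the log shows an unsubscribe for an earlier user each time a new user joins, the agent is replacing its single subscription rather than adding one.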
n
@millions-hair-24956 yes
g
To address the issue of your transcriber agent only working for the last user, you can consider the following steps:

1. Active Speaker Detection: Implement active speaker detection to dynamically switch the audio track the agent subscribes to. LiveKit provides active speaker detection, which identifies which participant is currently speaking; you can use this information to switch the subscription to the active speaker's audio track.
2. Selective Subscription: Disable `autoSubscribe` and manually control which tracks the agent subscribes to. This can be done by setting `autoSubscribe` to false when connecting to the room and then subscribing to tracks based on your own logic. This approach is detailed in the LiveKit documentation.
3. Track Management: Ensure that your agent is capable of handling multiple track subscriptions. You might need to modify the agent to manage multiple audio streams concurrently, possibly by using a queue or buffer to process audio data from different participants.
4. Testing and Debugging: Add logging to your agent to track which audio tracks are being subscribed to and processed. This can help identify whether the agent is correctly switching between participants and whether there are any issues with the subscription logic.

Here's a basic example of how you might implement active speaker detection and selective subscription in JavaScript:
```javascript
import { Room, RoomEvent, Track } from 'livekit-client';

// Placeholders: use your own server URL and access token.
const url = 'wss://your_host';
const token = 'your_token';

async function setupRoom() {
  // Connect with autoSubscribe disabled so we control subscriptions manually.
  const room = new Room();
  await room.connect(url, token, { autoSubscribe: false });

  // Subscribe to the current active speaker's microphone track.
  room.on(RoomEvent.ActiveSpeakersChanged, (speakers) => {
    if (speakers.length > 0) {
      const activeSpeaker = speakers[0];
      const publication = activeSpeaker.getTrackPublication(Track.Source.Microphone);
      if (publication && !publication.isSubscribed) {
        publication.setSubscribed(true);
      }
    }
  });

  // Subscribe to new audio tracks as participants publish them.
  room.on(RoomEvent.TrackPublished, (publication, participant) => {
    if (publication.kind === Track.Kind.Audio) {
      publication.setSubscribed(true);
    }
  });
}

setupRoom();
```

This code sets up a room connection with `autoSubscribe` disabled and subscribes to the active speaker's audio track. It also subscribes to new audio tracks as they are published. Would you like further assistance with implementing these changes?

Sources: LiveKit documentation
n
@millions-hair-24956 do you have an example for Python?
g
Certainly! Here's an example of how you might implement active speaker detection and selective subscription in Python using the LiveKit SDK:
```python
import asyncio

from livekit import rtc

async def main():
    # Connect to the room with auto_subscribe disabled so we control subscriptions.
    # Replace the URL and token placeholders with your own values.
    room = rtc.Room()
    await room.connect(
        'wss://your_host',
        'your_token',
        options=rtc.RoomOptions(auto_subscribe=False),
    )

    # Handle active speaker changes: subscribe to the active speaker's microphone track.
    @room.on('active_speakers_changed')
    def on_active_speakers_changed(speakers):
        if not speakers:
            return
        active_speaker = speakers[0]
        for publication in active_speaker.track_publications.values():
            if (
                isinstance(publication, rtc.RemoteTrackPublication)
                and publication.source == rtc.TrackSource.SOURCE_MICROPHONE
                and publication.track is None  # not subscribed yet
            ):
                publication.set_subscribed(True)

    # Handle new track publications: subscribe to audio tracks as they appear.
    @room.on('track_published')
    def on_track_published(publication, participant):
        if publication.kind == rtc.TrackKind.KIND_AUDIO:
            publication.set_subscribed(True)

    # Keep the connection alive until cancelled or interrupted.
    try:
        while True:
            await asyncio.sleep(1)
    except (asyncio.CancelledError, KeyboardInterrupt):
        print("Disconnecting from room...")
        await room.disconnect()

if __name__ == '__main__':
    asyncio.run(main())
```

Explanation:

1. Connection: The room is connected with `auto_subscribe=False`, allowing manual control over track subscriptions.
2. Active Speaker Detection: The `on_active_speakers_changed` handler is used to detect the active speaker, and the agent subscribes to that speaker's microphone track.
3. Track Publication: The `on_track_published` handler subscribes to new audio tracks as they are published.

This setup allows the agent to dynamically switch subscriptions based on the active speaker, ensuring that it processes audio from the correct participant. Would you like more details or further assistance with this implementation?

Sources: LiveKit documentation
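One additional point that goes slightly beyond the snippet above: subscribing to a track is only half of the work for a transcriber. The agent also needs to read audio from each participant independently; if it funnels everything through a single shared stream, it can end up effectively following only the most recently subscribed user, which could explain the behavior you described. Here is a rough sketch, assuming the `livekit.rtc` Python SDK, of starting one reader task per subscribed audio track; `transcribe_frames` and `start_per_participant_readers` are hypothetical names standing in for your own STT pipeline:

```python
import asyncio

from livekit import rtc

async def transcribe_frames(identity: str, stream: rtc.AudioStream) -> None:
    # Hypothetical placeholder: feed this participant's frames into your STT engine.
    async for event in stream:
        frame = event.frame  # rtc.AudioFrame with PCM audio from this participant only
        ...  # push `frame` into the STT session associated with `identity`

def start_per_participant_readers(room: rtc.Room) -> None:
    tasks: dict[str, asyncio.Task] = {}

    @room.on("track_subscribed")
    def on_track_subscribed(track: rtc.Track,
                            publication: rtc.RemoteTrackPublication,
                            participant: rtc.RemoteParticipant) -> None:
        if track.kind == rtc.TrackKind.KIND_AUDIO and participant.identity not in tasks:
            # One AudioStream and one reader task per participant, so audio from
            # earlier users keeps flowing when new users join.
            stream = rtc.AudioStream(track)
            tasks[participant.identity] = asyncio.create_task(
                transcribe_frames(participant.identity, stream)
            )
```

You would also want to cancel a participant's task and close its stream on `track_unsubscribed` or when the participant disconnects; that cleanup is omitted here for brevity.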