# ask-ai
s
@millions-hair-24956 I just want to show the user in the VideoConference, not the AI agent. Is there a way to do it?
g
To display only the user in a VideoConference and not the AI agent, you can use selective subscription and track management. According to LiveKit Docs, you can disable `autoSubscribe` and manually control which tracks to subscribe to, which allows you to exclude the AI agent's tracks from being displayed. Would you like more detailed instructions on how to implement this? Sources: - LiveKit Docs
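A minimal sketch of the selective-subscription approach, assuming the agent joins with an identity that starts with "agent" (adjust `isAgent` to however your agent actually identifies itself); it connects with `autoSubscribe` disabled and only subscribes to publications from human participants:

```ts
import {
  RemoteParticipant,
  RemoteTrackPublication,
  Room,
  RoomEvent,
} from "livekit-client";

// Assumption: the agent's identity starts with "agent"; change this check as needed.
const isAgent = (participant: RemoteParticipant) =>
  participant.identity.startsWith("agent");

export async function connectWithoutAgent(serverUrl: string, token: string) {
  const room = new Room();

  // Subscribe only to tracks published by human participants.
  room.on(
    RoomEvent.TrackPublished,
    (publication: RemoteTrackPublication, participant: RemoteParticipant) => {
      if (!isAgent(participant)) {
        publication.setSubscribed(true);
      }
    }
  );

  await room.connect(serverUrl, token, { autoSubscribe: false });
  // Tracks that were already published before you joined can be handled the same
  // way by iterating the room's remote participants after connect.
  return room;
}
```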
s
Will it work in the case where multiple users join?
@millions-hair-24956
g
To handle multiple users joining a VideoConference while excluding the AI agent, you can use the selective subscription feature. This involves disabling the `autoSubscribe` option and manually subscribing to the tracks of specific participants. You can identify participants by their identity or metadata and choose not to subscribe to the AI agent's tracks. Would you like a step-by-step guide on how to set this up? Sources: - LiveKit Docs
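If you render the room with the React components instead, a sketch of the same idea is to filter the agent out of the track list before handing it to the layout (again assuming an "agent" identity prefix, which is an assumption about your setup):

```tsx
import { GridLayout, ParticipantTile, useTracks } from "@livekit/components-react";
import { Track } from "livekit-client";

export function HumanOnlyGrid() {
  // All camera tracks in the room...
  const tracks = useTracks([Track.Source.Camera]);

  // ...minus anything published by the agent (assumed identity prefix "agent").
  const humanTracks = tracks.filter(
    (trackRef) => !trackRef.participant.identity.startsWith("agent")
  );

  return (
    <GridLayout tracks={humanTracks}>
      <ParticipantTile />
    </GridLayout>
  );
}
```

This works for any number of human participants, since only the agent's tracks are filtered out.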
s
@millions-hair-24956 I want to hide the screen-share option. How can I do that?
g
To hide the screen share option in a LiveKit VideoConference, you can customize the `ControlBar` component. According to LiveKit Slack, the `VideoConference` prefab does not provide direct options to control the visibility of the screen share button. However, you can create a custom component by overriding the `ControlBar` to exclude the screen share control. Would you like more detailed instructions on how to implement this? Sources: - LiveKit Slack
s
@millions-hair-24956 I am not able to open the Slack link you shared. Can you share the official docs?
g
To hide the screen share option in a LiveKit VideoConference, you can customize the `ControlBar` component. According to the LiveKit ControlBar documentation, you can create a custom `ControlBar` component using the components and hooks provided by LiveKit. This allows you to exclude specific controls, such as the screen share button, by not including them in your custom implementation. Would you like a step-by-step guide on how to create a custom `ControlBar`? Sources: - LiveKit ControlBar documentation
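As a sketch of the simplest variant: recent versions of `@livekit/components-react` let the stock `ControlBar` hide individual buttons through its `controls` prop, so a fully custom bar may not even be necessary (verify the prop against the version you have installed):

```tsx
import { ControlBar } from "@livekit/components-react";

export function ControlsWithoutScreenShare() {
  // Render the standard control bar, but leave the screen-share button out.
  return <ControlBar controls={{ screenShare: false }} />;
}
```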
s
@millions-hair-24956 but this is not working with VideoConference
g
The `VideoConference` component in LiveKit is a prefab that includes a default `ControlBar` with screen share functionality. To remove the screen share option, you need to create a custom implementation of the `VideoConference` component. This involves building your own layout using the `LiveKitRoom` component and other building blocks like `GridLayout` and `ControlBar`, excluding the screen share control. Would you like more detailed instructions on how to create a custom video conference setup? Sources: - https://livekit-users.slack.com/archives/C07FY8WHGPM/p1727222199.167899 - https://livekit-users.slack.com/archives/C07FYCB7FRQ/p1730227871.901309
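A minimal sketch of such a custom layout, assuming you already fetch `serverUrl` and `token` elsewhere; it replaces the `VideoConference` prefab with `GridLayout` plus a `ControlBar` that omits screen share:

```tsx
import {
  ControlBar,
  GridLayout,
  LiveKitRoom,
  ParticipantTile,
  RoomAudioRenderer,
  useTracks,
} from "@livekit/components-react";
import { Track } from "livekit-client";

function Stage() {
  const tracks = useTracks([Track.Source.Camera]);
  return (
    <GridLayout tracks={tracks}>
      <ParticipantTile />
    </GridLayout>
  );
}

export function CustomConference({ serverUrl, token }: { serverUrl: string; token: string }) {
  return (
    <LiveKitRoom serverUrl={serverUrl} token={token} connect video audio>
      <Stage />
      <RoomAudioRenderer />
      {/* Same controls as the prefab, minus the screen-share button. */}
      <ControlBar controls={{ screenShare: false }} />
    </LiveKitRoom>
  );
}
```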
s
@millions-hair-24956 is there a way I can manage the audio and video toggles myself?
g
Yes, you can manage the audio and video toggles manually using the `TrackToggle` component in LiveKit. According to the LiveKit React Components documentation, the `TrackToggle` component allows you to control the mute and unmute state of your camera and microphone. You can use this component to create custom buttons for toggling audio and video tracks. Would you like more detailed instructions on how to implement this? Sources: - LiveKit React Components documentation
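A small sketch of custom toggle buttons built on `TrackToggle` (the wrapper markup and labels are placeholders):

```tsx
import { TrackToggle } from "@livekit/components-react";
import { Track } from "livekit-client";

export function MyMediaToggles() {
  return (
    <div>
      {/* Each TrackToggle mutes/unmutes the given source for the local participant. */}
      <TrackToggle source={Track.Source.Microphone}>Mic</TrackToggle>
      <TrackToggle source={Track.Source.Camera}>Camera</TrackToggle>
    </div>
  );
}
```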
s
@millions-hair-24956 Can we change the display of <ParticipantName></ParticipantName>?
g
Yes, you can customize the display of the `<ParticipantName>` component. According to the LiveKit React Components documentation, the `<ParticipantName>` component displays the participant's name within an HTML span element. You can override its styling or wrap it in a custom component to change its appearance. Would you like more detailed instructions on how to customize the `<ParticipantName>` component? Sources: - LiveKit React Components documentation
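For example, a sketch of restyling it inside a participant tile (the class names are placeholders for your own styles; `ParticipantName` must be rendered inside a participant context such as `ParticipantTile`):

```tsx
import { ParticipantName } from "@livekit/components-react";

export function FancyName() {
  return (
    <span className="participant-label">
      {/* ParticipantName reads the participant from the surrounding context. */}
      <ParticipantName className="participant-label__name" />
    </span>
  );
}
```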
s
is there any way I can enable recording of the whole screen @millions-hair-24956
g
To enable recording of the whole screen in LiveKit, you can use the Egress feature. According to the LiveKit Egress documentation, you can set up automatic recording of a room using Auto Egress. This involves configuring the room creation process to include recording settings, which will capture all participant tracks, including screen shares. Would you like more detailed instructions on setting up Auto Egress for screen recording? Sources: - https://livekit-users.slack.com/archives/C07FYCB7FRQ/p1728857536.918419 - https://livekit-users.slack.com/archives/C07FVFGAUKX/p1731310843.991319
s
Can you please provide a guide to integrating the Egress feature with a Next.js application, and the code changes that need to be made? @millions-hair-24956
g
To integrate the Egress feature with a Next.js application, you can follow these general steps: 1. Set Up LiveKit SDK: Install the LiveKit server SDK in your Next.js application. You can do this by running:
```bash
npm install livekit-server-sdk
```
2. Configure Environment Variables: Create a `.env.local` file in your project root with your LiveKit API key, secret, and server URL:
```bash
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret
LIVEKIT_URL=your_livekit_server_url
```
3. Create an API Route for Egress: In your Next.js application, create an API route to handle the egress setup. For example, create a file at `pages/api/startEgress.js`:
```js
import { EgressClient, EncodedFileOutput, EncodedFileType } from 'livekit-server-sdk';

   export default async function handler(req, res) {
     const { roomName } = req.body;
     const egressClient = new EgressClient(process.env.LIVEKIT_URL, process.env.LIVEKIT_API_KEY, process.env.LIVEKIT_API_SECRET);

     const output = new EncodedFileOutput({
       filepath: `/path/to/save/${roomName}.mp4`,
       fileType: EncodedFileType.MP4,
     });

     try {
       const result = await egressClient.startRoomCompositeEgress(roomName, output);
       res.status(200).json({ message: 'Egress started', result });
     } catch (error) {
       res.status(500).json({ error: error.message });
     }
   }
```
4. Trigger Egress from the Client: Create a function in your Next.js component to call the API route and start the egress:
```js
async function startEgress(roomName) {
     const response = await fetch('/api/startEgress', {
       method: 'POST',
       headers: {
         'Content-Type': 'application/json',
       },
       body: JSON.stringify({ roomName }),
     });

     const data = await response.json();
     if (response.ok) {
       console.log('Egress started:', data.result);
     } else {
       console.error('Error starting egress:', data.error);
     }
   }
```
Would you like more detailed instructions or have specific questions about any of these steps? Sources: - https://livekit-users.slack.com/archives/C07FYCB7FRQ/p1730916991.744359 - https://docs.livekit.io/home/egress/custom-template/#Deploying-your-Application Note: I wasn’t able to find highly relevant data sources, but above are a few potentially relevant links.
s
How do I get the room name? @millions-hair-24956
g
To get the room name in a LiveKit application, you can use the `Room` object's properties. According to the LiveKit Room documentation, the `name` property of the `Room` class provides the name of the room. You can access it as follows:
```js
const roomName = room.name;
console.log("Room Name:", roomName);
```
Would you like more detailed instructions on how to implement this in your application? Sources: - LiveKit Room documentation
s
yes please do share @millions-hair-24956
g
To retrieve the room name in a LiveKit application, follow these steps: 1. Initialize the Room: First, ensure you have a `Room` instance. This is typically done when you connect to a LiveKit room using the LiveKit SDK. 2. Access the Room Name: Once you have the `Room` instance, you can access the room name using the `name` property. Here's a basic example in JavaScript:
```js
import { Room } from 'livekit-client';

   async function connectToRoom(wsUrl, token) {
     const room = new Room();
     await room.connect(wsUrl, token);
     console.log('Connected to room:', room.name);
   }

   // Example usage
   const wsUrl = 'wss://your-livekit-server-url';
   const token = 'your-access-token';
   connectToRoom(wsUrl, token);
```
3. Use the Room Name: You can use the `room.name` property wherever you need to display or log the room name in your application. This approach assumes you have already set up your LiveKit server and have the necessary WebSocket URL and access token. The `Room` instance is created and connected using the `connect` method, after which the room name is accessible via the `name` property. If you need further assistance or have specific questions about integrating this into your application, feel free to ask! Sources: - LiveKit Room documentation
s
```js
const result = await egressClient.startRoomCompositeEgress(
  roomName,
  output
);
```
In this code I am getting the error "requested room does not exist", but this roomName is the one I get from the <LiveKitRoom> setup. @millions-hair-24956
g
The error "requested room does not exist" typically indicates that the room name you are using in the
startRoomCompositeEgress
function does not match any active room on the LiveKit server. Here are a few steps to troubleshoot and resolve this issue: 1. Verify Room Existence: Ensure that the room is indeed active and exists on the LiveKit server at the time you are calling
startRoomCompositeEgress
. You can use the
listRooms
method from the
RoomServiceClient
to check active rooms. 2. Check Room Name Consistency: Double-check that the
roomName
variable you are passing to
startRoomCompositeEgress
matches exactly with the room name used when creating or joining the room. Any discrepancies in spelling or case sensitivity can cause this error. 3. Room Lifecycle: Ensure that the room is not automatically closed or timed out before you attempt to start the egress. If the room has an
emptyTimeout
set, it might close if no participants are present. 4. Server Logs: Check the LiveKit server logs for any additional information or errors related to room creation or egress requests. Would you like more detailed instructions on any of these steps? Sources: - https://livekit-users.slack.com/archives/C07FVFL1H3P/p1727458126.337679 - https://livekit-users.slack.com/archives/C025KM0S1CK/p1730970200.740959 πŸ‘ 0 button πŸ‘Ž 0 button πŸ’‘ Hint: Mention @runllm in the thread for followups.
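A sketch of the `listRooms` check from step 1, assuming the same environment variables used elsewhere in this thread (use your project's https API host here if your `LIVEKIT_URL` is a `wss://` URL):

```ts
import { RoomServiceClient } from "livekit-server-sdk";

const svc = new RoomServiceClient(
  process.env.LIVEKIT_URL!,
  process.env.LIVEKIT_API_KEY!,
  process.env.LIVEKIT_API_SECRET!
);

// Print the rooms the server currently knows about and compare against the
// roomName you pass to startRoomCompositeEgress.
export async function debugRooms() {
  const rooms = await svc.listRooms();
  console.log(rooms.map((room) => room.name));
}
```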
s
```tsx
import { useState, useRef, useCallback } from "react";

export const useVideoRecording = () => {
  const [isRecording, setIsRecording] = useState(false);
  const mediaRecorderRef = useRef<MediaRecorder | null>(null);
  const recordedChunksRef = useRef<Blob[]>([]);

  const startRecording = useCallback(async (stream: MediaStream) => {
    recordedChunksRef.current = [];

    try {
      const mediaRecorder = new MediaRecorder(stream, {
        mimeType: "video/webm;codecs=vp9,opus",
      });

      mediaRecorderRef.current = mediaRecorder;

      mediaRecorder.ondataavailable = (event) => {
        if (event.data.size > 0) {
          recordedChunksRef.current.push(event.data);
        }
      };

      mediaRecorder.onstop = () => {
        const blob = new Blob(recordedChunksRef.current, {
          type: "video/webm",
        });

        const url = URL.createObjectURL(blob);
        const a = document.createElement("a");
        a.href = url;
        a.download = `recording-${new Date().toISOString()}.webm`;
        a.click();
        URL.revokeObjectURL(url);
        recordedChunksRef.current = [];
      };

      mediaRecorder.start();
      setIsRecording(true);
    } catch (error) {
      console.error("Error starting recording:", error);
    }
  }, []);

  const stopRecording = useCallback(() => {
    if (!mediaRecorderRef.current || !isRecording) {
      return Promise.resolve(null);
    }

    mediaRecorderRef.current.stop();
    setIsRecording(false);

    return new Promise<Blob>((resolve) => {
      mediaRecorderRef.current!.onstop = () => {
        const videoBlob = new Blob(recordedChunksRef.current, {
          type: "video/webm",
        });
        resolve(videoBlob);
      };
    });
  }, [isRecording]);

  return {
    isRecording,
    startRecording,
    stopRecording,
  };
};
```
@millions-hair-24956 Can I use this for recording
g
It seems likely that you can use this custom React hook for recording video streams in a web application. The hook, `useVideoRecording`, uses the `MediaRecorder` API to capture video from a `MediaStream` and save it as a WebM file. Here's a brief overview of how it works:
1. State and Refs: The hook uses `useState` to track whether recording is active and `useRef` to store the `MediaRecorder` instance and the recorded video chunks.
2. Start Recording: The `startRecording` function initializes the `MediaRecorder` with the provided `MediaStream` and sets up event handlers to collect video data chunks.
3. Stop Recording: The `stopRecording` function stops the `MediaRecorder` and compiles the recorded chunks into a Blob, which can be downloaded as a WebM file.
4. Usage: You can use this hook in a React component by passing a `MediaStream` to `startRecording` and calling `stopRecording` to end the recording.
Would you like more detailed instructions or a usage example for this hook? Sources: - https://livekit-users.slack.com/archives/C07FYCB7FRQ/p1731067594.737829 - https://livekit-users.slack.com/archives/C07FYCB7FRQ/p1729099812.700749
s
My question is: how can I send a MediaStream to this function from the LiveKit client SDK? @millions-hair-24956
g
To send a `MediaStream` to the `useVideoRecording` hook from the LiveKit client SDK, you can follow these steps:
1. Access the Local Participant's Track: Use the LiveKit SDK to access the local participant's video track by calling `getTrackPublication` on the local participant and then reading its `mediaStreamTrack`.
2. Create a MediaStream: Once you have the `MediaStreamTrack`, you can create a `MediaStream` object from it.
3. Pass the MediaStream to the Hook: Call the `startRecording` function from the `useVideoRecording` hook and pass the `MediaStream` to it.
Here's a basic example:
```jsx
import { useVideoRecording } from './useVideoRecording'; // Assuming your hook is in this path
import { useEffect } from 'react';
import { Room, Track } from 'livekit-client';

function MyComponent() {
  const { isRecording, startRecording, stopRecording } = useVideoRecording();

  useEffect(() => {
    const room = new Room();
    // Connect to the room and get the local participant
    room.connect('ws://your-livekit-server-url', 'your-access-token').then(async () => {
      const localParticipant = room.localParticipant;
      // Make sure a camera track is actually published before trying to record it.
      await localParticipant.setCameraEnabled(true);
      const videoPublication = localParticipant.getTrackPublication(Track.Source.Camera);
      if (videoPublication && videoPublication.track) {
        const mediaStreamTrack = videoPublication.track.mediaStreamTrack;
        const mediaStream = new MediaStream([mediaStreamTrack]);
        startRecording(mediaStream);
      }
    });

    return () => {
      stopRecording();
    };
  }, [startRecording, stopRecording]);

  return (
    <div>
      <button onClick={stopRecording} disabled={!isRecording}>
        Stop Recording
      </button>
    </div>
  );
}
```
This example assumes you have a LiveKit room set up and connected. It retrieves the local participant's video track, creates a `MediaStream`, and starts recording using the `useVideoRecording` hook. Would you like more detailed instructions or have specific questions about this setup? Sources: - https://docs.livekit.io/client-sdk-js/#md%3Ahandling-common-track-types - https://docs.livekit.io/client-sdk-js/index.html#md%3Ahandling-common-track-types
s
@millions-hair-24956 can you help me with this? Here is where I am calling the API:
```ts
import {
  AccessToken,
  AccessTokenOptions,
  VideoGrant,
} from "livekit-server-sdk";
import { NextResponse } from "next/server";

const API_KEY = process.env.LIVEKIT_API_KEY;
const API_SECRET = process.env.LIVEKIT_API_SECRET;
const LIVEKIT_URL = process.env.NEXT_PUBLIC_LIVEKIT_URL;

export const revalidate = 0;

export type ConnectionDetails = {
  serverUrl: string;
  roomName: string;
  participantName: string;
  participantToken: string;
};

export async function GET() {
  try {
    if (LIVEKIT_URL === undefined) {
      throw new Error("LIVEKIT_URL is not defined");
    }
    if (API_KEY === undefined) {
      throw new Error("LIVEKIT_API_KEY is not defined");
    }
    if (API_SECRET === undefined) {
      throw new Error("LIVEKIT_API_SECRET is not defined");
    }

    // Generate participant token
    const participantIdentity = `voice_assistant_user_${Math.floor(
      Math.random() * 10_000
    )}`;
    const roomName = `voice_assistant_room_${Math.floor(
      Math.random() * 10_000
    )}`;
    const participantToken = await createParticipantToken(
      { identity: participantIdentity },
      roomName
    );

    // Return connection details
    const data: ConnectionDetails = {
      serverUrl: LIVEKIT_URL,
      roomName,
      participantToken: participantToken,
      participantName: participantIdentity,
    };
    const headers = new Headers({
      "Cache-Control": "no-store",
    });
    return NextResponse.json(data, { headers });
  } catch (error) {
    if (error instanceof Error) {
      console.error(error);
      return new NextResponse(error.message, { status: 500 });
    }
  }
}

function createParticipantToken(
  userInfo: AccessTokenOptions,
  roomName: string
) {
  const at = new AccessToken(API_KEY, API_SECRET, {
    ...userInfo,
    ttl: "15m",
  });
  const grant: VideoGrant = {
    room: roomName,
    roomJoin: true,
    canPublish: true,
    canPublishData: true,
    canSubscribe: true,
  };
  at.addGrant(grant);
  return at.toJwt();
}
```
g
The code you provided is a Next.js API route that generates a participant token for a LiveKit room. Here's a breakdown of how it works and some potential improvements:
1. Environment Variables: Ensure that `LIVEKIT_API_KEY`, `LIVEKIT_API_SECRET`, and `NEXT_PUBLIC_LIVEKIT_URL` are correctly set in your environment. These are crucial for generating the token and connecting to the LiveKit server.
2. Token Generation: The `createParticipantToken` function creates an `AccessToken` with the specified identity and room name, and adds a `VideoGrant` that allows the participant to join the room, publish, and subscribe to tracks.
3. Random Room and Participant Names: The room and participant names are generated randomly. This is useful for creating unique sessions, but make sure it aligns with your application's logic.
4. Error Handling: The `GET` function includes error handling to catch and log any issues during token generation.
5. TTL Setting: The `ttl` (time-to-live) for the token is set to "15m" (15 minutes). Ensure this duration fits your use case.
Here's a potential improvement: if you need to manage room lifecycles (e.g., creating or deleting rooms), consider using LiveKit's RoomService API to explicitly create rooms before generating tokens, as sketched below. If you encounter any specific issues or errors, please provide more details, and I can assist further. Would you like more detailed instructions or have specific questions about this setup? Sources: - LiveKit Token Generation
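A sketch of that room-management improvement, assuming the same environment variables as the route above (`emptyTimeout` is just an example value, and you may need your project's https API host rather than a `wss://` URL):

```ts
import { RoomServiceClient } from "livekit-server-sdk";

const roomService = new RoomServiceClient(
  process.env.NEXT_PUBLIC_LIVEKIT_URL!,
  process.env.LIVEKIT_API_KEY!,
  process.env.LIVEKIT_API_SECRET!
);

// Create the room explicitly before handing out tokens, so server-side calls
// like startRoomCompositeEgress can find it by name.
export async function ensureRoom(roomName: string) {
  await roomService.createRoom({
    name: roomName,
    emptyTimeout: 10 * 60, // seconds an empty room stays alive (example value)
  });
}
```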
s
@millions-hair-24956 but my API gives me this info: serverUrl: string; roomName: string; participantName: string; participantToken: string. So how could I get the MediaStream?
g
To obtain a `MediaStream` using the information provided by your API (i.e., `serverUrl`, `roomName`, `participantName`, and `participantToken`), you need to connect to the LiveKit room using the LiveKit client SDK. Here's a step-by-step guide:
1. Install LiveKit Client SDK: Ensure you have the LiveKit client SDK installed in your project. You can install it using npm or yarn:
```bash
npm install livekit-client
```
2. Connect to the Room: Use the `serverUrl` and `participantToken` to connect to the LiveKit room. Once connected, you can access the local participant's tracks.
3. Access MediaStream: After connecting, you can access the `MediaStream` from the local participant's video or audio tracks.
Here's an example of how you can implement this in a React component:
```jsx
import React, { useEffect, useState } from 'react';
import { Room, Track } from 'livekit-client';

function VideoComponent({ serverUrl, participantToken }) {
  const [mediaStream, setMediaStream] = useState(null);

  useEffect(() => {
    // Keep the Room instance in the effect scope so the cleanup below can reach it.
    const room = new Room();

    const connectToRoom = async () => {
      await room.connect(serverUrl, participantToken);

      // Ensure a camera track is published, then grab its MediaStreamTrack.
      const localParticipant = room.localParticipant;
      await localParticipant.setCameraEnabled(true);
      const videoPublication = localParticipant.getTrackPublication(Track.Source.Camera);
      if (videoPublication && videoPublication.track) {
        const mediaStreamTrack = videoPublication.track.mediaStreamTrack;
        const stream = new MediaStream([mediaStreamTrack]);
        setMediaStream(stream);
      }
    };

    connectToRoom();

    return () => {
      // Clean up and disconnect from the room
      room.disconnect();
    };
  }, [serverUrl, participantToken]);

  return (
    <div>
      {mediaStream && (
        <video
          autoPlay
          playsInline
          ref={(videoElement) => {
            if (videoElement) {
              videoElement.srcObject = mediaStream;
            }
          }}
        />
      )}
    </div>
  );
}

export default VideoComponent;
```
Key Points: - Room Connection: The `Room` object is used to connect to the LiveKit server using the `serverUrl` and `participantToken`. - Track Access: After connecting, the local participant's video track is accessed to create a `MediaStream`. - Video Element: The `MediaStream` is set as the source of a video element to display the video. This example assumes you have a valid `serverUrl` and `participantToken` from your API; adjust the track source (camera) as needed for your setup. If you have further questions or need additional help, feel free to ask! Sources: - https://docs.livekit.io/home/quickstarts/nextjs-12/#4-Make-a-page-in-your-web-app - https://docs.livekit.io/client-sdk-js/
s
@millions-hair-24956 I am getting this error
```
Initial connection failed with ConnectionError: could not establish pc connection. Retrying with another region: https://coraltalk-8tr5aonx.omumbai1a.production.livekit.cloud {room: 'voice_assistant_room_4766', roomID: '', participant: 'voice_assistant_user_6652', pID: 'PA_YVd4FhLj3PEj'}
```
g
The error message "Initial connection failed with ConnectionError: could not establish pc connection" indicates that the client is having trouble establishing a peer-to-peer connection with the LiveKit server. Here are some steps to troubleshoot and resolve this issue: 1. Check Network Connectivity: Ensure that the client device has a stable internet connection. Network issues can prevent successful WebRTC connections. 2. Firewall and NAT Configuration: Verify that the necessary ports are open on any firewalls or NAT devices. LiveKit typically uses UDP ports for WebRTC traffic, and these need to be accessible. Refer to the LiveKit Deployment Guide for details on port configuration. 3. TURN Server Configuration: If the client is behind a restrictive firewall or NAT, ensure that a TURN server is configured and accessible. TURN servers help relay media when direct peer-to-peer connections are not possible. LiveKit includes an embedded TURN server, but it must be properly configured. 4. Server Region: The error message mentions retrying with another region. Ensure that the server region is correctly configured and accessible from the client's location. You may need to adjust the server's region settings or deploy additional servers closer to the client. 5. Check Server Logs: Review the LiveKit server logs for any additional error messages or warnings that might provide more context about the connection failure. 6. SDK Version: Ensure that you are using the latest version of the LiveKit client SDK, as updates may include bug fixes and improvements related to connectivity. If these steps do not resolve the issue, please provide more details about your setup, such as the client environment and any specific network configurations, for further assistance. Sources: - LiveKit Deployment Guide πŸ‘ 0 button πŸ‘Ž 0 button πŸ’‘ Hint: Mention @runllm in the thread for followups.
s
@millions-hair-24956 can you please help me in fixing this
```tsx
import { useEffect, useState, useRef } from "react";
import { TranscriptionSegment, RoomEvent } from "livekit-client";
import { useMaybeRoomContext } from "@livekit/components-react";

export default function Transcriptions() {
  const room = useMaybeRoomContext();
  const [transcriptions, setTranscriptions] = useState<{
    [id: string]: TranscriptionSegment;
  }>({});
  const messagesEndRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    if (!room) {
      return;
    }

    const updateTranscriptions = (segments: TranscriptionSegment[]) => {
      setTranscriptions((prev) => {
        const newTranscriptions = { ...prev };
        for (const segment of segments) {
          newTranscriptions[segment.id] = segment;
        }
        return newTranscriptions;
      });
    };

    room.on(RoomEvent.TranscriptionReceived, updateTranscriptions);
    return () => {
      room.off(RoomEvent.TranscriptionReceived, updateTranscriptions);
    };
  }, [room]);

  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: "smooth" });
  };

  useEffect(() => {
    scrollToBottom();
  }, [transcriptions]);

  console.log("transcriptions", transcriptions);
  return (
    <div className="w-1/2 p-4 pl-2">
      <div className="bg-gray-800 rounded-lg h-[calc(100vh-160px)] p-6 flex flex-col">
        <h2 className="text-xl mb-4">Live Transcript</h2>
        <ul className="space-y-6 overflow-y-auto flex-1">
          {Object.values(transcriptions)
            .sort((a, b) => a.firstReceivedTime - b.firstReceivedTime)
            .map((segment) => (
              <li key={segment.id} className="space-y-1">
                <div className="flex items-center gap-2 text-gray-400">
                  <span className="text-lg">
                    {segment.id.startsWith("resp") ? "Coral AI" : "You"}
                  </span>
                  <span className="text-sm">
                    β€’{" "}
                    {new Date(segment.firstReceivedTime).toLocaleTimeString(
                      [],
                      {
                        hour: "numeric",
                        minute: "2-digit",
                      }
                    )}
                  </span>
                </div>
                <div className={`p-3 rounded-lg break-words text-xl`}>
                  {segment.text}
                </div>
              </li>
            ))}
          <div ref={messagesEndRef} />
        </ul>
      </div>
    </div>
  );
}
```
Here I am not able to tell apart what is said by the participant and what is said by the agent.
g
To differentiate between what is said by the participant and what is said by the agent in your transcription display, you can use the participant information delivered alongside the `TranscriptionSegment`s. Here's how you can modify your code to achieve this:
1. Access Participant Information: The `RoomEvent.TranscriptionReceived` event also delivers the participant the segments belong to (it may be undefined in some cases), so capture it in your handler instead of discarding it.
2. Modify the `updateTranscriptions` Function: Update the function to store the participant information along with the transcription text.
Here's an example of how you might modify your code:
```tsx
import { useEffect, useState, useRef } from "react";
import { TranscriptionSegment, RoomEvent, Participant } from "livekit-client";
import { useMaybeRoomContext } from "@livekit/components-react";

export default function Transcriptions() {
  const room = useMaybeRoomContext();
  const [transcriptions, setTranscriptions] = useState<{
    [id: string]: { segment: TranscriptionSegment; participant: string };
  }>({});
  const messagesEndRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    if (!room) {
      return;
    }

    const updateTranscriptions = (segments: TranscriptionSegment[], participant?: string) => {
      setTranscriptions((prev) => {
        const newTranscriptions = { ...prev };
        for (const segment of segments) {
          newTranscriptions[segment.id] = { segment, participant: participant ?? "" };
        }
        return newTranscriptions;
      });
    };

    // Use a named handler so the same function reference can be removed on cleanup.
    const handleTranscription = (segments: TranscriptionSegment[], participant?: Participant) => {
      updateTranscriptions(segments, participant?.identity);
    };

    room.on(RoomEvent.TranscriptionReceived, handleTranscription);

    return () => {
      room.off(RoomEvent.TranscriptionReceived, handleTranscription);
    };
  }, [room]);

  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: "smooth" });
  };

  useEffect(() => {
    scrollToBottom();
  }, [transcriptions]);

  console.log("transcriptions", transcriptions);
  return (
    <div className="w-1/2 p-4 pl-2">
      <div className="bg-gray-800 rounded-lg h-[calc(100vh-160px)] p-6 flex flex-col">
        <h2 className="text-xl mb-4">Live Transcript</h2>
        <ul className="space-y-6 overflow-y-auto flex-1">
          {Object.values(transcriptions)
            .sort((a, b) => a.segment.firstReceivedTime - b.segment.firstReceivedTime)
            .map(({ segment, participant }) => (
              <li key={segment.id} className="space-y-1">
                <div className="flex items-center gap-2 text-gray-400">
                  <span className="text-lg">
                    {participant === "agent" ? "Coral AI" : "You"}
                  </span>
                  <span className="text-sm">
                    β€’{" "}
                    {new Date(segment.firstReceivedTime).toLocaleTimeString(
                      [],
                      {
                        hour: "numeric",
                        minute: "2-digit",
                      }
                    )}
                  </span>
                </div>
                <div className={`p-3 rounded-lg break-words text-xl`}>
                  {segment.text}
                </div>
              </li>
            ))}
          <div ref={messagesEndRef} />
        </ul>
      </div>
    </div>
  );
}
```
Key Changes: - Participant Information: The `updateTranscriptions` function now stores participant information, which is used to determine whether the text is from the agent or the user. - Display Logic: The display logic uses the participant identity to label the text as either "Coral AI" or "You". Ensure that the `RoomEvent.TranscriptionReceived` event provides participant information; if it doesn't, you may need to adjust your event handling to include this data. If you have further questions or need additional help, feel free to ask! Sources: - https://livekit-users.slack.com/archives/C07FY8WHGPM/p1729597029.187099 - https://livekit-users.slack.com/archives/C07FYCB7FRQ/p1729178825.429829
s
@millions-hair-24956 Is there any way I can change deepgram STT to whisper
```python
import logging

from dotenv import load_dotenv
from livekit.agents import (
    AutoSubscribe,
    JobContext,
    JobProcess,
    WorkerOptions,
    cli,
    llm,
    metrics,
)
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, openai, deepgram, silero, turn_detector


load_dotenv(dotenv_path=".env.local")
logger = logging.getLogger("voice-agent")


def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()


async def entrypoint(ctx: JobContext):
    initial_ctx = llm.ChatContext().append(
        role="system",
        text=(
            "You are an oral exam instructor. Your role is to:"
            "1. Ask questions from the predefined list one at a time"
            "2. Listen to the student's response"
            "3. Move to the next question after receiving the response."
        ),
    )

    logger.info(f"connecting to room {ctx.room.name}")
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

    # Wait for the first participant to connect
    participant = await ctx.wait_for_participant()
    logger.info(f"starting voice assistant for participant {participant.identity}")

    # This project is configured to use Deepgram STT, OpenAI LLM and Cartesia TTS plugins
    # Other great providers exist like Cerebras, ElevenLabs, Groq, Play.ht, Rime, and more
    # Learn more and pick the best one for your app:
    # https://docs.livekit.io/agents/plugins
    agent = VoicePipelineAgent(
        vad=ctx.proc.userdata["vad"],
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=cartesia.TTS(),
        turn_detector=turn_detector.EOUModel(),
        # minimum delay for endpointing, used when turn detector believes the user is done with their turn
        min_endpointing_delay=0.5,
        # maximum delay for endpointing, used when turn detector does not believe the user is done with their turn
        max_endpointing_delay=5.0,
        chat_ctx=initial_ctx,
    )

    usage_collector = metrics.UsageCollector()

    @agent.on("metrics_collected")
    def on_metrics_collected(agent_metrics: metrics.AgentMetrics):
        metrics.log_metrics(agent_metrics)
        usage_collector.collect(agent_metrics)

    agent.start(ctx.room, participant)

    # The agent should be polite and greet the user when it joins :)
    await agent.say("Hi there! I'm Coral, your AI instructor for today's oral exam. I'll be asking you several questions about psychology. After each of your responses, I'll move on to the next question. Are you ready to begin?", allow_interruptions=True)


if __name__ == "__main__":
    cli.run_app(
        WorkerOptions(
            entrypoint_fnc=entrypoint,
            prewarm_fnc=prewarm,
        ),
    )
```
g
Sure thing! I'm looking through the available data to answer your question. Hold on a second... ⏳
s
any update on this @millions-hair-24956
g
To change the speech-to-text (STT) provider from Deepgram to Whisper in your LiveKit setup, you need to modify the `VoicePipelineAgent` configuration to use the Whisper STT plugin instead of the Deepgram plugin. Here's how you can do it: 1. Import the Whisper Plugin: Ensure you have the Whisper plugin available in your LiveKit setup. If you're using OpenAI's Whisper, you can use the OpenAI plugin for STT. 2. Modify the STT Configuration: Replace the Deepgram STT configuration with the Whisper STT configuration in your `VoicePipelineAgent`. Here's an example of how you might modify your code:
```python
from livekit.plugins import cartesia, openai, silero, turn_detector

# Other imports remain the same

async def entrypoint(ctx: JobContext):
    initial_ctx = llm.ChatContext().append(
        role="system",
        text=(
            "You are an oral exam instructor. Your role is to:"
            "1. Ask questions from the predefined list one at a time"
            "2. Listen to the student's response"
            "3. Move to the next question after receiving the response."
        ),
    )

    logger.info(f"connecting to room {ctx.room.name}")
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

    participant = await ctx.wait_for_participant()
    logger.info(f"starting voice assistant for participant {participant.identity}")

    # Change STT from Deepgram to Whisper
    agent = VoicePipelineAgent(
        vad=ctx.proc.userdata["vad"],
        stt=openai.STT(),  # Assuming you're using OpenAI's Whisper
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=cartesia.TTS(),
        turn_detector=turn_detector.EOUModel(),
        min_endpointing_delay=0.5,
        max_endpointing_delay=5.0,
        chat_ctx=initial_ctx,
    )

    usage_collector = metrics.UsageCollector()

    @agent.on("metrics_collected")
    def on_metrics_collected(agent_metrics: metrics.AgentMetrics):
        metrics.log_metrics(agent_metrics)
        usage_collector.collect(agent_metrics)

    agent.start(ctx.room, participant)

    await agent.say("Hi there! I'm Coral, your AI instructor for today's oral exam. I'll be asking you several questions about psychology. After each of your responses, I'll move on to the next question. Are you ready to begin?", allow_interruptions=True)

if __name__ == "__main__":
    cli.run_app(
        WorkerOptions(
            entrypoint_fnc=entrypoint,
            prewarm_fnc=prewarm,
        ),
    )
```
Key Changes: - STT Plugin: The `stt` parameter in `VoicePipelineAgent` is changed from `deepgram.STT()` to `openai.STT()`, assuming you are using OpenAI's Whisper model. Additional Considerations: - API Keys and Configuration: Ensure that you have the necessary API keys and configuration for the Whisper model you are using. - Plugin Installation: Make sure the Whisper plugin is installed and properly configured in your environment. If you have any specific questions or need further assistance, feel free to ask!
s
@millions-hair-24956 can we also change the TTS to OpenAI?
g
Yes, you can change the text-to-speech (TTS) provider to OpenAI in your LiveKit setup. To do this, you need to modify the `VoicePipelineAgent` configuration to use the OpenAI TTS plugin. Here's how you can make this change: 1. Import the OpenAI TTS Plugin: Ensure you have the OpenAI TTS plugin available in your LiveKit setup. 2. Modify the TTS Configuration: Replace the current TTS configuration with the OpenAI TTS configuration in your `VoicePipelineAgent`. Here's an example of how you might modify your code:
```python
from livekit.plugins import openai, silero, turn_detector

# Other imports remain the same

async def entrypoint(ctx: JobContext):
    initial_ctx = llm.ChatContext().append(
        role="system",
        text=(
            "You are an oral exam instructor. Your role is to:"
            "1. Ask questions from the predefined list one at a time"
            "2. Listen to the student's response"
            "3. Move to the next question after receiving the response."
        ),
    )

    logger.info(f"connecting to room {ctx.room.name}")
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

    participant = await ctx.wait_for_participant()
    logger.info(f"starting voice assistant for participant {participant.identity}")

    # Change TTS to OpenAI
    agent = VoicePipelineAgent(
        vad=ctx.proc.userdata["vad"],
        stt=openai.STT(),  # Assuming you're using OpenAI's Whisper
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=openai.TTS(),  # Change TTS to OpenAI
        turn_detector=turn_detector.EOUModel(),
        min_endpointing_delay=0.5,
        max_endpointing_delay=5.0,
        chat_ctx=initial_ctx,
    )

    usage_collector = metrics.UsageCollector()

    @agent.on("metrics_collected")
    def on_metrics_collected(agent_metrics: metrics.AgentMetrics):
        metrics.log_metrics(agent_metrics)
        usage_collector.collect(agent_metrics)

    agent.start(ctx.room, participant)

    await agent.say("Hi there! I'm Coral, your AI instructor for today's oral exam. I'll be asking you several questions about psychology. After each of your responses, I'll move on to the next question. Are you ready to begin?", allow_interruptions=True)

if __name__ == "__main__":
    cli.run_app(
        WorkerOptions(
            entrypoint_fnc=entrypoint,
            prewarm_fnc=prewarm,
        ),
    )
```
Key Changes: - TTS Plugin: The `tts` parameter in `VoicePipelineAgent` is changed to `openai.TTS()`, which uses OpenAI's TTS capabilities. Additional Considerations: - API Keys and Configuration: Ensure that you have the necessary API keys and configuration for the OpenAI TTS model you are using. - Plugin Installation: Make sure the OpenAI TTS plugin is installed and properly configured in your environment. If you have any specific questions or need further assistance, feel free to ask! Sources: - https://livekit-users.slack.com/archives/C07FY8WHGPM/p1730965265.297519 - https://docs.livekit.io/agents/plugins/#VAD-and-StreamAdapter
s
@millions-hair-24956 when I change tts=cartesia.TTS() to tts=openai.TTS() it stops working, and I have the keys added in .env as well
g
If changing `tts=cartesia.TTS()` to `tts=openai.TTS()` causes the TTS functionality to stop working, here are some steps to troubleshoot and resolve the issue: 1. Check API Key: Ensure that the OpenAI API key is correctly set in your `.env` file and is accessible in your application environment. Verify that the key is correctly named and matches the environment variable name expected by the OpenAI plugin. 2. Verify Environment Variables: Double-check that your environment variables are loaded correctly. You can log the environment variable to ensure it's being read correctly:
```python
import os
print(os.getenv("OPENAI_API_KEY"))
```
3. Plugin Configuration: Ensure that the OpenAI TTS plugin is correctly configured; it might require specific options or parameters, so check the documentation for any additional configuration. 4. Error Logs: Check the console or application logs for any error messages that might provide more context about why the TTS is not working, such as network errors, authentication issues, or other exceptions. 5. Network Access: Ensure that your application has network access to OpenAI's servers; firewalls or network restrictions might prevent the application from reaching the TTS service. 6. Test with Minimal Example: Create a minimal script to test the OpenAI TTS functionality independently, which can help isolate the issue from the rest of your application. Here's a minimal example to test OpenAI TTS:
```python
from livekit.plugins import openai

def test_openai_tts():
    try:
        tts = openai.TTS()
        # Assuming you have a method to synthesize text
        audio = tts.synthesize("Hello, this is a test.")
        print("TTS synthesis successful.")
    except Exception as e:
        print("Error with OpenAI TTS:", e)

test_openai_tts()
```
If you continue to experience issues, please provide any error messages or logs for further assistance. Sources: - https://livekit-users.slack.com/archives/C07FY8WHGPM/p1730293289.619029 - https://docs.livekit.io/agents-js/classes/plugins_agents_plugin_openai.TTS.html
s
@millions-hair-24956 can you suggest a way to send data to my AI agent from my backend so it can ask questions properly?
g
To send data to your AI agent from your backend, you can use LiveKit's data channels or participant metadata. Here are two approaches you can consider: 1. Data Channels: Use LiveKit's data channels to send custom data packets to your AI agent. This method allows you to send real-time data to the agent, which can then process it and ask questions accordingly. - Backend: Use the LiveKit server SDK to publish data to the room. - Agent: Listen for data packets and process them to adjust the agent's behavior. Example (Backend):
```js
const { RoomServiceClient, DataPacket_Kind } = require('livekit-server-sdk');

const client = new RoomServiceClient(LIVEKIT_URL, API_KEY, API_SECRET);
// kind is a positional argument; the topic should match what the agent listens for.
client.sendData(roomName, Buffer.from('your data'), DataPacket_Kind.RELIABLE, {
  topic: 'your-topic',
});
```
Example (Agent):
```python
@ctx.room.on("data_received")
def on_data_received(data: rtc.DataPacket):
    if data.topic == "your-topic":
        message = data.data.decode("utf-8")
        # Process the message
```
2. Participant Metadata: Use participant metadata to store configuration or context information that the agent can use to ask questions. - Backend: Set participant metadata when creating or updating a participant. - Agent: Access the metadata to adjust the agent's behavior. Example (Backend):
```js
const { AccessToken } = require('livekit-server-sdk');

const token = new AccessToken(API_KEY, API_SECRET, {
  identity: 'participant-id',
  metadata: JSON.stringify({ questionContext: 'your context' }),
});
```
Example (Agent):
```python
participant = ctx.room.get_participant('participant-id')
metadata = json.loads(participant.metadata)
# Use metadata to adjust agent behavior
```
Would you like more detailed instructions on either of these methods? Sources: - https://livekit-users.slack.com/archives/C07FVFGAUKX/p1729110526.781199 - https://livekit-users.slack.com/archives/C07FY8WHGPM/p1729541882.418149
s
@millions-hair-24956 useMaybeRoomContext is not giving me the room name when I do room?.name, but when I console.log the room I see that room.roomInfo.name does have the name
g
It seems like you're encountering an issue where `useMaybeRoomContext` is not directly providing the room name when accessing `room?.name`; instead, the room name is nested within `room.roomInfo.name`. Here's how you can access the room name: 1. Access the Room Info: Since the room name is nested within `room.roomInfo.name`, you would access it through the `roomInfo` property. 2. Update Your Code: Modify your code to read the room name from `room.roomInfo.name`. Here's an example of how you can do this:
```jsx
import { useMaybeRoomContext } from "@livekit/components-react";

function MyComponent() {
  const room = useMaybeRoomContext();

  // Access the room name from roomInfo
  const roomName = room?.roomInfo?.name;

  console.log("Room Name:", roomName);

  return (
    <div>
      <h1>Room Name: {roomName}</h1>
    </div>
  );
}

export default MyComponent;
```
Key Points: - Optional Chaining: Use optional chaining (`?.`) to safely access nested properties, which prevents errors if any part of the chain is `undefined`. - Console Logging: Use `console.log` to verify the structure of the `room` object and ensure you're accessing the correct property. If you have further questions or need additional help, feel free to ask!
s
@millions-hair-24956 no, I can't use room.roomInfo.name, as roomInfo is a private property of the class
g
If `roomInfo` is a private property and not directly accessible, you may need to use a method or a public property provided by the LiveKit SDK to access the room name. Here are a few approaches you can consider: 1. Check for Public Methods: Look for any public methods or properties in the LiveKit SDK that provide access to the room name; SDKs sometimes provide getter methods for private properties. 2. Use Hooks or Contexts: If you're using React, ensure that you're using the correct hooks or contexts provided by LiveKit to access room information. The `useRoomContext` hook might provide more direct access to the room details. 3. Access Room Name via Events: If the room name is available during certain events, such as when a participant joins, you can capture it during those events. Here's a general approach using a hypothetical public method or property:
```jsx
import { useRoomContext } from "@livekit/components-react";

function MyComponent() {
  const room = useRoomContext();

  // Hypothetical method to get room name
  const roomName = room?.getName(); // Replace with actual method if available

  console.log("Room Name:", roomName);

  return (
    <div>
      <h1>Room Name: {roomName}</h1>
    </div>
  );
}

export default MyComponent;
```
Key Points: - SDK Documentation: Refer to the LiveKit SDK documentation to find the correct method or property to access the room name. - Community Support: If the documentation is unclear, consider reaching out to the LiveKit community or support channels for guidance. If you have further questions or need additional help, feel free to ask! Sources: - https://docs.livekit.io/reference/components/react/hook/useroominfo/ - https://livekit-users.slack.com/archives/C07FY8WHGPM/p1727349224.407849
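Following the first source above, a sketch using the `useRoomInfo` hook from `@livekit/components-react`, which exposes the room's name and metadata reactively without touching private fields (verify the hook against the version you have installed):

```tsx
import { useRoomInfo } from "@livekit/components-react";

export function RoomHeader() {
  // Reactive name/metadata for the room provided by the surrounding RoomContext.
  const { name, metadata } = useRoomInfo();

  return (
    <header>
      <h1>Room: {name}</h1>
      <p>Metadata: {metadata}</p>
    </header>
  );
}
```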
s
@millions-hair-24956 I am still getting an empty name and metadata. Why?
g
If you're encountering issues with empty room names and metadata, here are some potential reasons and troubleshooting steps: 1. Timing Issues: The room name and metadata might not be available immediately upon connection. Ensure that you access these properties after the room has fully connected and the context is populated. 2. Accessing After Connection: Make sure you access the room's properties after the `connect` method has been called and completed; the room's properties are typically populated after a successful connection. 3. Check Room Creation: Verify that the room is being created with the correct name and metadata. If you're using a server-side API to create rooms, ensure that the metadata is set correctly during room creation. 4. Event Listeners: Use event listeners to detect when the room's metadata changes, so you access the data at the right time. 5. Debugging: Add console logs to check the state of the room object at different points in your code to ensure that the room is connected and the properties are populated. Here's an example of how you might set up an event listener to check for metadata changes:
```jsx
import { useEffect } from 'react';
import { useRoomContext } from '@livekit/components-react';
import { RoomEvent } from 'livekit-client';

function MyComponent() {
  const room = useRoomContext();

  useEffect(() => {
    if (!room) return;

    const handleMetadataChange = () => {
      console.log('Room Name:', room.name);
      console.log('Room Metadata:', room.metadata);
    };

    room.on(RoomEvent.RoomMetadataChanged, handleMetadataChange);

    // Check initial values
    handleMetadataChange();

    return () => {
      room.off(RoomEvent.RoomMetadataChanged, handleMetadataChange);
    };
  }, [room]);

  return (
    <div>
      <h1>Room Name: {room?.name}</h1>
      <p>Room Metadata: {room?.metadata}</p>
    </div>
  );
}

export default MyComponent;
```
Key Points: - Ensure Connection: Verify that the room is fully connected before accessing properties. - Event Listeners: Use event listeners to detect changes in metadata. - Debugging: Use console logs to debug and verify the state of the room object. If you have further questions or need additional help, feel free to ask! Sources: - https://livekit-users.slack.com/archives/C07FY8WHGPM/p1727349224.407849 - https://livekit-users.slack.com/archives/C07FY8WHGPM/p1729741807.087599
s
@millions-hair-24956 what is rtc and how do I get it? Here is sample code:
```python
@ctx.room.on("data_received")
    def on_data_received(data: rtc.DataPacket):
        print("DEBUG: data received")
        print(data)
        if data.topic == "questions":
            message = data.data.decode("utf-8")
            # Process the message
```
g
In the context of LiveKit, `rtc` typically refers to the real-time communication (RTC) module of the SDK that handles real-time audio, video, and data communication. The `rtc` module provides classes and methods to manage these communications, such as `DataPacket`, which is used to handle data messages sent between participants in a room. How to get `rtc`: 1. Import the RTC Module: Ensure that you import the `rtc` module from the LiveKit SDK in your Python script; it contains the `DataPacket` class and other RTC-related functionality. 2. Install the LiveKit SDK: Make sure you have the LiveKit Python SDK installed in your environment. You can install it using pip if it's not already installed:
```bash
pip install livekit
```
3. Import Statement: Add the appropriate import statement at the top of your script to access the `rtc` module:
```python
from livekit import rtc
```
Example Usage: Here's how you can use the `rtc` module in your code:
```python
from livekit import rtc

@ctx.room.on("data_received")
def on_data_received(data: rtc.DataPacket):
    print("DEBUG: data received")
    print(data)
    if data.topic == "questions":
        message = data.data.decode("utf-8")
        # Process the message
```
Key Points: - DataPacket: The `DataPacket` class is used to handle data messages. It contains properties like `data` and `topic` to help you process incoming messages. - Event Listener: The `on("data_received")` event listener is used to handle incoming data packets in real time. If you have further questions or need additional help, feel free to ask! Sources: - https://livekit-users.slack.com/archives/C07FVFGAUKX/p1733152587.089089 - https://livekit-users.slack.com/archives/C07FY8WHGPM/p1731565699.055979