lemon-elephant-62047
08/15/2025, 4:48 PM
tall-airline-16756
08/15/2025, 4:48 PM
tall-airline-16756
08/15/2025, 4:49 PM
1. Synchronize Audio and Text: Use the TextAudioSynchronizer class to keep your audio and text in sync, and implement the TranscriptionForwarder interface to handle the transcription data smoothly.
2. Initialize the Transcription Service: Here’s a quick snippet to get your STT service up and running:
const sttStream = new STT({ sampleRate: 48000 }).stream();
3. Process Audio and Text: Pipe your audio stream into the STT service, and use the pushAudio and pushText methods from TextAudioSynchronizer to feed in the data.
4. Handle Transcription Events: Keep an ear out for transcription events from the STT service and let TextAudioSynchronizer handle updating the text in real-time.
5. Publish Transcriptions: Finally, implement the TranscriptionForwarder to share those transcription segments with everyone in the room (a rough sketch follows this list).
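As a rough sketch of steps 3–5: the pushAudio and pushText calls follow the description above, but the constructor, the event name, and the publishTranscription payload shape here are assumptions for illustration, not verified API:
import type { Room } from '@livekit/rtc-node';
import { TextAudioSynchronizer } from '@livekit/agents'; // assumed export path

// Hypothetical sketch: wire the synchronizer (step 3) to a simple
// forwarder (step 5). Event name and payload fields are assumptions.
function setupForwarding(room: Room, trackSid: string) {
  const sync = new TextAudioSynchronizer(); // assumed no-arg constructor
  sync.on('textUpdated', async (text: string) => { // assumed event name
    await room.localParticipant.publishTranscription({
      participantIdentity: room.localParticipant.identity,
      trackSid,
      segments: [{ id: 'seg-1', text, final: false, startTime: 0, endTime: 0, language: 'en' }],
    });
  });
  return sync; // feed it via sync.pushAudio(frame) and sync.pushText(text)
}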
Here’s an example code snippet to give you a clearer picture:
// Imports (ESM style, to match the export default below)
import { type JobContext, WorkerOptions, cli, defineAgent, stt } from '@livekit/agents';
import { STT } from '@livekit/agents-plugin-deepgram';
import type { Track } from '@livekit/rtc-node';
import { AudioStream, RoomEvent, TrackKind } from '@livekit/rtc-node';
import { fileURLToPath } from 'node:url';

export default defineAgent({
  entry: async (ctx: JobContext) => {
    await ctx.connect();
    console.log('starting STT example agent');

    const transcribeTrack = async (track: Track) => {
      const audioStream = new AudioStream(track);
      const sttStream = new STT({ sampleRate: 48000 }).stream();

      // Push incoming audio frames into the STT stream
      const sendTask = async () => {
        for await (const event of audioStream) {
          sttStream.pushFrame(event);
        }
      };

      // Log final transcripts as they arrive
      const recvTask = async () => {
        for await (const event of sttStream) {
          if (event.type === stt.SpeechEventType.FINAL_TRANSCRIPT) {
            console.log(event.alternatives![0].text);
          }
        }
      };

      await Promise.all([sendTask(), recvTask()]);
    };

    ctx.room.on(RoomEvent.TrackSubscribed, async (track: Track) => {
      if (track.kind === TrackKind.KIND_AUDIO) {
        transcribeTrack(track);
      }
    });
  },
});

cli.runApp(new WorkerOptions({ agent: fileURLToPath(import.meta.url) }));
And if you’re working on the frontend, React hooks like useTrackTranscription are super handy for listening to transcription events and showing them on screen:
import { useTrackTranscription } from '@livekit/components-react';
import type { TrackReferenceOrPlaceholder } from '@livekit/components-react';

// Pass a reference to the track you want captions for; the hook
// returns that track's transcription segments.
function TranscriptionDisplay({ trackRef }: { trackRef: TrackReferenceOrPlaceholder }) {
  const { segments } = useTrackTranscription(trackRef);
  return (
    <ul>
      {segments.map((segment) => (
        <li key={segment.id}>{segment.text}</li>
      ))}
    </ul>
  );
}
Just make sure you’ve got all the right dependencies installed and your environment variables set up with your API keys and service URLs. If you want to dive deeper, the LiveKit Documentation is a fantastic resource. Happy coding!
lemon-elephant-62047
08/15/2025, 5:01 PM
tall-airline-16756
08/15/2025, 5:01 PM
lemon-elephant-62047
08/15/2025, 5:08 PM
tall-airline-16756
08/15/2025, 5:08 PM
// Import necessary modules
import { type JobContext, WorkerOptions, cli, defineAgent, stt } from '@livekit/agents';
import { STT } from '@livekit/agents-plugin-deepgram';
import type { Track } from '@livekit/rtc-node';
import { AudioStream, RoomEvent, TrackKind } from '@livekit/rtc-node';
import { fileURLToPath } from 'node:url';

export default defineAgent({
  entry: async (ctx: JobContext) => {
    await ctx.connect();
    console.log('starting STT example agent');

    const transcribeTrack = async (track: Track) => {
      const audioStream = new AudioStream(track);
      const sttStream = new STT({ sampleRate: 48000 }).stream();

      const sendTask = async () => {
        for await (const event of audioStream) {
          sttStream.pushFrame(event);
        }
      };

      const recvTask = async () => {
        for await (const event of sttStream) {
          if (event.type === stt.SpeechEventType.FINAL_TRANSCRIPT) {
            console.log(event.alternatives![0].text);
          }
        }
      };

      // Await both tasks so errors aren't silently dropped
      await Promise.all([sendTask(), recvTask()]);
    };

    ctx.room.on(RoomEvent.TrackSubscribed, async (track: Track) => {
      if (track.kind === TrackKind.KIND_AUDIO) {
        transcribeTrack(track);
      }
    });
  },
});

cli.runApp(new WorkerOptions({ agent: fileURLToPath(import.meta.url) }));
This snippet listens for audio tracks, processes them with the Deepgram STT plugin, and logs the transcriptions as they happen. Just make sure you have all the right dependencies installed and a LiveKit server up and running. If you hit any bumps or want to chat more about it, I’m here to help!
lemon-elephant-62047
08/15/2025, 5:10 PM
tall-airline-16756
08/15/2025, 5:10 PM
// Import necessary modules
import { type JobContext, WorkerOptions, cli, defineAgent, stt } from '@livekit/agents';
import { STT } from '@livekit/agents-plugin-deepgram';
import type { Track } from '@livekit/rtc-node';
import { AudioStream, RoomEvent, TrackKind } from '@livekit/rtc-node';
import { fileURLToPath } from 'node:url';

export default defineAgent({
  entry: async (ctx: JobContext) => {
    await ctx.connect();
    console.log('starting STT example agent');

    const transcribeTrack = async (track: Track) => {
      const audioStream = new AudioStream(track);
      const sttStream = new STT({ sampleRate: 48000 }).stream();

      const sendTask = async () => {
        for await (const event of audioStream) {
          sttStream.pushFrame(event);
        }
      };

      const recvTask = async () => {
        for await (const event of sttStream) {
          if (event.type === stt.SpeechEventType.FINAL_TRANSCRIPT) {
            console.log(event.alternatives![0].text);
          }
        }
      };

      await Promise.all([sendTask(), recvTask()]);
    };

    ctx.room.on(RoomEvent.TrackSubscribed, async (track: Track) => {
      if (track.kind === TrackKind.KIND_AUDIO) {
        transcribeTrack(track);
      }
    });
  },
});

cli.runApp(new WorkerOptions({ agent: fileURLToPath(import.meta.url) }));
This code connects to the LiveKit Cloud server, listens for audio tracks, and uses the Deepgram STT plugin to process them. You’ll see the transcription results logged right in the console.
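For reference, those credentials usually live in environment variables; here's a minimal .env sketch with placeholder values (use your own project's URL and keys):
# .env — placeholder values; substitute your LiveKit Cloud project's credentials
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your-api-key
LIVEKIT_API_SECRET=your-api-secret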
Just make sure you’ve set up your LiveKit Cloud server credentials and environment variables for authentication. If you hit any bumps or want to chat more about it, I’m here to help!
tall-airline-16756
08/15/2025, 5:19 PM
To listen for RoomEvent.TranscriptionReceived, here’s an updated way to modify your code:
import { useEffect, useState } from "react";
import {
  type TranscriptionSegment,
  type TrackPublication,
  type Participant,
  RoomEvent,
} from "livekit-client";
import { useMaybeRoomContext } from "@livekit/components-react";

export default function TranscriptionComponent() {
  const room = useMaybeRoomContext();
  const [transcriptions, setTranscriptions] = useState<{
    [id: string]: TranscriptionSegment;
  }>({});

  useEffect(() => {
    if (!room) {
      return;
    }

    const updateTranscriptions = (
      segments: TranscriptionSegment[],
      participant?: Participant,
      publication?: TrackPublication,
    ) => {
      setTranscriptions((prev) => {
        const newTranscriptions = { ...prev };
        for (const segment of segments) {
          newTranscriptions[segment.id] = segment;
        }
        return newTranscriptions;
      });
    };

    room.on(RoomEvent.TranscriptionReceived, updateTranscriptions);
    return () => {
      room.off(RoomEvent.TranscriptionReceived, updateTranscriptions);
    };
  }, [room]);

  return (
    // Render segments sorted by when they were first received
    <ul>
      {Object.values(transcriptions)
        .sort((a, b) => a.firstReceivedTime - b.firstReceivedTime)
        .map((segment) => (
          <li key={segment.id}>{segment.text}</li>
        ))}
    </ul>
  );
}
Key Changes:
• We added proper state management with useState to keep track of transcription segments.
• The event listener now updates the state whenever new transcriptions come in.
• Transcriptions are nicely sorted and displayed in a list by the time they were received.
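As a minimal usage sketch, assuming you mount it inside a connected LiveKitRoom (the serverUrl and token values are placeholders, and the import path for the component is hypothetical):
import { LiveKitRoom } from '@livekit/components-react';
import TranscriptionComponent from './TranscriptionComponent'; // hypothetical file path

export function App() {
  return (
    // Placeholder credentials: supply your own server URL and access token.
    <LiveKitRoom serverUrl="wss://your-project.livekit.cloud" token="<your-access-token>" connect={true}>
      <TranscriptionComponent />
    </LiveKitRoom>
  );
}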
Just make sure your room is set up and connected before using this component. If you hit any snags or want to chat more about it, I’m here to help!
lemon-elephant-62047
08/15/2025, 5:58 PM
tall-airline-16756
08/15/2025, 5:58 PM
import { useEffect, useState } from "react";
import {
  TranscriptionSegment,
  Participant,
  TrackPublication,
  RoomEvent,
} from "livekit-client";
import { useMaybeRoomContext } from "@livekit/components-react";

export default function LiveCaptions() {
  const room = useMaybeRoomContext();
  const [transcriptions, setTranscriptions] = useState<{
    [id: string]: TranscriptionSegment;
  }>({});

  useEffect(() => {
    if (!room) {
      return;
    }

    const updateTranscriptions = (
      segments: TranscriptionSegment[],
      participant?: Participant,
      publication?: TrackPublication
    ) => {
      setTranscriptions((prev) => {
        const newTranscriptions = { ...prev };
        for (const segment of segments) {
          newTranscriptions[segment.id] = segment;
        }
        return newTranscriptions;
      });
    };

    room.on(RoomEvent.TranscriptionReceived, updateTranscriptions);
    return () => {
      room.off(RoomEvent.TranscriptionReceived, updateTranscriptions);
    };
  }, [room]);

  return (
    // Render captions sorted by when they were first received
    <div className="live-captions">
      {Object.values(transcriptions)
        .sort((a, b) => a.firstReceivedTime - b.firstReceivedTime)
        .map((segment) => (
          <p key={segment.id}>{segment.text}</p>
        ))}
    </div>
  );
}
How it works:
• It listens for RoomEvent.TranscriptionReceived to grab transcriptions as they happen.
• Keeps those transcription segments in state and updates them on the fly.
• Shows the transcriptions in a neat, time-sorted list.
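One small extension, if it helps: each TranscriptionSegment carries a final flag, so you can style interim captions differently from finalized ones. A minimal sketch:
import type { TranscriptionSegment } from "livekit-client";

// Dim interim captions until the segment is marked final.
function CaptionLine({ segment }: { segment: TranscriptionSegment }) {
  return <p style={{ opacity: segment.final ? 1 : 0.6 }}>{segment.text}</p>;
}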
Just make sure your LiveKit server is set up with transcription agents to get this running smoothly. If you hit any bumps or want to chat more about it, I’m here to help!
lemon-elephant-62047
08/15/2025, 5:59 PM
tall-airline-16756
08/15/2025, 6:00 PM
lemon-elephant-62047
08/15/2025, 6:01 PM
tall-airline-16756
08/15/2025, 6:01 PM
import { type JobContext, WorkerOptions, cli, defineAgent, stt } from '@livekit/agents';
import { STT } from '@livekit/agents-plugin-deepgram';
import type { Track } from '@livekit/rtc-node';
import { AudioStream, RoomEvent, TrackKind } from '@livekit/rtc-node';
import { fileURLToPath } from 'node:url';

export default defineAgent({
  entry: async (ctx: JobContext) => {
    await ctx.connect();
    console.log('starting STT example agent');

    const transcribeTrack = async (track: Track) => {
      const audioStream = new AudioStream(track);
      const sttStream = new STT({ sampleRate: 48000 }).stream();

      const sendTask = async () => {
        for await (const event of audioStream) {
          sttStream.pushFrame(event);
        }
      };

      const recvTask = async () => {
        for await (const event of sttStream) {
          if (event.type === stt.SpeechEventType.FINAL_TRANSCRIPT) {
            console.log(event.alternatives![0].text);
          }
        }
      };

      await Promise.all([sendTask(), recvTask()]);
    };

    ctx.room.on(RoomEvent.TrackSubscribed, async (track: Track) => {
      if (track.kind === TrackKind.KIND_AUDIO) {
        transcribeTrack(track);
      }
    });
  },
});

cli.runApp(new WorkerOptions({
  agent: fileURLToPath(import.meta.url)
}));
How it works:
• Connects to a LiveKit room and listens for audio tracks.
• Uses Deepgram STT to transcribe the audio in real-time.
• Prints the transcription results right to your console.
Setup Instructions:
1. Install the necessary packages: @livekit/agents, @livekit/agents-plugin-deepgram, and @livekit/rtc-node (install command sketched after this list).
2. Make sure your Deepgram API credentials are set up correctly.
3. Create a LiveKit room and start the agent.
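For step 1, a single install command covers all three packages (npm resolves the versions):
npm install @livekit/agents @livekit/agents-plugin-deepgram @livekit/rtc-node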
If you have any questions or want a hand getting this up and running, just let me know—I’m here to help!
Sources: agents/src/pipeline/pipeline_agent.ts
lemon-elephant-62047
08/16/2025, 5:50 AM
tall-airline-16756
08/16/2025, 5:51 AM
Here’s a sample package.json file to get you started with installing the necessary packages:
{
  "name": "my-livekit-agent",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "build": "tsc",
    "start": "node dist/index.js start",
    "dev": "tsx src/index.ts dev"
  },
  "dependencies": {
    "@livekit/agents": "^0.7.9",
    "@livekit/agents-plugin-deepgram": "^0.5.6",
    "@livekit/rtc-node": "^0.13.18"
  },
  "devDependencies": {
    "@types/node": "^22.5.5",
    "tsx": "^4.19.2",
    "typescript": "^5.0.0"
  }
}
Installation Instructions:
1. Pop the above content into a package.json file.
2. Run npm install to get all the dependencies set up.
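Once installed, the scripts above map to these commands (this assumes your agent's entry point is src/index.ts, as the scripts expect):
npm run dev     # runs src/index.ts in development mode via tsx
npm run build   # compiles TypeScript to dist/
npm start       # runs the compiled agent in production mode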
If you hit any snags or want to chat more about this, just let me know—I’m here to help!
Sources: pnpm-lock.yaml | packages/basics/package.json
lemon-elephant-62047
08/16/2025, 5:52 AM
> server@1.0.0 start:agent
> node agents/transcriptionAgent.js

Usage: agents [options] [command]

LiveKit Agents CLI

Options:
  -V, --version          output the version number
  --log-level <level>    Set the logging level (choices: "trace", "debug", "info", "warn", "error", "fatal", default: "info", env: LOG_LEVEL)
  --url <string>         LiveKit server or Cloud project websocket URL (env: LIVEKIT_URL)
  --api-key <string>     LiveKit server or Cloud project's API key (env: LIVEKIT_API_KEY)
  --api-secret <string>  LiveKit server or Cloud project's API secret (env: LIVEKIT_API_SECRET)
  -h, --help             display help for command

Commands:
  start                  Start the worker in production mode
  dev [options]          Start the worker in development mode
  connect [options]      Connect to a specific room

got this when tried to start the agent
tall-airline-16756
08/16/2025, 5:52 AM
It looks like the agent is missing your LiveKit connection settings. You can pass them as environment variables when starting it:
LIVEKIT_URL=<wss://your-livekit-server-url> \
LIVEKIT_API_KEY=your-api-key \
LIVEKIT_API_SECRET=your-api-secret \
npm run start:agent
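Also worth noting: the usage output above shows the CLI expects a command (start, dev, or connect), so make sure the start:agent script passes one, e.g.:
node agents/transcriptionAgent.js start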
Another option is to put these variables in a .env file and load them with a package like dotenv, as sketched below.
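For instance, with the variables in a .env file like the one sketched earlier, loading them is a single import (dotenv must be installed separately):
// Load .env into process.env before the LiveKit CLI reads its options.
import 'dotenv/config';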
If you want, I’m happy to help you set this up—just let me know!
lemon-elephant-62047
08/19/2025, 1:29 PM
tall-airline-16756
08/19/2025, 1:31 PM
Here are the package versions referenced in the sources below:
• LiveKit Client SDK (livekit-client): 2.15.5
• LiveKit Server SDK (livekit-server-sdk): 2.13.2
• LiveKit Agents Framework (@livekit/agents): 0.7.0
If you want to double-check you’re using the most recent versions, you can run these commands on npm:
npm view livekit-client version
npm view livekit-server-sdk version
npm view @livekit/agents version
Feel free to reach out if you need any more help with these packages—I’m here for you!
Sources: package-lock.json | package-lock.json | frontend/package-lock.json | pnpm-lock.yaml | example/yarn.lock
lemon-elephant-62047
08/19/2025, 1:32 PM