curved-forest-97958
02/26/2025, 2:51 AM

gray-monkey-83525
02/26/2025, 2:51 AM
2. Use the `stream` method of the TTS instance to push your text input and receive audio data. This method returns a `SynthesizeStream` that you can iterate over to get the audio frames.
3. Publish Audio to Participant: Use the `rtc.LocalAudioTrack` to create an audio track and publish it to the room. You can capture frames from the TTS stream and send them to the participant (a publishing sketch follows the example below).
Here's a basic example using OpenAI TTS:
```typescript
import { TTS } from '@livekit/agents-plugin-openai';
import { LocalAudioTrack, AudioSource } from '@livekit/rtc-node';

// Note: 'gpt-4o-mini' is a chat model; OpenAI's TTS endpoint expects a
// speech model such as 'tts-1'.
const tts = new TTS({ apiKey: 'YOUR_OPENAI_API_KEY', model: 'tts-1' });
const ttsStream = tts.stream();

async function sendAudio(text) {
  const audioSource = new AudioSource(44100, 1);
  const track = LocalAudioTrack.createAudioTrack('agent-audio', audioSource);

  // Push text to the TTS stream and close the input
  ttsStream.push(text);
  ttsStream.endInput();

  // Capture the synthesized audio frames
  for await (const audioFrame of ttsStream) {
    await audioSource.captureFrame(audioFrame);
  }
}

// Example usage
sendAudio("Hello, participant!");
```
Would you like more details on any specific part of this process?
Sources:
- LiveKit TTS Plugin Documentation

curved-forest-97958
02/26/2025, 3:06 AM

gray-monkey-83525
02/26/2025, 3:06 AM
The `audioFrame` object you're trying to pass to `captureFrame` is not in the expected `AudioFrame` format. The `SynthesizedAudio` type from the TTS stream needs to be converted to an `AudioFrame` before it can be used with `captureFrame`.
Here's how you can convert `SynthesizedAudio` to `AudioFrame`:
1. Extract Audio Data: Extract the necessary audio data from the `SynthesizedAudio` object.
2. Create AudioFrame: Construct an `AudioFrame` using the extracted data.
Here's an example of how you might do this:
```typescript
import { AudioFrame, AudioSource, LocalAudioTrack } from '@livekit/rtc-node';

async function sendAudio(text) {
  const audioSource = new AudioSource(44100, 1);
  const track = LocalAudioTrack.createAudioTrack('agent-audio', audioSource);

  ttsStream.push(text);
  ttsStream.endInput();

  for await (const synthesizedAudio of ttsStream) {
    // rtc-node's AudioFrame constructor takes positional arguments,
    // not an options object: (data, sampleRate, channels, samplesPerChannel)
    const audioFrame = new AudioFrame(
      synthesizedAudio.frame.data,              // raw PCM data (Int16Array)
      44100,                                    // must match the AudioSource sample rate
      1,                                        // channel count, matching the AudioSource
      synthesizedAudio.frame.samplesPerChannel, // samples per channel from the source frame
    );
    await audioSource.captureFrame(audioFrame);
  }
}
```
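Depending on your installed `@livekit/agents` version, `synthesizedAudio.frame` may already be an rtc-node `AudioFrame`, in which case the conversion step can be skipped. A minimal sketch, assuming the frame type already matches what `captureFrame` expects (verify against your plugin version's types):

```typescript
// Assumption: synthesizedAudio.frame is already an rtc-node AudioFrame,
// so it can be passed straight to the AudioSource without re-wrapping.
for await (const synthesizedAudio of ttsStream) {
  await audioSource.captureFrame(synthesizedAudio.frame);
}
```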
Would you like more details on any specific part of this process?
Sources:
- LiveKit SynthesizedAudio Documentation