bumpy-judge-84325
08/14/2025, 1:36 PM
tall-airline-16756
08/14/2025, 1:36 PM
tall-airline-16756
08/14/2025, 1:37 PM
• stt_node Method: In your custom Agent class, override stt_node to collect incoming audio frames in a buffer as they arrive, passing each frame through to speech-to-text unchanged.
• Implement Audio Saving Logic: Add a method that takes those buffered audio frames and saves them to a file. For example, you could save the audio as a WAV file using Python’s wave module.
• Trigger Saving on User Turn Completion: Use the on_user_turn_completed hook to save the audio right after the user finishes speaking. Don’t forget to clear the buffer afterward so it’s ready for the next turn!
This way, you can capture and save audio clips while handling user input. If you want me to walk you through some code examples or need a hand with anything else, just give me a shout!
bumpy-judge-84325
08/14/2025, 1:37 PM
tall-airline-16756
08/14/2025, 1:38 PM
import wave
from datetime import datetime
from typing import AsyncIterable, Optional

from livekit import rtc
from livekit.agents import Agent, ModelSettings, stt
from livekit.agents.llm import ChatContext, ChatMessage


class AudioSavingAgent(Agent):
    def __init__(self):
        super().__init__(instructions="Your agent instructions here")
        self.audio_buffer = []
        self.buffer_size = 0

    async def stt_node(self, audio: AsyncIterable[rtc.AudioFrame], model_settings: ModelSettings) -> Optional[AsyncIterable[stt.SpeechEvent]]:
        async def audio_with_buffering():
            async for frame in audio:
                self.audio_buffer.append(frame)
                self.buffer_size += len(frame.data)
                # Keep the buffer size manageable: drop the oldest frame and keep
                # the size counter in sync. This keeps only the most recent audio,
                # so raise the limit if you need to capture longer turns.
                if self.buffer_size > 100000:
                    dropped = self.audio_buffer.pop(0)
                    self.buffer_size -= len(dropped.data)
                yield frame

        events = super().stt_node(audio_with_buffering(), model_settings)
        if events is not None:
            async for event in events:
                yield event

    async def on_user_turn_completed(self, turn_ctx: ChatContext, new_message: ChatMessage) -> None:
        # Save audio after the user speaks
        await self.save_audio_to_disk()
        # Clear the buffer and reset its size counter
        self.audio_buffer.clear()
        self.buffer_size = 0
        # Continue with default processing
        await super().on_user_turn_completed(turn_ctx, new_message)

    async def save_audio_to_disk(self):
        if not self.audio_buffer:
            return
        # Generate a timestamped filename
        filename = f"audio_recording_{datetime.now().strftime('%Y%m%d_%H%M%S')}.wav"
        # Take the audio format from the frames instead of assuming 48 kHz mono
        first_frame = self.audio_buffer[0]
        # Create a WAV file
        with wave.open(filename, 'wb') as wav_file:
            wav_file.setnchannels(first_frame.num_channels)
            wav_file.setsampwidth(2)  # 2 bytes per sample (16-bit PCM)
            wav_file.setframerate(first_frame.sample_rate)
            for frame in self.audio_buffer:
                wav_file.writeframes(frame.data.tobytes())
        print(f"Audio saved to {filename}")
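If you'd like to try the file-writing step without a running agent, here's a minimal sketch that uses only the standard library. The names write_wav and write_wav_async are hypothetical helpers, not part of LiveKit, and the sketch assumes the audio is 16-bit PCM passed in as raw bytes. The async wrapper uses asyncio.to_thread so the blocking file write doesn't stall the event loop.

```python
import asyncio
import wave
from typing import Iterable


def write_wav(path: str, chunks: Iterable[bytes], sample_rate: int, num_channels: int = 1) -> int:
    """Write 16-bit PCM chunks to a WAV file; return the number of frames written."""
    bytes_per_frame = 2 * num_channels  # assumption: 16-bit samples
    total_bytes = 0
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(num_channels)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        for chunk in chunks:
            wav_file.writeframes(chunk)
            total_bytes += len(chunk)
    return total_bytes // bytes_per_frame


async def write_wav_async(path: str, chunks: Iterable[bytes], sample_rate: int, num_channels: int = 1) -> int:
    """Run the blocking write in a worker thread (hypothetical helper)."""
    return await asyncio.to_thread(write_wav, path, chunks, sample_rate, num_channels)
```

Inside the agent you could call it along the lines of await write_wav_async(filename, [f.data.tobytes() for f in self.audio_buffer], first_frame.sample_rate, first_frame.num_channels), assuming the frames all share one format.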
### Key Features:
• Audio Buffering: Collects audio frames during user turns.
• WAV File Saving: Saves audio in WAV format with a timestamped filename.
• Buffer Management: Drops the oldest frames once the buffer passes a size limit, so memory use stays bounded.
• Integration: Hooks into the on_user_turn_completed method to save audio after each user turn.
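To make the Buffer Management point concrete, here's a minimal standalone sketch of a size-capped buffer. BoundedAudioBuffer is a hypothetical helper (not a LiveKit class) and it assumes the audio arrives as raw byte chunks. It evicts the oldest chunks once the cap is exceeded and resets its byte counter whenever it's drained.

```python
from collections import deque
from typing import Deque, List


class BoundedAudioBuffer:
    """Keeps the most recent audio chunks, capped by total byte count."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.total_bytes = 0
        self._chunks: Deque[bytes] = deque()

    def append(self, chunk: bytes) -> None:
        self._chunks.append(chunk)
        self.total_bytes += len(chunk)
        # Evict the oldest chunks until we're under the cap, always keeping the newest one
        while self.total_bytes > self.max_bytes and len(self._chunks) > 1:
            self.total_bytes -= len(self._chunks.popleft())

    def drain(self) -> List[bytes]:
        """Return the buffered chunks and reset the buffer for the next turn."""
        chunks = list(self._chunks)
        self._chunks.clear()
        self.total_bytes = 0
        return chunks
```

One trade-off to keep in mind: a cap means long turns lose their beginning, so pick max_bytes based on the longest utterance you want to keep.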
Feel free to tweak the filename format, buffer size, or audio settings to fit your needs. And if you have any questions or want a hand with anything else, just let me know—I’m here to help!