# ask-ai
Please confirm whether the `audio` iterable passed into `stt_node(self, audio, model_settings)` is already post-BVCTelephony and pre-AudioProcessingModule. If not, show the canonical way to access frames at that point (post-BVC/pre-APM) in Python agents.
Thanks for reaching out, Muthana. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
The `audio` iterable passed to `stt_node(self, audio, model_settings)` is not automatically post-processed by `BVCTelephony` or the `AudioProcessingModule`. The simplest way to work with frames at that point is to consume the `audio` parameter directly inside the `stt_node` method. Here's a quick example:
```python
async def stt_node(
    self, audio: AsyncIterable[rtc.AudioFrame], model_settings: ModelSettings
) -> AsyncIterable[stt.SpeechEvent]:
    async def tapped_audio():
        async for frame in audio:
            # each frame is accessible here, before STT processing
            yield frame

    # forward the frames on, so the node still yields stt.SpeechEvent objects
    return await super().stt_node(tapped_audio(), model_settings)
```
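The wrap-and-forward pattern above is plain asyncio, so you can try it outside LiveKit. Here is a minimal standalone sketch; the `tap` helper and byte-string frames are illustrative stand-ins, not LiveKit API:

```python
import asyncio
from collections.abc import AsyncIterable


async def tap(frames: AsyncIterable[bytes], seen: list) -> AsyncIterable[bytes]:
    """Record each frame as it passes through, then forward it unchanged."""
    async for frame in frames:
        seen.append(frame)
        yield frame


async def main() -> tuple[list, list]:
    async def source():
        # stand-in for the `audio` iterable the agent receives
        for chunk in (b"frame-1", b"frame-2"):
            yield chunk

    seen: list = []
    # consume the tapped stream the way the STT stage would consume `audio`
    forwarded = [f async for f in tap(source(), seen)]
    return forwarded, seen


forwarded, seen = asyncio.run(main())
print(forwarded)  # both lists contain b"frame-1" and b"frame-2"
```

Because the wrapper yields every frame it observes, the downstream consumer sees exactly the same stream.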
If you want to add your own preprocessing, you can simply override the stt_node method in your agent class. Check out this example:
```python
from collections.abc import AsyncIterable

from livekit import rtc
from livekit.agents import Agent, ModelSettings, stt


class CustomAgent(Agent):
    async def stt_node(
        self, audio: AsyncIterable[rtc.AudioFrame], model_settings: ModelSettings
    ) -> AsyncIterable[stt.SpeechEvent]:
        async def processed_audio():
            async for frame in audio:
                # apply custom processing to each audio frame before STT
                yield self._process_frame(frame)

        return await super().stt_node(processed_audio(), model_settings)

    def _process_frame(self, frame: rtc.AudioFrame) -> rtc.AudioFrame:
        # put your custom frame processing logic here;
        # return either the original or a modified frame
        return frame
```
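As one concrete (hypothetical) illustration of what `_process_frame` might do, the sketch below applies a fixed gain to 16-bit PCM samples. It uses only the standard library and raw bytes, so it does not depend on the LiveKit API; inside `_process_frame` you would read the sample bytes from the frame, transform them, and build a new `rtc.AudioFrame` from the result:

```python
import array


def apply_gain(pcm: bytes, gain: float) -> bytes:
    """Scale 16-bit (native-endian) PCM samples by `gain`, clipping to int16 range."""
    samples = array.array("h")  # signed 16-bit samples
    samples.frombytes(pcm)
    for i, s in enumerate(samples):
        samples[i] = max(-32768, min(32767, int(s * gain)))
    return samples.tobytes()


quiet = array.array("h", [1000, -1000, 30000]).tobytes()
louder = apply_gain(quiet, 2.0)
print(array.array("h", louder).tolist())  # [2000, -2000, 32767] (last sample clipped)
```

Keep in mind that any such processing runs in the hot audio path, so it should stay lightweight to avoid adding latency before transcription.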
This way, you get to add any preprocessing steps you need while still making the most of the STT capabilities LiveKit offers. If you have any questions or want to chat more about it, I'm here to help!

Sources:
- Voice pipeline nodes | LiveKit Docs
- examples/voice_agents/speedup_output_audio.py