few-soccer-7112
08/16/2025, 12:30 AM
What is the guaranteed frame format of rtc.AudioFrame in Python Agents? (e.g., interleaved int16, samples_per_channel, sample_rate, num_channels). Please confirm the typical sample rate when SIP ingress + BVC are enabled (16 kHz vs 48 kHz), and whether telephony audio is downmixed to mono before it reaches the agent. Provide a short struct/fields table.
tall-airline-16756
08/16/2025, 12:30 AM
tall-airline-16756
08/16/2025, 12:31 AM
Here's a rundown of rtc.AudioFrame in Python Agents and its guaranteed frame format:
• Data Format: 16-bit signed integers (int16), interleaved by channel. Pretty standard stuff.
• Sample Rate: Usually 48 kHz when SIP ingress and BVC are enabled; the frame's own sample_rate field is the value to rely on.
• Channels: Telephony audio typically arrives downmixed to mono (1 channel) before it reaches the agent.
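To make "interleaved by channel" and "downmixed to mono" concrete, here is a minimal sketch using only the Python standard library. The helper names deinterleave and downmix_to_mono are hypothetical and are not part of the LiveKit SDK, and averaging the channels is just one simple downmix strategy assumed for illustration, not necessarily what the telephony path does.

```python
from array import array


def deinterleave(samples: array, num_channels: int) -> list:
    """Split interleaved samples [L0, R0, L1, R1, ...] into one array per channel."""
    return [samples[ch::num_channels] for ch in range(num_channels)]


def downmix_to_mono(samples: array, num_channels: int) -> array:
    """Average the channels of each sample frame into a single int16 mono sample."""
    mono = array("h")  # "h" is signed 16-bit, matching int16
    for i in range(0, len(samples), num_channels):
        frame = samples[i:i + num_channels]
        mono.append(sum(frame) // num_channels)
    return mono


# Three stereo sample frames, interleaved as L, R, L, R, L, R.
stereo = array("h", [100, 300, -200, -400, 0, 50])
left, right = deinterleave(stereo, 2)
mono = downmix_to_mono(stereo, 2)
```

With a mono frame (num_channels of 1) both helpers leave the samples unchanged, which is the typical telephony case described above.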
Struct/Fields:
• sample_rate: int - The sample rate in Hz (like 48000).
• num_channels: int - Number of audio channels (usually 1).
• samples_per_channel: int - Number of samples per channel in the frame (for example, 480, which is 10 ms at 48 kHz).
• data: bytes-like - The interleaved int16 audio data itself (the Python SDK exposes it as a memoryview of int16 samples).
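The fields relate to each other in a way that is easy to sanity-check. The sketch below uses a hypothetical stand-in class, FrameFields, that only mirrors the four fields listed above; it is not the SDK's rtc.AudioFrame, and it assumes data is held as raw bytes. For a typed buffer such as a memoryview, compare nbytes rather than len().

```python
from dataclasses import dataclass

BYTES_PER_SAMPLE = 2  # int16 is 2 bytes


@dataclass
class FrameFields:
    """Hypothetical stand-in mirroring the four listed fields; not the SDK class."""
    data: bytes
    sample_rate: int
    num_channels: int
    samples_per_channel: int


def expected_byte_length(frame: FrameFields) -> int:
    """Bytes needed for interleaved int16 audio with these dimensions."""
    return frame.samples_per_channel * frame.num_channels * BYTES_PER_SAMPLE


def duration_ms(frame: FrameFields) -> float:
    """Frame duration in milliseconds, derived from sample count and rate."""
    return 1000.0 * frame.samples_per_channel / frame.sample_rate


# A mono frame of 480 samples at 48 kHz: 960 bytes of silence.
frame = FrameFields(data=bytes(960), sample_rate=48000,
                    num_channels=1, samples_per_channel=480)
assert len(frame.data) == expected_byte_length(frame)
print(duration_ms(frame))  # 10.0
```

Reading sample_rate and num_channels from each frame, instead of hard-coding 48 kHz mono, keeps downstream code correct if the pipeline is configured differently.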
If anything’s unclear or you want to dive deeper, just let me know—I’m here to help!