calm-optician-60209
07/11/2025, 2:08 PM
tall-airline-16756
07/11/2025, 2:08 PM
tall-airline-16756
07/11/2025, 2:09 PM
The diarize parameter in Speech-to-Text (STT) enables speaker diarization, which is the process of identifying and distinguishing between different speakers in an audio recording. When you set diarize to true (or "speaker" in some integrations), the STT system will:
• Identify different speakers in the audio
• Label each segment of the transcript with a speaker identifier (such as "S1", "S2")
• Attribute each part of the transcribed text to the detected speaker
This feature is useful for scenarios like meetings, interviews, or conversations involving more than one person, as it allows you to track who spoke each part of the transcript.
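To make the idea of tracking who spoke concrete, here is a minimal sketch of how diarized output could be grouped into speaker turns. The Segment and Turn shapes and the groupBySpeaker helper are hypothetical names made up for illustration; they are not types from LiveKit, Deepgram, or Speechmatics, and the sample text is invented.

```typescript
// Hypothetical shape of one diarized transcript segment (not a vendor type).
interface Segment {
  speaker: string; // speaker identifier such as "S1" or "S2"
  text: string;
}

// A turn is a run of consecutive segments from the same speaker.
interface Turn {
  speaker: string;
  text: string;
}

// Merge consecutive segments that share a speaker label into one turn.
function groupBySpeaker(segments: Segment[]): Turn[] {
  const turns: Turn[] = [];
  for (const seg of segments) {
    const last = turns[turns.length - 1];
    if (last !== undefined && last.speaker === seg.speaker) {
      last.text += " " + seg.text;
    } else {
      // Copy the fields so the input array is never mutated.
      turns.push({ speaker: seg.speaker, text: seg.text });
    }
  }
  return turns;
}

// Invented sample data for illustration only.
const sampleSegments: Segment[] = [
  { speaker: "S1", text: "Shall we start?" },
  { speaker: "S1", text: "I have the agenda." },
  { speaker: "S2", text: "Go ahead." },
];

for (const turn of groupBySpeaker(sampleSegments)) {
  console.log(`${turn.speaker}: ${turn.text}`);
}
```

Grouping by consecutive label rather than by speaker overall keeps the conversation order intact, which is usually what a meeting or interview transcript needs.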
In LiveKit's Deepgram plugin, diarize is a boolean option (diarize: true), while for Speechmatics, it's set via diarization: "speaker" along with options like max_speakers. When enabled, you can also access the speaker identifier alongside each transcription event, allowing for advanced multi-speaker handling.
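As a rough illustration of the difference between the two option styles, the sketch below models them as plain objects. Only the option names diarize, diarization, and max_speakers come from the answer above; the provider tag, the type names, the diarizationEnabled helper, and the max_speakers value of 4 are assumptions made for this example and are not part of either plugin's API. Check the plugin docs for the real constructor signatures.

```typescript
// Hypothetical option shapes; the "provider" tag exists only in this sketch.
type DeepgramStyleOptions = {
  provider: "deepgram";
  diarize?: boolean; // boolean switch, as described above
};

type SpeechmaticsStyleOptions = {
  provider: "speechmatics";
  diarization?: string; // set to "speaker" to enable speaker diarization
  max_speakers?: number; // optional cap on detected speakers
};

type SttStyleOptions = DeepgramStyleOptions | SpeechmaticsStyleOptions;

// Report whether speaker diarization is switched on for either style.
function diarizationEnabled(opts: SttStyleOptions): boolean {
  switch (opts.provider) {
    case "deepgram":
      return opts.diarize === true;
    case "speechmatics":
      return opts.diarization === "speaker";
  }
}

const deepgramStyle: SttStyleOptions = { provider: "deepgram", diarize: true };
const speechmaticsStyle: SttStyleOptions = {
  provider: "speechmatics",
  diarization: "speaker",
  max_speakers: 4, // arbitrary illustrative value
};

console.log(diarizationEnabled(deepgramStyle));
console.log(diarizationEnabled(speechmaticsStyle));
```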
Sources: plugins/deepgram/src/stt.ts | Speechmatics integration guide (LiveKit Docs) | STTOptions (LiveKit Agents) | Speechmatics STT integration guide (LiveKit Docs)