breezy-fish-3124
08/20/2025, 12:59 AM
session = AgentSession(
    llm=realtime.RealtimeModel(
        voice="alloy",
        turn_detection=TurnDetection(
            type="semantic_vad",
            eagerness="low",
            interrupt_response=False,
        ),
        modalities=["text"],
    ),
    tts=openai.TTS(voice="alloy"),
    stt=speechmatics.STT(
        transcription_config=speechmatics.types.TranscriptionConfig(
            language="en",
            operating_point="enhanced",
            enable_partials=True,
            max_delay=0.7,
            diarization="speaker",
            speaker_diarization_config={"max_speakers": 2},
        )
    ),
)
And I'm trying to read that text inside the on_user_speech_transcribed method that I register in my agent code with:
async def on_enter(self) -> None:
    logger.info("[MitraCouplesAgent] Entering room")
    # 🎯 Hybrid transcript handling approach (Scenario 4)
    # Primary: Handle user speech with participant info when available
    self.session.on("user_input_transcribed", self.on_user_speech_transcribed)
But I'm not seeing the diarization happen or any speaker tags
def on_user_speech_transcribed(self, event: UserInputTranscribedEvent):
    """🎤 Primary handler for user speech with participant identification"""
    try:
        if not event.is_final:
            logger.debug("🔄 Received interim transcript, skipping...")
            return
        participant = self.participants.determine_participant(event)
        logger.info(f"🎤 User speech transcribed from participant: {participant.identity}")
        logger.info(f"🔑 Auth token available: {participant.ctx is not None}")
        logger.info(f"📝 Transcript: {event.transcript}")
        logger.info(f"user_input_transcribed: \"[{event.speaker_id}]: {event.transcript}\"")
The log for user_input_transcribed returns "None" for speaker_id. Why?
tall-airline-16756
08/20/2025, 1:00 AM
If speaker_id returns None in the user_input_transcribed event, it’s probably because the diarization settings in the Speechmatics STT setup are not being applied, or are being overridden.
To get speaker diarization working smoothly, try using the modern constructor parameters for the STT configuration like this:
stt = speechmatics.STT(
    enable_diarization=True,  # This part is super important
    diarization_sensitivity=0.5,
    speaker_active_format="<{speaker_id}>{text}",
    # ... other settings
)
Also, a quick heads-up: avoid mixing the older transcription_config parameter with the newer enable_diarization flag, since that can quietly disable diarization without any warnings. If you want, I’m here to help you troubleshoot further; just let me know!
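If speaker_id still comes through as None, one fallback is to read the speaker label out of the transcript text itself. The sketch below assumes the transcript arrives already formatted by the speaker_active_format template shown above ("<{speaker_id}>{text}"). The helper name split_speaker_tag and the sample labels such as "S1" are hypothetical and only for illustration; they are not part of the LiveKit or Speechmatics API.

```python
import re
from typing import Optional, Tuple

# Matches a leading "<label>" tag, mirroring the "<{speaker_id}>{text}" template.
# Assumption: the label itself never contains "<" or ">".
_SPEAKER_TAG = re.compile(r"^<(?P<speaker_id>[^<>]+)>(?P<text>.*)$", re.DOTALL)


def split_speaker_tag(transcript: str) -> Tuple[Optional[str], str]:
    """Split a tagged transcript into (speaker_id, text).

    Returns (None, transcript) unchanged when no leading tag is present,
    so untagged transcripts pass through safely.
    """
    match = _SPEAKER_TAG.match(transcript)
    if match is None:
        return None, transcript
    return match.group("speaker_id"), match.group("text").strip()
```

Inside the handler, this could be called as speaker, text = split_speaker_tag(event.transcript), preferring event.speaker_id whenever it is not None.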
Sources: livekit-agents/livekit/agents/stt/stt.py | Speechmatics STT integration guide | LiveKit Docs | Events and error handling | LiveKit Docs