fierce-analyst-82328
09/14/2025, 7:51 AM

tall-airline-16756
09/14/2025, 7:51 AM

It sounds like your `llm_node` isn’t passing text to the `transcription_node` in LiveKit. There could be a few reasons for this, so let’s walk through some important points together:
• First, please make sure the agent is set up correctly with transcription enabled.
• Also, double-check that the `transcription_node` hasn’t been customized in a way that might filter or block the content.
• It’s important to confirm that `RoomOutputOptions.transcription_enabled` is set to `True` (a minimal setup sketch follows this list).
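For reference, here’s a rough sketch of what that setup can look like in a Python entrypoint. This assumes the standard `AgentSession` / `RoomOutputOptions` API from `livekit-agents`; your session will also need its usual STT/LLM/TTS components, and import paths may vary slightly by version:

```python
from livekit import agents
from livekit.agents import Agent, AgentSession, RoomOutputOptions


async def entrypoint(ctx: agents.JobContext) -> None:
    await ctx.connect()

    session = AgentSession(
        # stt=..., llm=..., tts=...  # your usual pipeline components go here
    )

    await session.start(
        agent=Agent(instructions="You are a helpful assistant."),
        room=ctx.room,
        # Set transcription_enabled explicitly so this option is ruled out
        # while debugging missing transcripts.
        room_output_options=RoomOutputOptions(transcription_enabled=True),
    )


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
```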
Just to clarify the flow: the `llm_node` generates text output, which should be captured into a text channel and then forwarded to the `transcription_node`. By default, the `transcription_node` simply yields the text it receives. If you’ve made any customizations to it, please ensure it still yields the processed text chunks properly to keep the flow going smoothly.
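Here’s what a “safe” override can look like. This is a sketch assuming the documented Python node signature (an `AsyncIterable[str]` in, chunks yielded out); the key point is that every chunk is yielded onward:

```python
from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings


class MyAgent(Agent):
    async def transcription_node(
        self, text: AsyncIterable[str], model_settings: ModelSettings
    ) -> AsyncIterable[str]:
        async for delta in text:
            # Apply any per-chunk processing here, but always yield the
            # result. A filter that swallows chunks (or returns early)
            # will silently empty the transcript.
            yield delta
```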
If you’d like, feel free to share your specific setup or any customizations you’ve made, and I’d be more than happy to guide you through the next steps. You’re not alone in this!
Sources: Docs::Agents | livekit-agents/livekit/agents/voice/io.py | agents/src/voice/agent_activity.ts

fierce-analyst-82328
09/14/2025, 7:54 AM

tall-airline-16756
09/14/2025, 7:55 AM

It sounds like you already have `RoomOutputOptions(transcription_enabled=True)` set and have adjusted the `llm_node` for structured output. Let’s walk through some friendly steps to help make sure the structured output flows smoothly to the `transcription_node`:
1. First, double-check that your `transcription_node` is properly overridden to handle structured output. A helpful approach is using a function like `process_structured_output` to parse the JSON and pull out the key fields (like `response`); see the sketch after this list.
2. Next, make sure your LLM is actually returning output in the JSON format you expect. If it sends plain text or something different, it might get ignored. It’s a good idea to log the raw output from `llm_node` so you can see exactly what’s coming through.
3. Also, confirm that `transcription_enabled` is set to `True` and that you have a valid `TextOutput` sink connected to your session output.
4. Finally, adding some logging inside both `llm_node` and `transcription_node` can really help trace how data flows and verify that `transcription_node` is receiving input and producing a non-empty output stream.
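As a concrete starting point, here’s a simplified, non-streaming sketch that covers steps 1, 2, and 4 at once. It assumes your `llm_node` emits a single JSON object with a `response` field; your actual payload shape (and the `process_structured_output` helper from the structured-output example) may differ:

```python
import json
import logging
from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings

logger = logging.getLogger("structured-transcription")


class StructuredAgent(Agent):
    async def transcription_node(
        self, text: AsyncIterable[str], model_settings: ModelSettings
    ) -> AsyncIterable[str]:
        # Buffer the whole LLM output so it can be parsed as one JSON document.
        raw = "".join([chunk async for chunk in text])
        logger.info("raw llm_node output: %r", raw)  # step 2: see what arrives

        try:
            # Assumed payload shape: {"response": "...", ...}
            yield json.loads(raw)["response"]
        except (json.JSONDecodeError, KeyError):
            # Unexpected format: pass the text through so nothing is lost.
            yield raw
```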
If you’d like, I’m here to help you dig deeper! Feel free to share the specific structure of your LLM output and any customizations you’ve made in your `transcription_node`, and we can troubleshoot together.
Sources: livekit-agents/livekit/agents/voice/generation.py | agents/src/voice/agent_activity.ts

fierce-analyst-82328
09/14/2025, 7:57 AM

tall-airline-16756
09/14/2025, 7:57 AM

The `tts_node` and `transcription_node` in LiveKit handle inputs a bit differently because they serve different purposes:
• TTS Node: This one works with plain text streams (`AsyncIterable[str]`) to create speech. It’s focused on turning text into audio frames and doesn’t directly manage structured or timestamped data. If you want, you can customize it to filter structured output and extract just the parts that can be synthesized.
• Transcription Node: This node deals with the final transcription output, which might be plain strings or `TimedString` objects (`AsyncIterable[str | TimedString]`). It’s great for syncing text with speech playback, especially when TTS-aligned transcription is turned on. Plus, it lets you do some post-processing, like cleaning up or formatting the text. Both signatures are sketched below.
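To make the difference concrete, here’s a rough side-by-side sketch of both overrides. The signatures follow the LiveKit docs; the `TimedString` import path is my assumption based on the `voice/io.py` source referenced above:

```python
from typing import AsyncIterable

from livekit import rtc
from livekit.agents import Agent, ModelSettings
from livekit.agents.voice.io import TimedString  # assumed import path


class MyAgent(Agent):
    async def tts_node(
        self, text: AsyncIterable[str], model_settings: ModelSettings
    ) -> AsyncIterable[rtc.AudioFrame]:
        # Plain text in, audio frames out: delegate to the default pipeline.
        async for frame in Agent.default.tts_node(self, text, model_settings):
            yield frame

    async def transcription_node(
        self, text: AsyncIterable[str | TimedString], model_settings: ModelSettings
    ) -> AsyncIterable[str | TimedString]:
        # Final transcript out; chunks may carry timing data when
        # TTS-aligned transcription is enabled.
        async for chunk in text:
            yield chunk
```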
Both nodes can handle structured output with helpers like `process_structured_output`, but the `transcription_node` stands out because it supports timing data for synchronization. If you have any questions or want help customizing these nodes, just let me know!
Sources: Docs::Agents | Text and transcriptions | LiveKit Docs | livekit-agents/livekit/agents/voice/agent_activity.py

fierce-analyst-82328
09/14/2025, 7:59 AM
…`yield`?

tall-airline-16756
09/14/2025, 8:00 AM

The `llm_node` doesn’t send its output directly to both the `tts_node` and the `transcription_node` using `yield`. Instead, LiveKit uses a stream-splitting approach with `tee` to duplicate the output stream from the `llm_node`:
1. The `llm_node` generates an asynchronous text stream using `yield`.
2. This stream is then split into two separate streams using `tee`:
◦ One stream goes to the `tts_node` for speech synthesis.
◦ The other stream goes to the `transcription_node` to finalize and forward transcriptions.
This setup allows both nodes to work with the same content simultaneously without repeating LLM calls or getting in each other’s way. Also, if TTS-aligned transcription is turned on, the `transcription_node` might get timed transcription data from the TTS engine instead of the original text stream. A simplified illustration of the pattern follows.
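To illustrate the idea, here’s a minimal, self-contained version of splitting one async stream into two consumers. This is a simplified stand-in for LiveKit’s internal `tee`, not its actual implementation (see `agent_activity.py` for that):

```python
import asyncio
from typing import AsyncIterable, AsyncIterator

_SENTINEL = object()


def tee(source: AsyncIterable[str], n: int = 2) -> list[AsyncIterator[str]]:
    """Duplicate one async stream into n independent streams."""
    queues: list[asyncio.Queue] = [asyncio.Queue() for _ in range(n)]

    async def pump() -> None:
        # Push every chunk into every queue, then signal end-of-stream.
        async for item in source:
            for q in queues:
                q.put_nowait(item)
        for q in queues:
            q.put_nowait(_SENTINEL)

    asyncio.ensure_future(pump())

    async def reader(q: asyncio.Queue) -> AsyncIterator[str]:
        while (item := await q.get()) is not _SENTINEL:
            yield item

    return [reader(q) for q in queues]


async def main() -> None:
    async def llm_stream() -> AsyncIterator[str]:
        for chunk in ("Hello", ", ", "world"):
            yield chunk  # stands in for llm_node's yielded text deltas

    tts_in, transcription_in = tee(llm_stream())

    async def consume(name: str, stream: AsyncIterator[str]) -> None:
        async for chunk in stream:
            print(f"{name} received: {chunk!r}")

    # Both consumers see the same chunks from a single LLM pass.
    await asyncio.gather(
        consume("tts_node", tts_in),
        consume("transcription_node", transcription_in),
    )


asyncio.run(main())
```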
Please don’t hesitate to ask if you’d like me to walk you through any part of this flow or if there’s anything else I can help with!
Sources: livekit-agents/livekit/agents/voice/agent_activity.py | agents/src/voice/agent.ts | Docs::Agents | Pipeline nodes and hooks | LiveKit Docs

fierce-analyst-82328
09/14/2025, 8:01 AM

tall-airline-16756
09/14/2025, 8:02 AM

It sounds like your `tts_node` is receiving input, but the `transcription_node` isn’t. Let’s walk through some possible reasons and helpful steps to troubleshoot this together:
1. Stream Splitting Issue: Since the LLM output is split into two streams using `tee`, if the transcription stream gets closed or canceled, it won’t receive any data. It’s a good idea to double-check that both branches of the split are being properly consumed.
2. Custom `transcription_node` Logic: If you’ve customized the `transcription_node`, please make sure it’s yielding output as expected. Adding some logging can really help confirm that input is being received and processed correctly; see the sketch after this list.
3. TTS-Aligned Transcription: When `use_tts_aligned_transcript=True` is enabled, the transcription input might rely on TTS-aligned data. Please verify that your TTS engine supports this feature, or try disabling it temporarily to see if that makes a difference.
4. Transcription Disabled: It’s worth confirming that `RoomOutputOptions(transcription_enabled=True)` is set, and that your frontend is properly listening to the `lk.transcription` topic.
5. Interruption or Cancellation: If the agent’s speech gets interrupted, the transcription stream might be canceled. Checking your logs for any interruption events could provide some clues.
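A quick way to check points 1 and 2 is a logging-only override. It changes nothing about the stream; it just records what flows through (assuming the standard Python node signature):

```python
import logging
from typing import AsyncIterable

from livekit.agents import Agent, ModelSettings

logger = logging.getLogger("transcription-debug")


class DebugAgent(Agent):
    async def transcription_node(
        self, text: AsyncIterable[str], model_settings: ModelSettings
    ) -> AsyncIterable[str]:
        received_any = False
        async for chunk in text:
            received_any = True
            logger.info("transcription_node got: %r", chunk)
            yield chunk  # pass everything through unchanged
        if not received_any:
            # The stream closed without a single chunk: the branch was
            # likely canceled or never fed (point 1 above).
            logger.warning("transcription_node stream ended with no input")
```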
To help narrow down the issue, I recommend enabling logging in your `transcription_node` (as in the sketch above) and trying out the default implementation as a test. Please feel free to reach out if you want to go over these steps together or need any further assistance. I’m here to help!
Sources: examples/voice_agents/structured_output.py | Docs::Agents | Text and transcriptions | LiveKit Docs