# ask-ai
**Shashwat:** The `llm_node` is not yielding text to the `transcription_node` in LiveKit.
**AI Assistant:** Thanks for reaching out, Shashwat. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
I'm here to help you with the issue where the `llm_node` isn't passing text to the `transcription_node` in LiveKit. There could be a few reasons for this, so let's walk through the important points together:

• Make sure the agent is set up correctly with transcription enabled.
• Double-check that the `transcription_node` hasn't been customized in a way that filters or blocks the content.
• Confirm that `RoomOutputOptions.transcription_enabled` is set to `True`.

Just to clarify, the `llm_node` generates text output, which is captured into a text channel and then forwarded to the `transcription_node`. By default, the `transcription_node` simply yields the text it receives. If you've customized it, make sure it still yields the processed text chunks so the flow keeps going (see the sketch below).

If you'd like, share your specific setup or any customizations you've made, and I'd be happy to guide you through the next steps.

Sources: Docs::Agents | livekit-agents/livekit/agents/voice/io.py | agents/src/voice/agent_activity.ts | Docs::Agents
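For reference, a pass-through `transcription_node` override looks roughly like this. It's a minimal sketch assuming the Python agents SDK (v1.x); the import path and exact signature may differ slightly between versions.

```python
from typing import AsyncIterable

from livekit.agents import Agent  # assumed import path for the v1.x Python SDK


class MyAgent(Agent):
    async def transcription_node(
        self, text: AsyncIterable[str], model_settings
    ) -> AsyncIterable[str]:
        # Default-style behavior: forward every chunk that arrives.
        # An override that consumes the stream without yielding (or returns
        # early) will leave the transcription output empty.
        async for chunk in text:
            yield chunk
```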
**Shashwat:** I have `room_output_options=RoomOutputOptions(transcription_enabled=True)`, and I also modify the `llm_node` to give structured output.
**AI Assistant:** Thanks for sharing that you have `RoomOutputOptions(transcription_enabled=True)` set and have adjusted the `llm_node` for structured output. Let's walk through some steps to make sure the structured output flows smoothly to the `transcription_node`:

1. Double-check that your `transcription_node` is overridden to handle structured output. A helpful approach is a function like `process_structured_output` that parses the JSON and pulls out the key fields (such as `response`); a sketch follows after this list.
2. Make sure your LLM is actually returning output in the JSON format you expect. If it sends plain text or something different, it may be ignored. Log the raw output from the `llm_node` so you can see exactly what's coming through.
3. Confirm that transcription output is enabled (`transcription_enabled=True`) and that you have a valid `TextOutput` sink connected to your session output.
4. Add logging inside both the `llm_node` and the `transcription_node` to trace how data flows and to verify that the `transcription_node` is receiving input and producing a non-empty output stream.

If you'd like to dig deeper, share the specific structure of your LLM output and any customizations you've made in your `transcription_node`, and we can troubleshoot together.

Sources: livekit-agents/livekit/agents/voice/generation.py | agents/src/voice/agent_activity.ts
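For illustration, here is one way such an override could look. This is a hypothetical sketch, not the `process_structured_output` helper from the LiveKit example: the `response` field name and the buffer-then-parse approach are assumptions, and a streaming JSON parser would avoid buffering the whole payload.

```python
import json
from typing import AsyncIterable

from livekit.agents import Agent  # assumed import path for the v1.x Python SDK


class StructuredOutputAgent(Agent):
    async def transcription_node(
        self, text: AsyncIterable[str], model_settings
    ) -> AsyncIterable[str]:
        # Collect the structured chunks produced by the customized llm_node.
        buffer = ""
        async for chunk in text:
            buffer += chunk

        try:
            data = json.loads(buffer)
            # "response" is a hypothetical field name; use the key your schema defines.
            yield data.get("response", "") if isinstance(data, dict) else buffer
        except json.JSONDecodeError:
            # Not valid JSON: forward the raw text so nothing is silently dropped.
            yield buffer
```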
**Shashwat:** Do the `tts_node` and `transcription_node` receive their inputs in the same way, or is there a difference?
**AI Assistant:** The `tts_node` and `transcription_node` in LiveKit handle their inputs a bit differently because they serve different purposes:

• TTS node: works with plain text streams (`AsyncIterable[str]`) to create speech. It's focused on turning text into audio frames and doesn't directly manage structured or timestamped data. You can customize it to filter structured output and extract just the parts that can be synthesized.
• Transcription node: deals with the final transcription output, which may be plain strings or `TimedString` objects (`AsyncIterable[str | TimedString]`). It's useful for syncing text with speech playback, especially when TTS-aligned transcription is turned on, and it lets you post-process the text, for example cleaning it up or reformatting it.

Both nodes can handle structured output with helpers like `process_structured_output`, but the `transcription_node` stands out because it supports timing data for synchronization. A side-by-side sketch of the two override points follows below. If you want help customizing these nodes, just let me know!

Sources: Docs::Agents | Text and transcriptions | LiveKit Docs | Text and transcriptions | LiveKit Docs | livekit-agents/livekit/agents/voice/agent_activity.py | Docs::Agents
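To make the difference concrete, here are the two override points side by side. This is a sketch assuming the Python agents SDK (v1.x); the delegation through `Agent.default.tts_node` follows the override style from the pipeline-nodes docs, and the type annotations are simplified (the transcription stream may also carry `TimedString` items).

```python
from typing import AsyncIterable

from livekit import rtc
from livekit.agents import Agent  # assumed import path for the v1.x Python SDK


class MyAgent(Agent):
    # tts_node: consumes plain text chunks and produces audio frames.
    async def tts_node(
        self, text: AsyncIterable[str], model_settings
    ) -> AsyncIterable[rtc.AudioFrame]:
        # Delegate to the default pipeline behavior (text in, audio frames out).
        return Agent.default.tts_node(self, text, model_settings)

    # transcription_node: consumes the text stream headed for the transcription
    # output and yields it onward, optionally post-processed.
    async def transcription_node(
        self, text: AsyncIterable[str], model_settings
    ) -> AsyncIterable[str]:
        async for chunk in text:
            yield chunk
```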
**Shashwat:** But how do they receive it in the flow? Does the `llm_node` send it to both of them using `yield`?
**AI Assistant:** Happy to explain how this works. The `llm_node` doesn't send its output directly to both the `tts_node` and the `transcription_node` using `yield`. Instead, LiveKit splits the stream with `tee` to duplicate the output of the `llm_node`:

1. The `llm_node` generates an asynchronous text stream using `yield`.
2. This stream is split into two separate streams using `tee`:
   ◦ One stream goes to the `tts_node` for speech synthesis.
   ◦ The other stream goes to the `transcription_node` to finalize and forward transcriptions.

This setup lets both nodes work with the same content simultaneously without repeating LLM calls or interfering with each other (see the sketch below). Also, if TTS-aligned transcription is turned on, the `transcription_node` may receive timed transcription data from the TTS engine instead of the original text stream.

Don't hesitate to ask if you'd like me to walk through any part of this flow!

Sources: livekit-agents/livekit/agents/voice/agent_activity.py | agents/src/voice/agent.ts | Docs::Agents | Pipeline nodes and hooks | LiveKit Docs
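The splitting idea is easiest to see in a small, framework-agnostic sketch. This is illustrative only and is not LiveKit's actual `tee` implementation; the names `fake_llm_stream` and `tee_stream` are made up for the example.

```python
import asyncio
from typing import AsyncIterable, AsyncIterator


async def fake_llm_stream() -> AsyncIterator[str]:
    # Stand-in for the llm_node: yields text chunks asynchronously.
    for chunk in ["Hello", ", ", "world", "!"]:
        yield chunk


def tee_stream(source: AsyncIterable[str], n: int = 2) -> list[AsyncIterator[str]]:
    # Duplicate one async stream into n independent branches.
    queues: list[asyncio.Queue] = [asyncio.Queue() for _ in range(n)]

    async def pump() -> None:
        async for chunk in source:
            for q in queues:
                q.put_nowait(chunk)
        for q in queues:
            q.put_nowait(None)  # end-of-stream sentinel

    async def branch(q: asyncio.Queue) -> AsyncIterator[str]:
        while True:
            chunk = await q.get()
            if chunk is None:
                return
            yield chunk

    asyncio.ensure_future(pump())  # requires a running event loop
    return [branch(q) for q in queues]


async def main() -> None:
    tts_branch, transcription_branch = tee_stream(fake_llm_stream())

    async def consume(name: str, stream: AsyncIterator[str]) -> None:
        async for chunk in stream:
            print(f"{name} received {chunk!r}")

    # Both branches must be consumed; abandoning one is how a branch "sees nothing".
    await asyncio.gather(
        consume("tts_node", tts_branch),
        consume("transcription_node", transcription_branch),
    )


if __name__ == "__main__":
    asyncio.run(main())
```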
**Shashwat:** My issue is that the `tts_node` is receiving the text input correctly, but the `transcription_node` is not receiving anything. What can cause this?
**AI Assistant:** It sounds like the `tts_node` is receiving input but the `transcription_node` isn't. Let's walk through some possible causes and troubleshooting steps:

1. Stream-splitting issue: since the LLM output is split into two streams with `tee`, a transcription stream that gets closed or canceled won't receive any data. Double-check that both branches of the split are being consumed.
2. Custom `transcription_node` logic: if you've customized the `transcription_node`, make sure it's yielding output as expected. Adding logging helps confirm that input is being received and processed correctly.
3. TTS-aligned transcription: when `use_tts_aligned_transcript=True` is enabled, the transcription input may rely on TTS-aligned data. Verify that your TTS engine supports this feature, or try disabling it temporarily to see if that makes a difference.
4. Transcription disabled: confirm that `RoomOutputOptions(transcription_enabled=True)` is set and that your frontend is listening on the `lk.transcription` topic.
5. Interruption or cancellation: if the agent's speech gets interrupted, the transcription stream may be canceled. Check your logs for interruption events.

To narrow down the issue, enable logging in your `transcription_node` and try the default implementation as a test (see the sketch below). Feel free to reach out if you want to go over these steps together!

Sources: examples/voice_agents/structured_output.py | Docs::Agents | Text and transcriptions | LiveKit Docs
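As a starting point for that logging, here is a minimal debug sketch, assuming the Python agents SDK (v1.x); adjust the imports and signature to match your setup.

```python
import logging
from typing import AsyncIterable

from livekit.agents import Agent  # assumed import path for the v1.x Python SDK

logger = logging.getLogger("transcription-debug")


class DebugAgent(Agent):
    async def transcription_node(
        self, text: AsyncIterable[str], model_settings
    ) -> AsyncIterable[str]:
        received_any = False
        async for chunk in text:
            received_any = True
            logger.info("transcription_node received chunk: %r", chunk)
            yield chunk

        if not received_any:
            # If this fires while tts_node keeps getting text, the transcription
            # branch of the split stream is likely being closed upstream, e.g.
            # by an interruption, or by TTS-aligned transcripts being enabled
            # without a TTS plugin that supports them.
            logger.warning("transcription_node finished without receiving any input")
```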