# ask-ai
b
how to reduce chunk size in the pipeline
t
Thanks for reaching out, Harsh. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Thanks for reaching out! Could you please help me understand a bit more about which pipeline you’re referring to? For instance, are you asking about media processing, data transfer, or maybe another specific pipeline in LiveKit? Also, if you have a particular use case or issue you’re facing, feel free to share the details—I’m here to help and want to make sure I provide the best support for you.
b
i mean in stt-llm-tts pipeline
t
To help you reduce chunk size in the STT-LLM-TTS pipeline, the best approach depends on which TTS plugin you're using. Let me guide you through the options:

• For ElevenLabs TTS: You can adjust the `chunk_length_schedule` parameter to set your desired chunk sizes. Here's a simple example:
```python
from livekit.plugins import elevenlabs

tts_instance = elevenlabs.TTS(
    chunk_length_schedule=[50, 60, 70, 80]  # reduced chunk sizes for lower latency
)
```
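For context (hedging a bit, since defaults can change between plugin releases): the schedule values are character counts that the plugin buffers before sending text to ElevenLabs, so smaller values start audio sooner at some cost to prosody across chunk boundaries.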
By default, the chunk sizes are larger, but tweaking them like this can help lower latency and improve responsiveness.

• For other TTS plugins (like Rime): Chunk sizes are based on sentence boundaries. Here are a few ways you can manage this:
  • Keep sentences shorter in your LLM output.
  • Adjust punctuation within the text stream to influence chunking.
  • Use a `StreamAdapter` combined with a tokenizer such as `blingfire.SentenceTokenizer` to have more control over how text is segmented. Here's a quick example to illustrate:
```python
from livekit.agents.tokenize import blingfire
from livekit.agents.tts import StreamAdapter
from livekit.plugins import rime

tts = rime.TTS(model="arcana", speaker="astra")
tokenizer = blingfire.SentenceTokenizer()

# Wrap the TTS in a StreamAdapter so synthesis happens one sentence at a time
tts_wrapped = StreamAdapter(tts=tts, sentence_tokenizer=tokenizer)
stream = tts_wrapped.stream()
```
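If it helps, here's a minimal usage sketch for that stream. The `push_text`/`end_input` calls follow the standard livekit-agents TTS stream interface, but double-check against your installed version:

```python
# Each sentence becomes its own chunk thanks to the SentenceTokenizer
stream.push_text("Short sentences keep chunks small. Long ones do not.")
stream.end_input()
```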
• Custom chunking: If you want full control, you can override the `tts_node` method in a custom `Agent` class to create your own chunking logic. This method works well across all TTS providers.
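Here's a minimal sketch of what that override could look like. The `tts_node`/`Agent.default.tts_node` pattern comes from the pipeline-nodes docs linked below, but the sentence-splitting logic itself is just an illustrative assumption; shape it however you like:

```python
from collections.abc import AsyncIterable

from livekit.agents import Agent, ModelSettings


class SmallChunkAgent(Agent):
    async def tts_node(self, text: AsyncIterable[str], model_settings: ModelSettings):
        async def rechunk() -> AsyncIterable[str]:
            buffer = ""
            async for token in text:
                buffer += token
                # Flush whenever a sentence boundary appears, keeping chunks short
                while (cut := next((i for i, ch in enumerate(buffer) if ch in ".!?"), None)) is not None:
                    yield buffer[: cut + 1]
                    buffer = buffer[cut + 1 :]
            if buffer:  # flush any trailing text
                yield buffer

        # Hand the re-chunked text to the default TTS pipeline node
        return Agent.default.tts_node(self, rechunk(), model_settings)
```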
Please feel free to reach out if you'd like me to walk you through any of these steps or if you have any other questions. I'm here to help!

Sources: ElevenLabs integration guide | LiveKit Docs | examples/other/text-to-speech/rime_tts.py | pipeline-tts/short_replies_only.py | Docs::Agents | Pipeline nodes and hooks | LiveKit Docs
b
how to reduce chunk size for stt and llm? will it reduce latency
t
I'd love to help you with this! Could you please share which STT (speech-to-text) and LLM (language model) components you're using in your pipeline? For instance, are you working with a specific plugin or a custom setup? Knowing this will really help me guide you better on how to reduce chunk size and understand its effect on latency.
b
i am using gemini flash 2.5 for llm and deepgram nova 3 for stt
t
I don't have the answer you're looking for. You could also try asking your question:
• in one of the other Slack channels, or
• to https://deepwiki.com/livekit/livekit_composite, which is trained on all LiveKit source code.
If you find the answer, please post it here to help others!