# ask-ai
b
how to reduce chunk size in the pipeline
t
Thanks for reaching out, Harsh. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Thanks for reaching out! Could you please help me understand a bit more about which pipeline you’re referring to? For instance, are you asking about media processing, data transfer, or maybe another specific pipeline in LiveKit? Also, if you have a particular use case or issue you’re facing, feel free to share the details—I’m here to help and want to make sure I provide the best support for you.
b
i mean in stt-llm-tts pipeline
t
To help you reduce chunk size in the STT-LLM-TTS pipeline, the best approach depends on which TTS plugin you're using. Let me guide you through the options:

• For ElevenLabs TTS: You can adjust the `chunk_length_schedule` parameter to set your desired chunk sizes. Here's a simple example:
```python
from livekit.plugins import elevenlabs

tts_instance = elevenlabs.TTS(
    chunk_length_schedule=[50, 60, 70, 80]  # reduced chunk sizes for lower latency
)
```
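For context (hedging a bit, since defaults can change between plugin releases): the schedule values are character counts that the plugin buffers before sending text to ElevenLabs, so smaller values start audio sooner at some cost to prosody across chunk boundaries.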
By default, the chunk sizes are larger, but tweaking them like this can help lower latency and improve responsiveness.

• For other TTS plugins (like Rime): Chunk sizes are based on sentence boundaries. Here are a few ways you can manage this:
  • Keep sentences shorter in your LLM output.
  • Adjust punctuation within the text stream to influence chunking.
  • Use a `StreamAdapter` combined with a tokenizer such as `blingfire.SentenceTokenizer` to have more control over how text is segmented. Here's a quick example to illustrate:
```python
from livekit.agents.tokenize import blingfire
from livekit.agents.tts import StreamAdapter
from livekit.plugins import rime

tts = rime.TTS(model="arcana", speaker="astra")
tokenizer = blingfire.SentenceTokenizer()

# Wrap the TTS in a StreamAdapter so synthesis happens one sentence at a time
tts_wrapped = StreamAdapter(tts=tts, sentence_tokenizer=tokenizer)
stream = tts_wrapped.stream()
```
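If it helps, here's a minimal usage sketch for that stream. The `push_text`/`end_input` calls follow the standard livekit-agents TTS stream interface, but double-check against your installed version:

```python
# Each sentence becomes its own chunk thanks to the SentenceTokenizer
stream.push_text("Short sentences keep chunks small. Long ones do not.")
stream.end_input()
```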
• Custom chunking: If you want full control, you can override the `tts_node` method in a custom `Agent` class to create your own chunking logic. This method works well across all TTS providers.
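Here's a minimal sketch of what that override could look like. The `tts_node`/`Agent.default.tts_node` pattern comes from the pipeline-nodes docs linked below, but the sentence-splitting logic itself is just an illustrative assumption; shape it however you like:

```python
from collections.abc import AsyncIterable

from livekit.agents import Agent, ModelSettings


class SmallChunkAgent(Agent):
    async def tts_node(self, text: AsyncIterable[str], model_settings: ModelSettings):
        async def rechunk() -> AsyncIterable[str]:
            buffer = ""
            async for token in text:
                buffer += token
                # Flush whenever a sentence boundary appears, keeping chunks short
                while (cut := next((i for i, ch in enumerate(buffer) if ch in ".!?"), None)) is not None:
                    yield buffer[: cut + 1]
                    buffer = buffer[cut + 1 :]
            if buffer:  # flush any trailing text
                yield buffer

        # Hand the re-chunked text to the default TTS pipeline node
        return Agent.default.tts_node(self, rechunk(), model_settings)
```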
Please feel free to reach out if you'd like me to walk you through any of these steps or if you have any other questions. I'm here to help!

Sources: ElevenLabs integration guide | LiveKit Docs | examples/other/text-to-speech/rime_tts.py | pipeline-tts/short_replies_only.py | Docs::Agents | Pipeline nodes and hooks | LiveKit Docs
b
how to reduce chunk size for stt and llm? will it reduce latency
t
I'd love to help you with this! Could you please share which STT (speech-to-text) and LLM (language model) components you're using in your pipeline? For instance, are you working with a specific plugin or a custom setup? Knowing this will really help me guide you better on how to reduce chunk size and understand its effect on latency.
b
i am using gemini flash 2.5 for llm and deepgram nova 3 for stt
t
I don't have the answer you're looking for. You could also try asking your question:
• in one of the other Slack channels, or
• to https://deepwiki.com/livekit/livekit_composite, which is trained on all LiveKit source code.
If you find the answer, please post it here to help others!