# ask-ai
a
explain how the word tokenizer and sentence tokenizer work in transcription in the voice pipeline
t
Thanks for reaching out, Harshita. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
In the LiveKit Agents JS transcription pipeline, the word tokenizer and sentence tokenizer play distinct roles:

- **Sentence tokenizer:** splits speech into sentences and determines when to mark a transcript as final for agent transcription. It is specified in the `AgentTranscriptionOptions` interface as `sentenceTokenizer: tokenize.SentenceTokenizer`.
- **Word tokenizer:** splits speech into words and is used to simulate the "interim results" of the agent transcription. It is specified as `wordTokenizer: tokenize.WordTokenizer`.

Both tokenizers are essential for processing and finalizing transcriptions effectively.

Sources: AgentTranscriptionOptions | LiveKit Agents | livekit.agents.pipeline.pipeline_agent API documentation
a
so is it used before the LLM or after the LLM in the voice pipeline?
t
The word tokenizer and sentence tokenizer configured in `AgentTranscriptionOptions` are applied after the LLM in the voice pipeline: they operate on the agent's generated text (the LLM output being spoken by TTS), not on the user's speech. Their roles are as follows: • Sentence Tokenizer: segments the generated text into sentences so each agent transcript segment can be marked final once a complete sentence is available. • Word Tokenizer: breaks that text into individual words to simulate interim transcription results, synchronized with the TTS audio, before a complete sentence is formed. (User speech is transcribed separately by the STT stage, which runs before the LLM.) Sources: livekit.agents.multimodal API documentation
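The interim-vs-final distinction can be sketched in plain Python. The `emit_transcripts` helper below is hypothetical (it is a simulation of the event flow, not a LiveKit API): it grows an interim transcript one word at a time, then emits a final event per complete sentence.

```python
import re

def emit_transcripts(text: str) -> list[tuple[str, str]]:
    """Simulate agent transcription events: interim results word by word,
    then a final event once a full sentence is available."""
    events = []
    for sentence in re.findall(r"[^.!?]+[.!?]?", text):
        sentence = sentence.strip()
        if not sentence:
            continue
        partial = []
        for word in sentence.split():
            partial.append(word)
            # Interim result: grows one word at a time.
            events.append(("interim", " ".join(partial)))
        # Final result: the complete sentence.
        events.append(("final", sentence))
    return events

events = emit_transcripts("Hello there. How are you?")
# events[0] == ("interim", "Hello"); events[-1] == ("final", "How are you?")
```

In the real pipeline these events are paced to match the TTS audio (see `agent_transcription_speed`), so the transcript appears roughly as the agent speaks.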
I'm sorry, but I don't have the information you need at the moment. If you would like a member of the LiveKit team to respond, please ask your question in one of the other channels.
a
```python
agent = VoicePipelineAgent(
    vad=ctx.proc.userdata["vad"],
    stt=ctx.proc.userdata["stt"],
    llm=ctx.proc.userdata["llm"],
    tts=ctx.proc.userdata["tts"],  # Using the prewarmed OpenAI TTS
    min_endpointing_delay=0.0001,
    interrupt_min_words=0,
    # noise_cancellation=noise_cancellation.NC(),
    # max_endpointing_delay=3,
    # allow_interruptions=True,
    # interrupt_speech_duration=0.5,
    chat_ctx=initial_ctx,
    preemptive_synthesis=True,
    # before_llm_cb=truncate_context,
    transcription=AgentTranscriptionOptions(
        user_transcription=True,  # Enable user STT transcriptions
        agent_transcription=True,  # Enable AI-generated transcriptions
        agent_transcription_speed=1.0,  # Control transcription processing speed
        sentence_tokenizer=ctx.proc.userdata["sentence_tokenizer"],
        word_tokenizer=ctx.proc.userdata["word_tokenizer"],
        hyphenate_word=lambda word: word.replace("-", " "),  # Optional: modify word tokenization rules
    ),
    # before_tts_cb=_before_tts_cb,
    fnc_ctx=CallActions(api=ctx.api, participant=participant, room=ctx.room),
)
```

is this the right way I'm using it?
what are interim results?