# ask-ai
c
I want to use text to text only without tts stt e.g., but after turning off text stream does not work, I need to use text stream also as before
t
Thanks for reaching out, Babken. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
If you want to use text streams in LiveKit without involving TTS or STT, you can set up a text-only agent pretty easily. Here’s a quick guide to get you started:
• Make sure your agent session includes just an LLM, without any TTS or STT parts. For example:
```python
from livekit.agents import (
    AgentSession,
    RoomInputOptions,
    RoomOutputOptions,
)
from livekit.plugins import openai

session = AgentSession(
    llm=openai.LLM(model="gpt-4o-mini"),
)
await session.start(
    agent=MyAgent(),  # your Agent subclass
    room=ctx.room,
    room_input_options=RoomInputOptions(text_enabled=True, audio_enabled=False),
    room_output_options=RoomOutputOptions(transcription_enabled=True, audio_enabled=False),
)
```
For handling text communication:
• Use the `sendText` method to send text input to the topic `lk.chat`.
• Catch the agent’s responses by registering a handler with `registerTextStreamHandler` on the topic `lk.transcription`.
This way, you get smooth text-only interaction without any audio involved. If you have any questions or want me to walk you through it, just let me know—I’m happy to help!
Sources: examples/other/chat-stream-receiver.py | Text and transcriptions | LiveKit Docs | Sending text | LiveKit Docs
c
is it with text stream?
t
Yep, the setup I mentioned uses text streams to keep the conversation flowing. Here’s how it works:
• You send text input with the `sendText` method to the `lk.chat` topic.
• Agent responses come back through the `registerTextStreamHandler` method on the `lk.transcription` topic.
This way, everything happens in real-time with text. If anything’s unclear or you want me to walk you through it more, just let me know—I’m here to help!
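The flow described above can be sketched end-to-end in plain Python. The `FakeRoom` class below is a hypothetical in-memory stand-in for a LiveKit room (it is not the real SDK API); it only routes text by topic, to show how `lk.chat` input turns into `lk.transcription` output:

```python
import asyncio

# Hypothetical in-memory stand-in for a LiveKit room: it only routes text
# by topic, to illustrate the lk.chat -> agent -> lk.transcription flow.
# It is NOT the real SDK API.
class FakeRoom:
    def __init__(self):
        self._handlers = {}

    def register_text_stream_handler(self, topic, handler):
        self._handlers[topic] = handler

    async def send_text(self, text, topic):
        handler = self._handlers.get(topic)
        if handler is not None:
            await handler(text)

received = []  # what the "frontend" handler collects

async def main():
    room = FakeRoom()

    # Frontend side: collect agent output published on lk.transcription
    async def on_agent_text(text):
        received.append(text)

    room.register_text_stream_handler("lk.transcription", on_agent_text)

    # Agent side: read user input from lk.chat, reply on lk.transcription
    async def on_user_text(text):
        await room.send_text(f"echo: {text}", topic="lk.transcription")

    room.register_text_stream_handler("lk.chat", on_user_text)

    # User side: send a message to the lk.chat topic
    await room.send_text("hello", topic="lk.chat")

asyncio.run(main())
print(received)
```

With the real SDK, `FakeRoom` is replaced by the room object on each side; the topic names are the same ones used throughout this thread.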
c
I did it like this, but there is no stream.
t
Could you share a bit more about your setup? For example:
• Are you using the `sendText` method to send text to the `lk.chat` topic?
• Have you set up the `registerTextStreamHandler` method to receive text from the `lk.transcription` topic?
• Are you seeing any errors or anything unexpected happening?
Knowing these details will really help me understand what's going on and support you better.
c
Look, before, when I used TTS, I used session.say. Now what should I use to send a stream from the AI agent to LiveKit?
t
If you want to send a text stream from the AI agent to LiveKit without using TTS, you just need to set up the agent for text-only output. Here’s a simple way to do it:
• Set `audio_enabled=False` in `RoomOutputOptions` to turn off audio output and enable text transcription.
• The agent will then automatically publish text responses to the `lk.transcription` topic.
Here’s an example to help you get started:
```python
from livekit.agents import (
    AgentSession,
    JobContext,
    RoomInputOptions,
    RoomOutputOptions,
)
from livekit.plugins import openai

async def entrypoint(ctx: JobContext):
    session = AgentSession(
        llm=openai.LLM(model="gpt-4o-mini"),
    )
    await session.start(
        agent=MyAgent(),
        room=ctx.room,
        room_input_options=RoomInputOptions(text_enabled=True, audio_enabled=False),
        room_output_options=RoomOutputOptions(transcription_enabled=True, audio_enabled=False),
    )
```
On the frontend side, you can set up a handler to catch these text streams like this:
```javascript
room.registerTextStreamHandler('lk.transcription', async (reader, participantInfo) => {
    const message = await reader.readAll();
    console.log(`New message from ${participantInfo.identity}: ${message}`);
});
```
This way, your AI agent will chat through text streams without any audio processing. If you have any questions or want a hand with this, just let me know—I’m here to help!
Sources: livekit-plugins/livekit-plugins-playai/livekit/plugins/playai/tts.py | Text and transcriptions | LiveKit Docs
c
Yes, I understand, but I'm asking about session.say. I need to send chunks from the AI agent to the frontend via LiveKit, so how should I do it?
t
If you want to send text chunks from the AI agent to the frontend via LiveKit without using TTS, you can easily set up the agent to output text only. Here’s a quick guide:
• Just set `audio_enabled=False` in `RoomOutputOptions` to turn off audio and enable text transcription.
• The agent will then automatically send text chunks as they’re generated to the `lk.transcription` topic.
On the frontend side, you can use the `registerTextStreamHandler` method to catch these text chunks in real-time like this:
```javascript
room.registerTextStreamHandler('lk.transcription', async (reader, participantInfo) => {
    const message = await reader.readAll();
    console.log(`New message from ${participantInfo.identity}: ${message}`);
});
```
This way, the text chunks flow smoothly from the AI agent to your frontend as they happen. If you have any questions or want me to walk you through it, just let me know—I’m here to help!
Sources: Text and transcriptions | LiveKit Docs | tests/fake_llm.py
c
should I use session.say?
t
No worries, you don’t need to use `session.say` for text-only communication. The agent will automatically send text chunks to the `lk.transcription` topic as they’re generated, as long as you’ve set up the session with `audio_enabled=False` in `RoomOutputOptions`. If you have any questions or want a hand with the setup, just give me a shout!
c
No, it's impossible; you don't understand me. I generate AI messages and I want to send them to the user (the user asks the agent, the agent generates a message and sends it to the user).
t
Hey there! When you want to send AI-generated messages from the agent to the user using LiveKit, you’ve got two friendly options: stream the text bit by bit as it’s created, or send it all in one go. Here’s a quick look at how to stream text incrementally:
```python
# Stream the AI response as it's generated
writer = await participant.stream_text(
    topic="lk.chat",
    # total_size can be passed here if the full length is known in advance
)

async for chunk in ai_text_stream:
    await writer.write(chunk)

await writer.aclose()
```
Or, if you prefer sending the whole response at once, here’s how you can do that:
```python
# Send the entire AI response at once
await participant.send_text(
    ai_generated_text,
    topic="lk.chat",  # or a custom topic
    destination_identities=["target_user_id"],  # optional: target specific users
)
```
These are great ways to get your AI messages right to your users through LiveKit’s text streams. If anything’s unclear or you want to dive deeper, just give me a shout—I’m happy to help!
Sources: Text and transcriptions | LiveKit Docs | Sending text | LiveKit Docs | server/agent.py | examples/other/text_only.py
c
I use the first method, here is a part of my code:
```python
writer = await session._room_io._room.local_participant.stream_text(topic="lk.transcription")
async for chunk in result:
    await writer.write(chunk)
await writer.aclose()
```
t
Thanks so much for sharing your code! Could you help me understand a bit more about the issue you’re running into? For example:
• Are you seeing any errors or exceptions when you run this?
• Is the text stream not showing up on the frontend, or does it seem incomplete?
• Have you checked if the `result` variable actually has the AI-generated text chunks you’re expecting?
Knowing these details will really help me get to the bottom of this and find the right fix for you.
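One quick way to check the last point, i.e. whether `result` really yields multiple chunks incrementally, is a pure-Python harness like this (`fake_llm_stream` is a hypothetical stand-in for your real LLM stream; swap in `result` to test it):

```python
import asyncio

# Hypothetical stand-in for an LLM token stream; replace with your real
# `result` async iterator to check how many chunks it actually yields.
async def fake_llm_stream():
    for chunk in ["Hel", "lo ", "wor", "ld"]:
        await asyncio.sleep(0)  # simulate chunks arriving over time
        yield chunk

async def collect_chunks(stream):
    chunks = []
    async for chunk in stream:
        chunks.append(chunk)
    return chunks

chunks = asyncio.run(collect_chunks(fake_llm_stream()))
print(len(chunks), "".join(chunks))
```

If this prints a chunk count of 1 for your real stream, the problem is upstream of LiveKit: the model output is arriving as one block, so there is nothing incremental to write.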
c
```python
if True:
    stt = NOT_GIVEN
    vad = NOT_GIVEN
    turn_detection = NOT_GIVEN
    tts = NOT_GIVEN
else:
    stt = deepgram.STT(model="nova-3", language="multi")
    vad = ctx.proc.userdata["vad"]
    turn_detection = MultilingualModel()
    tts = elevenlabs.TTS(voice_id=project_voice_id, model="eleven_turbo_v2_5")

session = AgentSession[ExpertUserdata](
    userdata=ExpertUserdata(
        expert_id=expert_id,
        conversation_id=conversation_id,
        user_id=user_id,
        language=main_language["long"],
        topics=topics,
        interviews_ids=interviews_ids,
        start_timestamp=int(datetime.utcnow().timestamp() * 1000),
        end_timestamp=int(datetime.utcnow().timestamp() * 1000),
        is_prod=is_prod,
        live_mode=live_mode,
        avatar_enabled=avatar_enabled,
    ),
    stt=stt,
    llm=openai.LLM(model="gpt-4o-mini", temperature=0.5),
    tts=tts,
    vad=vad,
    turn_detection=turn_detection,
)

if True:
    room_io = RoomIO(session, room=ctx.room, output_options=RoomOutputOptions(audio_enabled=False, sync_transcription=False))
else:
    room_io = RoomIO(session, room=ctx.room)

def create_chat_message(item):
    return ChatMessage(role=item["role"], content=[item["content"]])

await room_io.start()

session.input.set_audio_enabled(False)

await session.start(
    agent=ExpertAgent(chat_ctx=ChatContext(), live_mode=session.userdata.live_mode),
    room_input_options=RoomInputOptions(
        noise_cancellation=noise_cancellation.BVC(),
        text_enabled=True,
        audio_enabled=False,
    ),
    room_output_options=RoomOutputOptions(audio_enabled=False, sync_transcription=True, transcription_enabled=True),
    room=ctx.room,
)
try:
    rio = session._room_io
    rio._update_transcription_output(
        rio._agent_tr_output,
        ctx.room.local_participant.identity,
    )
    # Route session output transcription through RoomIO
    session.output.transcription = rio.transcription_output
```
Here is that part of my code, but there is still no streamed text on the frontend; the full text arrives at once.
t
Thanks so much for sharing your code! From what I see, the issue might be with how the text is being sent over to the frontend. To help make sure the text streams incrementally, could you double-check a few things?
• Make sure the `result` variable in your earlier code is actually yielding chunks of text little by little. If the AI model is generating the entire text all at once, it won’t stream incrementally.
• Take a look at the `stream_text` method and confirm that `await writer.write(chunk)` is being called for each chunk as it’s created.
• On the frontend side, check that the `registerTextStreamHandler` method is properly set up to handle those incremental text updates coming from the `lk.transcription` topic.
If the AI model is sending the full text in one go, you might need to tweak its settings to get it to output incrementally. Feel free to reach out if you want me to help you with that!
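The difference between incremental and all-at-once output can be made concrete with a stub writer that records every `write()` call. `StubWriter` here is hypothetical, not the SDK's real text stream writer; the point is only to show why a single large write looks like "full text, no stream" on the receiving side:

```python
import asyncio

# Hypothetical stand-in for the SDK's text stream writer; it just records
# each write() so we can count how many chunks went out.
class StubWriter:
    def __init__(self):
        self.writes = []

    async def write(self, chunk):
        self.writes.append(chunk)

    async def aclose(self):
        pass

async def stream_response(chunks, writer):
    # One write() per chunk, as each chunk is produced
    for chunk in chunks:
        await writer.write(chunk)
    await writer.aclose()

# Incremental: three chunks -> three writes -> the frontend sees a stream
incremental = StubWriter()
asyncio.run(stream_response(["one ", "two ", "three"], incremental))

# Non-incremental: one big chunk -> one write -> the frontend sees the
# full text arrive in a single piece
all_at_once = StubWriter()
asyncio.run(stream_response(["one two three"], all_at_once))

print(len(incremental.writes), len(all_at_once.writes))
```

Both variants deliver the same final text; only the number of writes, and therefore the number of chunks the receiver observes, differs.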
c
```javascript
room.on(RoomEvent.TranscriptionReceived, updateRawSegments);
room.registerTextStreamHandler('lk.transcription', handleCustomTranscriptionsAgent);

room.registerTextStreamHandler('lk.transcription.user', handleCustomTranscriptionsUser);
room.on(RoomEvent.DataReceived, handleCustomState);
```
Here are the handlers:
```javascript
const handleCustomTranscriptionsAgent = async (reader, participantInfo) => {
    const message = await reader.readAll();
    console.log('handleCustomTranscriptions', reader.info)
    if (reader.info.attributes['lk.transcribed_track_id']) {
      console.log(`New transcription from ${participantInfo.identity}: ${message}`);
    } else {
      console.log(`New message from ${participantInfo.identity}: ${message}`, rawSegments);
      updateRawSegments([{
        id: `SG_${Math.random().toString(16).slice(2, 14)}`, // or use a uuid
        text: message,
        startTime: Date.now(),
        endTime: Date.now(),
        final: true,
        language: "", // if the language is unknown
        firstReceivedTime: Date.now(),
        lastReceivedTime: Date.now(),
      }],
        {
          identity: participantInfo.identity,
          isAgent: true,
          isLocal: false,
        },
        {
          // a publication date or other metadata can be added here
        },
        true,
        true
      )
    }
  }
```
Here is another part of the code, but the full text arrives here all at once.
t
Thanks so much for sharing your frontend code! From what I see, the hiccup might be that you're using the `reader.readAll()` method, which grabs the whole text at once instead of bit by bit. To work with text chunks as they come in, it’s better to use `reader.read()` inside a loop rather than `readAll()`. Here’s a little tweak to your handler that should do the trick:
```javascript
const handleCustomTranscriptionsAgent = async (reader, participantInfo) => {
    console.log('handleCustomTranscriptions', reader.info);
    let chunk;
    while ((chunk = await reader.read()) !== null) {
        console.log(`New chunk from ${participantInfo.identity}: ${chunk}`);
        updateRawSegments([{
            id: `SG_${Math.random().toString(16).slice(2, 14)}`,
            text: chunk,
            startTime: Date.now(),
            endTime: Date.now(),
            final: false,
            language: "",
            firstReceivedTime: Date.now(),
            lastReceivedTime: Date.now(),
        }],
        {
            identity: participantInfo.identity,
            isAgent: true,
            isLocal: false,
        },
        {},
        true,
        true);
    }
};
```
This way, you’ll be able to handle each piece of the text stream as it arrives. Give it a try, and feel free to reach out if you want to chat more or need a hand with anything else!
c
TypeError: reader.read is not a function at handleCustomTranscriptionsAgent
t
It looks like the `reader` object doesn’t support the `read` method, which usually means the text stream API in your setup might not handle incremental reading directly. Here are a couple of things you can check:
• Make sure the LiveKit SDK version you’re using supports incremental text streaming. If it doesn’t, updating to the latest version might help.
• See if the `reader` object offers another way to read chunks incrementally, like an async iterator or a different API.
If incremental reading isn’t available, you might need to grab the full text first and then split it into chunks yourself for processing. Feel free to reach out if you want more help or if you can share a bit more about your LiveKit SDK version and setup—I’m here to assist!
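If the reader turns out to be async-iterable (the second point above), it can be consumed without calling `read()` at all. Here is a minimal Python sketch of that pattern; `ChunkReader` is a hypothetical stand-in for the SDK's reader object, and assuming your JS SDK version exposes the reader as an async iterable, the JavaScript equivalent would be `for await (const chunk of reader)`:

```python
import asyncio

# Hypothetical stand-in for a text stream reader that exposes its chunks
# through the async-iterator protocol instead of a read() method.
class ChunkReader:
    def __init__(self, chunks):
        self._chunks = chunks

    def __aiter__(self):
        async def gen():
            for chunk in self._chunks:
                yield chunk
        return gen()

async def consume(reader):
    pieces = []
    async for chunk in reader:  # no read() call needed
        pieces.append(chunk)
    return pieces

pieces = asyncio.run(consume(ChunkReader(["a", "b", "c"])))
print(pieces)
```

The handler body stays the same as in the `read()` loop; only the way each chunk is pulled from the reader changes.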
c
give me exact solution
t
I don't have the answer you're looking for. You could also try asking your question:
• in one of the other Slack channels, or
• to https://deepwiki.com/livekit/livekit_composite which is trained on all LiveKit source code
If you find the answer, please post it here to help others!
c
Is it possible to turn off TTS and STT via RPC during a session?