# ask-ai
I want to build an interaction between my frontend and a voice agent. specifically, making it easy for a user to share clipboard data to be added to the agent's context. I'm using a text stream to send the clipboard data:
```python
async def add_chat_message(role: str, content: str):
    chat_ctx = agent.chat_ctx.copy()
    chat_ctx.add_message(
        role=role,
        content=content,
    )
    await agent.update_chat_ctx(chat_ctx)

async def handle_clipboard_text(reader: TextStreamReader):
    try:
        clipboard_text = await reader.read_all()
        add_clipboard_text(ctx.room.name, clipboard_text)  # db persistence
        await add_chat_message(
            role="system",
            content=f"The user provided a snippet from their clipboard:\n{clipboard_text}.",
        )
    except Exception as e:
        await add_chat_message(
            role="system",
            content=str(e),
        )
    finally:
        await session.generate_reply(
            instructions="Proceed with the workflow."
        )

def text_stream_handler(reader: TextStreamReader, _: str):
    match reader.info.topic:
        case "clipboard":
            task = asyncio.create_task(handle_clipboard_text(reader))
            _active_tasks.add(task)
            task.add_done_callback(lambda t: _active_tasks.remove(t))
        case _:
            pass

ctx.room.register_text_stream_handler(
    "clipboard",
    text_stream_handler,
)
```
The agent does successfully receive the clipboard data. However, it doesn't resume the workflow. The agent doesn't respond unless spoken to again. Am I using the correct pattern here?
Thanks for reaching out, Don. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Your implementation is close, but to ensure the agent resumes its conversational workflow after receiving clipboard data through a text stream, and responds without requiring the user to speak again, you should confirm the following pattern:

- Update the agent's context with the clipboard data, as you are already doing.
- Call `session.generate_reply()` with either the `user_input` parameter (preferred for direct, user-like input) or `instructions` (if you want to influence the agent's response more narrowly):
  - `await session.generate_reply(user_input=clipboard_text)` makes the clipboard content act as if the user just said it, and the agent will respond naturally.
  - `await session.generate_reply(instructions="Respond to the clipboard data that was just shared.")` instructs the agent to generate a reply, which is useful when you want more guidance or system-driven responses.

This ensures the agent processes the text-streamed clipboard data and promptly generates a response, interrupting any idle state as needed. This is the standard pattern for agents integrating with LiveKit via text streams for use cases like yours.
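Applied to your handler, that could look like the following. This is a minimal sketch based on your snippet, not a drop-in replacement: `agent`, `session`, `ctx`, `add_chat_message`, and `add_clipboard_text` are assumed to be defined in your entrypoint exactly as in your original code.

```python
async def handle_clipboard_text(reader: TextStreamReader):
    try:
        clipboard_text = await reader.read_all()
        add_clipboard_text(ctx.room.name, clipboard_text)  # db persistence

        # Optional: keep the snippet in the agent's history as a system message.
        await add_chat_message(
            role="system",
            content=f"The user provided a snippet from their clipboard:\n{clipboard_text}",
        )

        # Pass the clipboard content as user input so the agent replies
        # immediately instead of waiting for the user to speak again.
        await session.generate_reply(user_input=clipboard_text)
    except Exception as e:
        await add_chat_message(role="system", content=str(e))
        # Fall back to your original instruction-driven reply on errors.
        await session.generate_reply(instructions="Proceed with the workflow.")
```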
For the most natural interaction, using `user_input` to provide the clipboard content is recommended.

Reference: See documented patterns for context updates and text stream handling in agent workflows.

Sources: Agent speech | LiveKit Docs | Text and transcriptions | LiveKit Docs