My agent doesn't react to text (audio only). How do I fix it?
# ask-ai
f
My agent doesn't react to text (audio only). How do I fix it?

```python
async def entrypoint(ctx: JobContext):
    initial_ctx = ChatContext().append(
        role="system",
        text="Use this test version app"
    )

    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

    fnc_ctx = AssistantFnc()

    assistant = VoiceAssistant(
        vad=silero.VAD.load(),
        stt=openai.STT(),
        llm=openai.LLM(),
        tts=openai.TTS(),
        chat_ctx=initial_ctx,
        fnc_ctx=fnc_ctx
    )

    assistant.start(ctx.room)
```
g
It seems likely that your agent is set up to only handle audio inputs due to the `auto_subscribe=AutoSubscribe.AUDIO_ONLY` setting. This configuration means the agent is not listening for text inputs. To enable text interaction, you might need to adjust the subscription settings or explicitly handle text messages in your code. Would you like more details on how to modify your setup to handle text inputs as well?

Sources:
- https://docs.livekit.io/python/livekit/agents/pipeline/pipeline_agent.html#args
- https://docs.livekit.io/python/livekit/agents/pipeline/index.html#args
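For reference, a minimal sketch of the two `connect` variants (assuming the `AutoSubscribe` enum exported by `livekit.agents`; subscribing to all tracks is also the default when no `auto_subscribe` argument is passed):

```python
from livekit.agents import AutoSubscribe, JobContext

async def entrypoint(ctx: JobContext):
    # Audio-only: the agent subscribes to audio tracks only.
    # await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

    # Subscribe to everything (equivalent to calling ctx.connect() with no argument).
    await ctx.connect(auto_subscribe=AutoSubscribe.SUBSCRIBE_ALL)
```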
f
@millions-hair-24956 I'd like more details on how to fix that in code.
g
To enable your agent to handle both audio and text inputs, you need to modify your setup to listen for text messages. Here's a step-by-step guide to achieve this:

1. Modify subscription settings: Ensure that your agent handles both audio and text by not restricting the subscription to audio only.
2. Handle text messages: Implement a mechanism to listen for text messages and process them. You can use a `ChatManager` to listen for incoming chat messages.

Here's an example of how you can modify your code to handle text inputs:
```python
import asyncio
from livekit import rtc
from livekit.agents import JobContext, WorkerOptions, cli
from livekit.agents.llm import ChatContext
from livekit.agents.voice_assistant import VoiceAssistant
from livekit.plugins import silero, openai

async def entrypoint(ctx: JobContext):
    initial_ctx = ChatContext().append(
        role="system",
        text="Use this test version app"
    )

    await ctx.connect()  # Remove the AUDIO_ONLY restriction

    fnc_ctx = AssistantFnc()  # your function context (llm.FunctionContext subclass), defined elsewhere

    assistant = VoiceAssistant(
        vad=silero.VAD.load(),
        stt=openai.STT(),
        llm=openai.LLM(),
        tts=openai.TTS(),
        chat_ctx=initial_ctx,
        fnc_ctx=fnc_ctx
    )

    assistant.start(ctx.room)

    # Set up a ChatManager to handle text messages
    chat = rtc.ChatManager(ctx.room)

    @chat.on("message_received")
    def on_chat_received(msg: rtc.ChatMessage):
        if msg.message:
            asyncio.create_task(answer_from_text(msg.message))

    async def answer_from_text(txt: str):
        chat_ctx = assistant.chat_ctx.copy()
        chat_ctx.append(role="user", text=txt)
        stream = assistant.llm.chat(chat_ctx=chat_ctx)
        await assistant.say(stream)

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```

Explanation:
- `ChatManager`: used to listen for incoming text messages. The `on_chat_received` handler is triggered whenever a new text message is received.
- `answer_from_text`: processes the text message by appending it to the chat context and generating a response with the LLM.

By implementing these changes, your agent should be able to handle both audio and text inputs effectively.

Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1732415232.661559
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1727139659.431669
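If text users should also get a written reply, one possible variation of `answer_from_text` (a sketch only: the chunk layout assumed here follows the 0.x `LLMStream` API, and `chat` is the `ChatManager` created above) collects the LLM output as plain text so it can be both spoken and posted back to the room chat:

```python
async def answer_from_text(txt: str):
    chat_ctx = assistant.chat_ctx.copy()
    chat_ctx.append(role="user", text=txt)

    # Accumulate the streamed LLM response as plain text
    # (chunk shape assumed from the 0.x agents API; adjust to your version).
    reply = ""
    async for chunk in assistant.llm.chat(chat_ctx=chat_ctx):
        if chunk.choices and chunk.choices[0].delta.content:
            reply += chunk.choices[0].delta.content

    await assistant.say(reply)        # spoken reply via TTS
    await chat.send_message(reply)    # written reply in the room's chat
```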
I'm not able to find any data sources to answer your question. You might want to check whether the `responses` dictionary is being correctly initialized and updated, and ensure that `GDSQuestion` and `DEPRESSIVE_YES_ANSWERS`/`DEPRESSIVE_NO_ANSWERS` are correctly defined and used. If this doesn't help, please reach out to the LiveKit support team for further assistance.
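For the yes/no bookkeeping itself, here is a minimal, self-contained sketch of one way to classify a transcript and keep counts. `YES_WORDS`, `NO_WORDS`, and the `responses` dictionary below are hypothetical stand-ins for the `DEPRESSIVE_YES_ANSWERS`/`DEPRESSIVE_NO_ANSWERS` lists and response storage referenced above, whose real definitions are not shown in the thread:

```python
from typing import Optional

# Hypothetical stand-ins; replace with your own answer lists.
YES_WORDS = {"yes", "yeah", "yep", "sure"}
NO_WORDS = {"no", "nope", "not really"}

responses: dict[str, str] = {}  # question text -> "yes" / "no"


def classify_answer(user_text: str) -> Optional[str]:
    """Map a free-form transcript to 'yes', 'no', or None if unclear."""
    normalized = user_text.strip().lower().rstrip(".!?,")
    if normalized in YES_WORDS:
        return "yes"
    if normalized in NO_WORDS:
        return "no"
    return None


def record_answer(question_text: str, user_text: str) -> None:
    """Store a recognized answer; unclear answers are left unrecorded."""
    answer = classify_answer(user_text)
    if answer is not None:
        responses[question_text] = answer


def counts() -> tuple[int, int]:
    """Return (number of 'yes' answers, number of 'no' answers) recorded so far."""
    yes = sum(1 for a in responses.values() if a == "yes")
    no = sum(1 for a in responses.values() if a == "no")
    return yes, no
```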
f
@millions-hair-24956 How can I recognize "yes" and "no" answers and count them (so the agent can report the result)?
g
I'm not able to find any data sources to answer your question. You might want to provide more details on how the agent processes and stores responses, and how `GDSQuestion` and the response-handling logic are implemented. If this doesn't help, please reach out to the LiveKit support team for further assistance.
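Another option, sketched against the 0.x function-calling API (`llm.FunctionContext` with `@llm.ai_callable`), is to let the LLM record each answer itself through the existing `AssistantFnc`, so the agent can announce the totals when asked. The method names below are hypothetical additions, not part of the asker's code:

```python
from typing import Annotated

from livekit.agents import llm


class AssistantFnc(llm.FunctionContext):
    def __init__(self) -> None:
        super().__init__()
        self.yes_count = 0
        self.no_count = 0

    @llm.ai_callable()
    def record_answer(
        self,
        answer: Annotated[str, llm.TypeInfo(description="Either 'yes' or 'no'")],
    ) -> str:
        """Record the user's yes/no answer to the current question."""
        if answer.strip().lower() == "yes":
            self.yes_count += 1
        elif answer.strip().lower() == "no":
            self.no_count += 1
        return f"Recorded. {self.yes_count} yes / {self.no_count} no so far."

    @llm.ai_callable()
    def report_counts(self) -> str:
        """Report how many yes and no answers have been recorded."""
        return f"There were {self.yes_count} yes answers and {self.no_count} no answers."
```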