What is the recommended approach for programatical...
# ask-ai
i
What is the recommended approach for programmatically ending a call/voice agent session? Is it to simply delete the room?
t
Thanks for reaching out, Ed. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
To programmatically end a call or voice agent session in LiveKit, you have two options depending on your requirements:
• Disconnecting the agent only: If you want to end just the agent's participation while allowing other participants to continue, you can use the ctx.shutdown() method. For example:
from livekit.agents import JobContext

async def entrypoint(ctx: JobContext):
    # do some work
    ...
    # disconnect the agent from the room; other participants stay connected
    ctx.shutdown(reason="Session ended")
• Disconnecting everyone (ending the entire session): If you want to end the session for all participants, you can use the server API's deleteRoom method. For example:
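import os

from livekit import api
from livekit.agents import JobContext

async def entrypoint(ctx: JobContext):
    # do some work
    ...
    # create a server API client from credentials in the environment
    api_client = api.LiveKitAPI(
        os.getenv("LIVEKIT_URL"),
        os.getenv("LIVEKIT_API_KEY"),
        os.getenv("LIVEKIT_API_SECRET"),
    )
    # delete the room, which disconnects every participant
    await api_client.room.delete_room(api.DeleteRoomRequest(
        room=ctx.job.room.name,
    ))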
The deleteRoom method will send a "Disconnected" room event to all participants and remove the room from the server.
Sources: Inside a session | LiveKit Docs
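A minimal sketch of wrapping the room deletion so the API client is always closed afterwards. It assumes the client object exposes `room.delete_room(...)` as in the example above and an `aclose()` coroutine; the `delete_room_and_close` helper and the two `Protocol` classes are hypothetical names introduced only for illustration, not part of the LiveKit SDK.

```python
from typing import Any, Protocol


class RoomService(Protocol):
    """Shape assumed for `client.room` (hypothetical, for typing only)."""

    async def delete_room(self, request: Any) -> Any: ...


class ApiClient(Protocol):
    """Shape assumed for the server API client (hypothetical, for typing only)."""

    room: RoomService

    async def aclose(self) -> None: ...


async def delete_room_and_close(client: ApiClient, request: Any) -> bool:
    """Delete the room and always close the client.

    Returns True if the delete call succeeded, False if it raised.
    """
    try:
        await client.room.delete_room(request)
        return True
    except Exception as exc:
        # The room may already be gone; report the error and carry on.
        print(f"delete_room failed: {exc}")
        return False
    finally:
        # Release the client's underlying HTTP session in every case.
        await client.aclose()
```

With the real SDK, the call would look roughly like `await delete_room_and_close(api_client, api.DeleteRoomRequest(room=ctx.job.room.name))`, reusing the objects from the example above.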
i
Does this recommendation change if the call is connected via SIP? To be more specific, let's say my voice agent makes an outbound phone call, and during this call the LLM decides that the call has completed (whatever task the agent had has been accomplished and it should now hang up the phone call). What would be the recommended approach for performing that hang-up and ending the call?
t
When handling a SIP call in LiveKit, the approach to ending the call depends on whether you want to disconnect just the voice agent or terminate the entire session:
• Disconnecting the voice agent only: You can use the ctx.shutdown() method to disconnect the agent from the room while allowing other participants to continue. For example:
from livekit.agents import JobContext

async def entrypoint(ctx: JobContext):
    # do some work
    ...
    # disconnect the agent from the room; other participants stay connected
    ctx.shutdown(reason="Session ended")
• Removing a specific participant: If you want to remove the agent (or any specific participant) from the room, you can use the remove_participant API method. For example:
async def hangup(self):
    try:
        await self.api.room.remove_participant(api.RoomParticipantIdentity(
            room=self.room.name,
            identity=self.participant.identity,
        ))
    except Exception as e:
        # it's possible that the user has already hung up, this error can be ignored
        logger.info(f"received error while ending call: {e}")
• Ending the call for everyone: If the task is complete and you want to terminate the entire session, you can use the server API's deleteRoom method. For example:
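import os

from livekit import api
from livekit.agents import JobContext

async def entrypoint(ctx: JobContext):
    # do some work
    ...
    # create a server API client from credentials in the environment
    api_client = api.LiveKitAPI(
        os.getenv("LIVEKIT_URL"),
        os.getenv("LIVEKIT_API_KEY"),
        os.getenv("LIVEKIT_API_SECRET"),
    )
    # delete the room, which disconnects every participant
    await api_client.room.delete_room(api.DeleteRoomRequest(
        room=ctx.job.room.name,
    ))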
For a SIP call, the remove_participant method is particularly useful if you want to hang up the phone call for the agent while keeping the room active for other participants.
Sources: Inside a session | LiveKit Docs | Making calls using SIP | LiveKit Docs
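The hang-up flow described in this reply could be sketched as follows: try to remove the SIP participant, tolerate the case where the other side has already hung up, then shut the agent down. This is a rough sketch under stated assumptions, not LiveKit's own implementation. `hang_up_call` and its two callable parameters are hypothetical names; in a real agent they would wrap the `remove_participant` and `ctx.shutdown()` calls shown above.

```python
import asyncio
from typing import Awaitable, Callable


async def hang_up_call(
    remove_sip_participant: Callable[[], Awaitable[None]],
    shutdown_agent: Callable[[], None],
    timeout_s: float = 5.0,
) -> str:
    """Remove the SIP participant, then shut the agent down.

    Returns "hung_up", "timed_out", or "error_ignored".
    """
    try:
        # Bound the API call so a stalled request cannot keep the job alive.
        await asyncio.wait_for(remove_sip_participant(), timeout=timeout_s)
        outcome = "hung_up"
    except asyncio.TimeoutError:
        outcome = "timed_out"
    except Exception as exc:
        # The callee may already have hung up; treat this as non-fatal.
        print(f"ignoring error while ending call: {exc}")
        outcome = "error_ignored"
    # Always release the agent, whatever happened to the removal request.
    shutdown_agent()
    return outcome
```

The LLM would typically reach this through a tool or function call that it invokes once it decides the task is complete; how that tool is registered depends on the agent framework version and is not shown here.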
i
Does LiveKit use aggregators for managing the context of an LLM session? For example, what if I wanted to manually add something to an existing context? To be more specific, let's say the agent is having a conversation with a participant, and the participant says something that requires sending a manual message to the agent/LLM; in this case I'd want to trigger some code to add a message to the context for the LLM to act on. Can you provide a code example? I am mainly interested in understanding how to access an existing running agent session.
t
In LiveKit, you can manually add context to an existing LLM session by interacting with the session's chat context. Here are the steps and a code example:
• Appending to the Chat Context: Use the append method on a ChatContext object to add messages to the conversation context. For example:
# chat_ctx is an existing llm.ChatContext
chat_ctx.append(
    role="user",  # or "system" / "assistant"
    text="Your message text here",
    # images can optionally be passed as well
)
• Accessing and Modifying an Existing Agent Session: You can retrieve and modify the chat context of an existing session using the following methods:
# Get a copy of the current chat context
def chat_ctx_copy(self) -> llm.ChatContext:
    return self._session.chat_ctx_copy()

# Modify the context (e.g., append a message)
chat_ctx = agent.chat_ctx_copy()
chat_ctx.append(
    role="user",
    text="Manually added message",
)

# Update the session with the modified context
await agent.set_chat_ctx(chat_ctx)
This approach allows you to programmatically add messages or other context to the LLM session, enabling the agent to act on the new information. Let me know if you need further clarification!
Sources: Vision Agent Quickstart | LiveKit Docs | ChatContext | LiveKit Agents | livekit.agents.multimodal API documentation
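The copy, append, replace pattern above can be illustrated without the SDK. The sketch below uses a tiny stand-in session rather than the real agent class: `StandInSession`, `Message`, and `inject_message` are hypothetical names, and only the method names `chat_ctx_copy` and `set_chat_ctx` are taken from the reply above. It shows why working on a copy matters: the live context changes only when the modified copy is written back.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    role: str
    text: str


@dataclass
class StandInSession:
    """Hypothetical stand-in for an agent session holding a chat context."""

    _messages: List[Message] = field(default_factory=list)

    def chat_ctx_copy(self) -> List[Message]:
        # Hand out a copy so callers cannot mutate the live context directly.
        return list(self._messages)

    async def set_chat_ctx(self, messages: List[Message]) -> None:
        # Replace the live context with the caller's modified copy.
        self._messages = list(messages)


async def inject_message(session: StandInSession, role: str, text: str) -> int:
    """Append one message via copy, modify, write back. Returns the new length."""
    ctx = session.chat_ctx_copy()
    ctx.append(Message(role=role, text=text))
    await session.set_chat_ctx(ctx)
    return len(ctx)


async def main() -> None:
    session = StandInSession()
    await inject_message(session, "user", "Manually added message")
    print(session.chat_ctx_copy())


if __name__ == "__main__":
    asyncio.run(main())
```

In a real agent, `inject_message` would be called from inside the job's own process, for example from an event handler defined in the entrypoint, since that is where the agent object lives.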
i
How would you do that dynamically? For example, in the example below, how would you access the context after the agent has already been deployed? Specifically, what I'm concerned about is that cli.run_app takes up the entire process. Is it possible to access the agent context after the agent has already been deployed?
import logging

from dotenv import load_dotenv
from livekit import rtc
from livekit.agents import (
    AutoSubscribe,
    JobContext,
    JobProcess,
    WorkerOptions,
    cli,
    llm,
    metrics,
)
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero

load_dotenv()
logger = logging.getLogger("voice-assistant")


def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()


async def entrypoint(ctx: JobContext):
    initial_ctx = llm.ChatContext().append(
        role="system",
        text=(
            "You are a voice assistant created by LiveKit. Your interface with users will be voice. "
            "You should use short and concise responses, and avoiding usage of unpronouncable punctuation."
        ),
    )

    logger.info(f"connecting to room {ctx.room.name}")
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

    # wait for the first participant to connect
    participant = await ctx.wait_for_participant()
    logger.info(f"starting voice assistant for participant {participant.identity}")

    dg_model = "nova-3-general"
    if participant.kind == rtc.ParticipantKind.PARTICIPANT_KIND_SIP:
        # use a model optimized for telephony
        dg_model = "nova-2-phonecall"

    agent = VoicePipelineAgent(
        vad=ctx.proc.userdata["vad"],
        stt=deepgram.STT(model=dg_model),
        llm=openai.LLM(),
        tts=openai.TTS(),
        chat_ctx=initial_ctx,
    )

    agent.start(ctx.room, participant)

    usage_collector = metrics.UsageCollector()

    @agent.on("metrics_collected")
    def _on_metrics_collected(mtrcs: metrics.AgentMetrics):
        metrics.log_metrics(mtrcs)
        usage_collector.collect(mtrcs)

    async def log_usage():
        summary = usage_collector.get_summary()
        logger.info(f"Usage: ${summary}")

    ctx.add_shutdown_callback(log_usage)

    await agent.say("Hello there! How can I help you today?", allow_interruptions=False)


if __name__ == "__main__":
    cli.run_app(
        WorkerOptions(
            entrypoint_fnc=entrypoint,
            prewarm_fnc=prewarm,
        ),
    )
t
Unfortunately, I couldn't find specific information on dynamically accessing an agent's context after deployment when using cli.run_app. This might require a deeper understanding of your specific implementation or additional system-level access to manage the agent's session dynamically. If this is a critical requirement, I recommend consulting with a LiveKit specialist or reviewing the LiveKit API documentation for advanced session management options. Let me know if there's anything else I can assist you with!
I'm sorry, but I don't have the information you need at the moment. If you would like a member of the LiveKit team to respond, please ask your question in one of the other channels.
i
What's the recommended approach for saving transcriptions? Does LiveKit offer something off the shelf? If not, how can I add my own function to process and save transcriptions?