abundant-father-33863
08/15/2025, 8:09 AM

tall-airline-16756
08/15/2025, 8:09 AM
1. Create the Access Token: Use the livekit.AccessToken class for this, making sure it includes the right grants for room access and for publishing on behalf of someone else.
2. Launch the Agent: Next, use a function like launch_avatar to send the connection details (room name, server URL, and token) over to the avatar worker. This usually means sending a POST request to the avatar dispatcher.
3. Control TTS Output: Once the agent is in the room, it can use handy methods like session.say() or generate_reply() to manage the TTS output. These let the agent specify what to say and handle any interruptions smoothly (see the short sketch after the launch_avatar example below).
Here’s a quick example of what the launch_avatar function might look like:
from dataclasses import asdict

import httpx
from livekit import api
from livekit.agents import JobContext

# AvatarConnectionInfo (a small dataclass holding room_name, url, and token) and
# ATTRIBUTE_PUBLISH_ON_BEHALF (the attribute key the avatar worker reads) are
# assumed to be defined elsewhere in your project.
async def launch_avatar(ctx: JobContext, avatar_dispatcher_url: str, avatar_identity: str):
    # Mint a token that lets the avatar worker join the room and publish
    # audio/video on behalf of the original agent participant.
    token = (
        api.AccessToken()
        .with_identity(avatar_identity)
        .with_name("Avatar Runner")
        .with_grants(api.VideoGrants(room_join=True, room=ctx.room.name))
        .with_kind("agent")
        .with_attributes({ATTRIBUTE_PUBLISH_ON_BEHALF: ctx.token_claims().identity})
        .to_jwt()
    )
    # Hand the connection details to the avatar dispatcher over HTTP.
    connection_info = AvatarConnectionInfo(room_name=ctx.room.name, url=ctx._info.url, token=token)
    async with httpx.AsyncClient() as client:
        response = await client.post(avatar_dispatcher_url, json=asdict(connection_info))
        response.raise_for_status()
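And here’s a minimal sketch of step 3, controlling the TTS output once the session is running. The entrypoint name, the provider choices (openai.LLM / openai.TTS), and the instruction strings are just placeholders for whatever your agent actually uses:

from livekit import agents
from livekit.agents import Agent, AgentSession
from livekit.plugins import openai  # placeholder providers; swap in your own

async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    # The session owns the LLM/TTS pipeline; the avatar consumes its audio output.
    session = AgentSession(llm=openai.LLM(), tts=openai.TTS())
    await session.start(
        agent=Agent(instructions="You are a friendly assistant."),
        room=ctx.room,
    )

    # Speak a fixed string through TTS.
    await session.say("Hi there! I'm connected and ready to help.")

    # Or let the LLM compose the reply and speak it; the session takes care
    # of interruptions for you.
    session.generate_reply(instructions="Greet the user and offer assistance.")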
If you want to dive deeper, check out the LiveKit docs on Agent Speech. They’ve got some great info to help you along!