# ask-ai
I'm following the LiveKit docs to create a basic agent that works with telephony in the v1.0 framework. I have defined an OpenAI Realtime agent like this:
```python
from livekit.agents import AgentSession
from livekit.plugins import openai

session = AgentSession(
    llm=openai.realtime.RealtimeModel(
        voice="coral",
    )
)
```
I've been trying to capture the transcript of the call (both agent and user) and output it at the end of the call, but I couldn't find anything in the docs or GitHub examples about how to do it. Is this feature available for multimodal agents? If so, is there any example of how to do it?
Thanks for reaching out, Santiago. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Thank you for your question! Capturing the transcript of a call (both agent and user) is indeed possible in the LiveKit v1.0 framework for multimodal agents. Here are some key points to help you:
• Transcriptions can be forwarded to clients automatically if your frontend supports receiving them. They are delivered in segments, each with a unique ID, and you can monitor their completion using the `final` property.
• For detailed guidance on creating transcriptions in the agent process, refer to the "Recording agent sessions" section of the documentation.
• The "Text and transcriptions" documentation provides updated information on handling transcriptions in v1.0.
These resources should help you implement the functionality you need. If you have further questions or run into any issues, feel free to ask!
Sources: Transcriptions | LiveKit Docs