red-agent-48266
09/13/2025, 12:50 PM
tall-airline-16756
09/13/2025, 12:50 PM
tall-airline-16756
09/13/2025, 12:51 PM
@livekit/agents-plugin-google package. This wonderful feature allows for smooth, low-latency, two-way interactions combining text, audio, and video inputs with audio and text outputs via the Gemini Live API.
To help you get started easily, here’s a simple step-by-step guide:
1. First, install the package using npm: npm install @livekit/agents-plugin-google
2. Then, use the RealtimeModel class to tap into Gemini's powerful multimodal capabilities.
3. Next, you can configure options like voice selection, temperature, instructions, and modality settings (such as text_and_audio, text_only, or audio_only), tailoring it to your needs.
4. Finally, authenticate using environment variables (GOOGLE_API_KEY) or directly within the constructor.
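Here is a minimal sketch of steps 3 and 4, in case it helps to see the options gathered in one place. It deliberately does not import the plugin, so it runs on its own. The GeminiRealtimeOptions interface, the buildRealtimeOptions helper, the field names, and the default values (including the "Puck" voice) are assumptions made for illustration, not the plugin's confirmed typings, so please check them against the package README before relying on them.

```typescript
// Hypothetical option shape: field names mirror the options listed above
// (voice, temperature, instructions, modality) and are NOT confirmed
// against the real RealtimeModel constructor typings.
type Modality = "text_and_audio" | "text_only" | "audio_only";

interface GeminiRealtimeOptions {
  voice: string;
  temperature: number;
  instructions: string;
  modalities: Modality;
  apiKey: string;
}

// Hypothetical helper: merges caller overrides onto illustrative defaults and
// resolves the API key from an explicit value first, then from GOOGLE_API_KEY.
function buildRealtimeOptions(
  overrides: Partial<GeminiRealtimeOptions> = {},
  env: Record<string, string | undefined> = {},
): GeminiRealtimeOptions {
  const apiKey = overrides.apiKey ?? env.GOOGLE_API_KEY;
  if (!apiKey) {
    throw new Error("Set GOOGLE_API_KEY or pass apiKey explicitly");
  }
  return {
    voice: "Puck", // example voice name, an assumption
    temperature: 0.8, // illustrative default
    instructions: "You are a helpful assistant.",
    modalities: "text_and_audio",
    ...overrides,
    apiKey, // always use the resolved key, even if overrides.apiKey was undefined
  };
}
```

In a Node.js agent you would typically call buildRealtimeOptions({ modalities: "audio_only" }, process.env) and pass the fields the plugin actually accepts to the RealtimeModel constructor. Resolving the key in one place keeps it out of your source code while still allowing an explicit override for tests.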
For a more detailed walkthrough, I highly recommend checking out the gemini-playground repository. It’s packed with example implementations, configuration presets, and a user-friendly interface that beautifully showcases multimodal use cases.
If at any point you have questions or need a hand with anything, please don’t hesitate to reach out—I’m here to help!
Sources: plugins/google/README.md | Google Gemini integration guide | LiveKit Docs | web/src/components/auth.tsx