elegant-noon-41140
09/11/2025, 6:18 PM
tall-airline-16756
09/11/2025, 6:18 PM
tall-airline-16756
09/11/2025, 6:20 PM
tall-airline-16756
09/11/2025, 6:26 PM
1. Override the stt_node Method:
By overriding the stt_node method in your agent class, you can handle speech-to-text (STT) events directly. This way, you can check the incoming transcriptions and look out for particular keywords.
async def stt_node(self, text: AsyncIterable[str], model_settings: Optional[dict] = None):
    parent_stream = super().stt_node(text, model_settings)
    keywords = ["quiz"]

    async def process_stream():
        async for event in parent_stream:
            # Inspect only final transcripts that carry at least one alternative
            if hasattr(event, 'type') and str(event.type) == "SpeechEventType.FINAL_TRANSCRIPT" and event.alternatives:
                transcript = event.alternatives[0].text
                for keyword in keywords:
                    if keyword.lower() in transcript.lower():
                        # Trigger your frontend action here
                        print(f"Keyword detected: '{keyword}'")
            # Pass every event through so the rest of the pipeline keeps working
            yield event

    return process_stream()
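If it helps to see where this lives, here's a minimal sketch of the override inside an agent subclass; the class name and instructions are placeholders, and the typing imports are the ones covered later in this thread:
from typing import AsyncIterable, Optional

from livekit.agents import Agent

class KeywordAgent(Agent):  # hypothetical name
    def __init__(self):
        super().__init__(instructions="You are a helpful assistant.")

    async def stt_node(self, text: AsyncIterable[str], model_settings: Optional[dict] = None):
        # Plug in the keyword-detection version shown above
        ...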
2. Trigger Frontend Actions:
Once your agent picks up the keyword (like "quiz"), you can have it trigger a custom action—maybe sending a message to your frontend app. This could be done via WebSocket, HTTP requests, or whichever communication method works best for you.
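As one concrete option, here's a rough sketch that notifies the frontend over LiveKit's data channel. It assumes you have the Room object available (for instance ctx.room from your entrypoint, passed into your agent), and the "keyword-events" topic is a placeholder your frontend would listen for:
import json

from livekit import rtc

async def notify_frontend(room: rtc.Room, keyword: str):
    # Publish a small JSON payload on a topic the frontend subscribes to
    payload = json.dumps({"type": "keyword_detected", "keyword": keyword})
    await room.local_participant.publish_data(payload, topic="keyword-events")
On the frontend side, you'd listen for the matching data-received event (RoomEvent.DataReceived in the JS SDK) and parse the payload.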
3. Optional: Use Keyword Boosting (Google STT):
If you’re using Google STT, there’s a handy feature to boost recognition accuracy for specific keywords by giving them boost values. Just remember, you’ll still want to process the output in stt_node to detect and respond to those keywords.
stt=google.STT(keywords=[("quiz", 2.0)])
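For context, here's roughly where that argument slots in, assuming the Google plugin is installed; the rest of the session configuration is elided and would come from your existing setup:
from livekit.agents import AgentSession
from livekit.plugins import google

# Sketch only: boost recognition of "quiz"; llm, tts, vad, etc. omitted
session = AgentSession(
    stt=google.STT(keywords=[("quiz", 2.0)]),
)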
If you’d like, I’m here to help you with any part of implementing this—just let me know!
Sources: Pipeline nodes and hooks | LiveKit Docs | Docs::Agents | livekit-agents/livekit/agents/voice/agent_activity.py | Function calling with Voice Agents | LiveKit Docs
elegant-noon-41140
09/11/2025, 6:36 PM
tall-airline-16756
09/11/2025, 6:37 PM
AsyncIterable isn’t defined in the global namespace by default. No worries, though! To fix this, you just need to import it from the typing module by adding the following line at the very top of your Python file:
from typing import AsyncIterable
This little step will make sure AsyncIterable is recognized and ready to use in your type hints. If you run into any other questions or need a hand, feel free to reach out; I’m here to help!
Sources: Text-to-speech (TTS) integrations | LiveKit Docs | Pipeline nodes and hooks | LiveKit Docs | Docs::Agents
elegant-noon-41140
09/11/2025, 6:39 PM
tall-airline-16756
09/11/2025, 6:40 PM
Optional isn’t defined in the global namespace by default. No worries, though! To fix this, you just need to import it from the typing module by adding the following line at the very top of your Python file:
from typing import Optional
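By the way, since the earlier fix needed AsyncIterable from the same module, both imports can share one line if you prefer; this is plain Python, nothing LiveKit-specific:
from typing import AsyncIterable, Optional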
This little step will make sure Optional is recognized and ready to use in your type hints. If you run into any other questions or need a hand with something else, feel free to reach out; I’m here to help!
elegant-noon-41140
09/11/2025, 6:44 PM
tall-airline-16756
09/11/2025, 6:44 PM
1. Enable Debug Logging:
First, turn on debug logging so you get a clear view of all events and transcripts being processed:
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)
Then, log the transcript inside your stt_node method, where the transcripts are handled, to see exactly what’s being processed:
logger.info(f"Transcript received: '{transcript}'")
2. Verify Keyword Matching Logic:
Next, double-check that your keyword detection logic is spot on. For example:
if "quiz" in transcript.lower():
logger.info("Keyword 'quiz' detected!")
This will help confirm if the spoken word matches what you’re looking for.
3. Check STT Output:
It’s important to make sure your Speech-to-Text (STT) engine is providing final transcripts. You can add some logging like this:
if hasattr(event, 'type') and str(event.type) == "SpeechEventType.FINAL_TRANSCRIPT":
logger.info("Final transcript received.")
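As a side note, if your installed livekit-agents version exposes the SpeechEventType enum (recent 1.x releases do, but double-check yours), comparing against the enum directly is a bit sturdier than matching its string form:
from livekit.agents import stt

# Enum comparison avoids depending on the exact str() rendering of the event type
if event.type == stt.SpeechEventType.FINAL_TRANSCRIPT:
    logger.info("Final transcript received.")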
4. Adjust STT Configuration:
If you’re working with Deepgram, double-check your configuration. For instance, Nova-3 models use keyterms instead of keywords:
stt=deepgram.STT(keyterms=["quiz"])
If any of these steps feel tricky or you want me to help you dig deeper, just let me know—I’m here to support you!
Sources: README.md | livekit-agents/livekit/agents/voice/agent_activity.py | Pipeline nodes and hooks | LiveKit Docs
elegant-noon-41140
09/11/2025, 6:52 PM
tall-airline-16756
09/11/2025, 6:52 PM
The deepgram module hasn't been imported in your code yet. No worries, let's walk through the steps together to get this sorted out:
1. Install the Deepgram Plugin:
First, make sure the Deepgram plugin is installed by running this command:
pip install "livekit-agents[deepgram]~=1.2"
2. Import the Deepgram Module:
Next, add this import statement at the very top of your Python file:
from livekit.plugins import deepgram
3. Set the API Key:
Don't forget to set your Deepgram API key as an environment variable like this:
DEEPGRAM_API_KEY=your_api_key_here
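If you'd rather keep the key in a .env file, one common pattern is to load it with the python-dotenv package; that's an extra dependency, so treat this as optional:
from dotenv import load_dotenv

load_dotenv()  # reads DEEPGRAM_API_KEY from a local .env file into os.environ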
Once you've done these steps, you should be all set to use deepgram.STT without any issues. If you run into any trouble or have more questions, please feel free to reach out; I'm here to help!
Sources: agent.py | pipeline-tts/elevenlabs_change_language.py | Deepgram TTS integration guide | LiveKit Docs | Deepgram STT integration guide | LiveKit Docs | livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/__init__....
tall-airline-16756
09/11/2025, 7:02 PM
1. Set Up Logging:
import logging
logging.basicConfig(level=logging.INFO)
Also, instead of using print(), try using logger.info() for better logging; it really helps keep things organized.
2. Debug the stt_node Method:
Let’s ensure your stt_node method is correctly handling the transcripts. Here’s an updated version you can try:
async def stt_node(self, text: AsyncIterable[str], model_settings: Optional[dict] = None):
    parent_stream = super().stt_node(text, model_settings)
    if parent_stream is None:
        return None
    keywords = ["quiz"]

    async def process_stream():
        async for event in parent_stream:
            if hasattr(event, 'type') and str(event.type) == "SpeechEventType.FINAL_TRANSCRIPT" and event.alternatives:
                transcript = event.alternatives[0].text
                logger.info(f"Transcript received: '{transcript}'")  # Log the transcript
                for keyword in keywords:
                    if keyword.lower() in transcript.lower():
                        logger.info(f"Keyword detected: '{keyword}'")  # Log keyword detection
            yield event

    return process_stream()
3. Test with Actual Speech:
When you test, try saying the word "quiz" clearly and see if it shows up in the transcript. If it doesn’t, it might be worth double-checking your STT engine settings.
4. Verify STT Configuration:
If you’re using Deepgram, make sure it’s properly set up. For example:
stt = deepgram.STT(keyterms=["quiz"])
Also, confirm that your Deepgram API key is correctly set as an environment variable. Just a heads-up: the keyterms parameter helps improve recognition but doesn’t directly trigger actions; the keyword detection itself happens in your stt_node method.
If you run into any issues or want to explore this further, please don’t hesitate to reach out—I’m here to help!
Sources: Docs::Agents | Deepgram integration guide | LiveKit Docs | home_assistant/homeautomation.py | Pipeline nodes and hooks | LiveKit Docs