# ask-ai
a
How to send the DTMF keys to the LLM, skipping the STT
t
Thanks for reaching out, Dvir. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Could you please provide more details about your use case?
• Are you using LiveKit's SDK in a specific programming language?
• Are you currently integrating with any speech-to-text (STT) services?
• What exactly do you mean by sending DTMF keys to the "LLM"? Do you mean sending them as signaling within a LiveKit room, or something else?
a
Yes, LiveKit Python. I'm trying to have the first message come from DTMF but the others from the STT model (Deepgram).
t
Hey there! I totally get what you're looking to do with sending a DTMF key as the first message in LiveKit. Let me break this down. 😊

You'll want to use the `publish_dtmf` method on your `local_participant` in the LiveKit room; it's the straightforward way to send that initial DTMF signal. For the audio messages that follow, you can set up the Deepgram STT model in your agent session to handle speech-to-text transcription. The trick is building a workflow that skips STT for the first DTMF message and then enables it for everything after. Here's a quick roadmap (see the sketch after this list):
• Connect to your LiveKit room using the Python SDK.
• Send that first DTMF code with `await room.local_participant.publish_dtmf(code=1, digit='1')`.
• Get Deepgram STT ready for processing subsequent audio.
• Set up an agent session that listens for audio and uses Deepgram STT for transcription after the initial DTMF.

Just a heads up: you'll need to manage your app's state to handle that first message differently, but it's totally doable! Want me to walk you through a detailed code example? šŸš€

Sources: AI Agents Framework | livekit/livekit_composite | DeepWiki | complex-agents/ivr-agent/README.md | Handling DTMF | LiveKit Docs | Deepgram STT integration guide | LiveKit Docs | livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/stt.py | Deepgram integration guide | LiveKit Docs | ivr-nagivator/README.mdoc
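A minimal sketch of that roadmap, assuming the `livekit-rtc` and `livekit-agents` packages plus the Deepgram plugin are installed; connection setup (URL, token) and the agent definition are elided, the `run` helper is illustrative, and the exact `AgentSession` constructor may differ across versions:

```python
from livekit import rtc
from livekit.agents import AgentSession
from livekit.plugins import deepgram

async def run(room: rtc.Room) -> None:
    # Later turns are transcribed by Deepgram STT, configured once on the session.
    session = AgentSession(stt=deepgram.STT())

    # The first "message" is a SIP DTMF digit instead of speech.
    # `code` is the RFC 4733 event code; `digit` is its character form.
    await room.local_participant.publish_dtmf(code=1, digit="1")
```

You'd still start the session against an agent and the room as usual; the point here is only that the first message is signaled as DTMF rather than spoken.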
a
Can you show me the snippet? I don't want to publish the DTMF; I want the LLM to get the DTMF code, as I said.
t
Hey there! I totally understand your challenge with sending DTMF codes in LiveKit. There's no built-in method to send control data to the LLM while skipping speech-to-text, but there's a workaround that might solve your problem. This Python snippet shows how you can use `publish_data` to send DTMF codes as custom data, bypassing the speech-to-text process entirely:

```python
# Send the DTMF code as custom data, not as audio or a standard DTMF tone.
await room.local_participant.publish_data(
    b"1",  # DTMF code "1" sent as bytes
    topic="dtmf_code",
)
```

The cool part? Your system can have the LLM (or your custom logic) listen for the `dtmf_code` topic in the room's data messages and handle it directly, no speech-to-text needed. Just keep in mind that you'll need some custom handling, since LiveKit doesn't automatically route these data messages: you control how they get interpreted and processed in your backend or agent logic. Want me to dive deeper into this approach? I'm totally here to help!

Sources: test/integration/sip_test.go | ivr-nagivator/README.mdoc | packages/livekit-rtc/src/participant.ts
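For the receiving side, a minimal sketch using the `livekit-rtc` SDK's `data_received` room event; the `dtmf_code` topic matches the snippet above, and `handle_dtmf` is a hypothetical hook standing in for your own logic:

```python
from livekit import rtc

def handle_dtmf(digit: str) -> None:
    # Placeholder: route the digit to your LLM / agent logic.
    print(f"DTMF digit received: {digit}")

def setup_dtmf_listener(room: rtc.Room) -> None:
    @room.on("data_received")
    def on_data(packet: rtc.DataPacket) -> None:
        # React only to our custom DTMF topic; other data messages are ignored.
        if packet.topic == "dtmf_code":
            handle_dtmf(packet.data.decode("utf-8"))
```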
a
not clear
t
Hey there! Let me break down how LiveKit handles DTMF tones in a way that's super straightforward. 😊 Just so we're clear: LiveKit doesn't automatically send DTMF tones directly to your LLM or skip speech-to-text processing. But don't worry, there's a workaround! You can send the DTMF code as a custom data message in the room, which your application or agent can catch and forward to the LLM. Here's a simple step-by-step guide:
1. Use `publish_data` on the local participant to send the DTMF digit as a data message with a topic like `"dtmf_code"`.
2. Have your agent or backend listen for data messages in the room.
3. When a message arrives on the `"dtmf_code"` topic, your code can send it directly to the LLM.
4. For all other audio, use your usual speech-to-text pipeline (like Deepgram) as you normally would.

Here's a quick Python snippet showing how to send the DTMF code as data:
```python
await room.local_participant.publish_data(
    b"1",  # DTMF digit as bytes
    topic="dtmf_code",
)
```
The cool part? This way, the DTMF digit isn't published as audio and doesn't go through speech-to-text; it arrives as a data message your application can handle easily. Would you like help listening for these data messages and forwarding them to your LLM in Python? One possible shape for that is sketched below. šŸ‘
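A hedged sketch of that forwarding step, building on the listener above. It assumes a `livekit-agents` `AgentSession` whose `generate_reply` accepts a `user_input` string (this surface has changed between releases, so check your installed version); `forward_dtmf_to_llm` is an illustrative helper, not a LiveKit API:

```python
import asyncio

from livekit import rtc
from livekit.agents import AgentSession

def forward_dtmf_to_llm(room: rtc.Room, session: AgentSession) -> None:
    async def forward(digit: str) -> None:
        # Assumed API: generate_reply() hands the text straight to the LLM,
        # so the digit never touches the STT pipeline.
        await session.generate_reply(
            user_input=f"The caller pressed the DTMF key {digit}"
        )

    @room.on("data_received")
    def on_data(packet: rtc.DataPacket) -> None:
        if packet.topic == "dtmf_code":
            # Room events fire on the event loop, so schedule the async work.
            asyncio.ensure_future(forward(packet.data.decode("utf-8")))
```

After the first DTMF-driven turn, normal speech keeps flowing through the session's Deepgram STT unchanged.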
a
no help whatsoever
t
I don't have the answer you're looking for. You could also try asking your question:
• in one of the other Slack channels, or
• at https://deepwiki.com/livekit/livekit_composite, which is trained on all LiveKit source code.
If you find the answer, please post it here to help others!