# ask-ai
**Trevor:**
Is there a way to capture Deepgram's confidence in transcription?
**AI Assistant:**
Thanks for reaching out, Trevor. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.
Yes, Deepgram's confidence in transcription can be captured. The confidence value is included in the `SpeechData` object, specifically in its `confidence` field. This value is extracted from Deepgram's transcription results and can be accessed programmatically. If you need further details on how to implement this, feel free to ask!
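For instance, if you are working with the LiveKit Agents `stt` interfaces (which is where `SpeechData` appears to come from in this thread), a minimal sketch of reading the confidence off a speech event could look like the following; the exact event shape here is an assumption based on those interfaces:

```python
from livekit.agents import stt


def log_confidences(event: stt.SpeechEvent) -> None:
    # A SpeechEvent carries one or more SpeechData alternatives; the
    # confidence field on each alternative is populated from Deepgram.
    for alt in event.alternatives:
        print(f"confidence={alt.confidence:.2f} text={alt.text!r}")
```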
r
Can I add it to `session.history`?
t
The available documentation does not specifically mention adding Deepgram's confidence values to `session.history`. However, if you have access to the session object and its history, you could potentially include the confidence values as part of the data you store. This depends on how your application is structured and whether the session history supports custom data fields. If you need further assistance with implementation, feel free to provide more details!
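As a rough illustration only (not based on documented behavior), here is a minimal sketch that treats the session history as an appendable list of dicts and stores the confidence as extra metadata next to the transcript. The `session` object, its history structure, and the helper name are assumptions; `sd` is an `stt.SpeechData` instance like the ones shown later in this thread:

```python
def record_transcript(session, sd) -> None:
    # Hypothetical: assumes session.history is a plain appendable list of dicts.
    # Replace with whatever structure your session actually exposes.
    session.history.append(
        {
            "role": "user",
            "text": sd.text,
            "confidence": sd.confidence,  # Deepgram's confidence, kept as custom metadata
        }
    )
```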
**Trevor:**
Can you give an example of accessing the confidence at runtime?
**AI Assistant:**
Here is an example of accessing Deepgram's confidence at runtime:
```python
from livekit.agents import stt

# From a live transcription result: `data` is the parsed Deepgram payload and
# `language` is the configured language, both provided by the surrounding handler.
dg_alts = data["channel"]["alternatives"]
for alt in dg_alts:
    sd = stt.SpeechData(
        language=language,
        start_time=alt["words"][0]["start"] if alt["words"] else 0,
        end_time=alt["words"][-1]["end"] if alt["words"] else 0,
        confidence=alt["confidence"],
        text=alt["transcript"],
    )
    print(sd.confidence)  # the confidence for this alternative, available at runtime
```
In this example, the confidence value is extracted from the alternative object returned by Deepgram as `alt["confidence"]`. You can then use this value as needed in your application. Let me know if you need further clarification!
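For context, here is an abridged sketch of the kind of live-transcription payload Deepgram streams back (the values are made up); the confidence sits on each entry of `channel.alternatives`:

```python
# Illustrative, abridged Deepgram live-transcription result (values invented).
data = {
    "channel": {
        "alternatives": [
            {
                "transcript": "hello world",
                "confidence": 0.98,
                "words": [
                    {"word": "hello", "start": 0.08, "end": 0.32},
                    {"word": "world", "start": 0.40, "end": 0.72},
                ],
            }
        ]
    },
    "is_final": True,
}

print(data["channel"]["alternatives"][0]["confidence"])  # -> 0.98
```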
**Trevor:**
Where do I access `data.channel.alternatives`, though?
**AI Assistant:**
You can access `data.channel.alternatives` in the Deepgram transcription response when processing the transcription data. For example, in the `live_transcription_to_speech_data` function, the alternatives are extracted as follows:
```python
from typing import List

from livekit.agents import stt


def live_transcription_to_speech_data(language: str, data: dict) -> List[stt.SpeechData]:
    # Deepgram nests its hypotheses under channel.alternatives in a live result.
    dg_alts = data["channel"]["alternatives"]

    speech_data = []
    for alt in dg_alts:
        sd = stt.SpeechData(
            language=language,
            start_time=alt["words"][0]["start"] if alt["words"] else 0,
            end_time=alt["words"][-1]["end"] if alt["words"] else 0,
            confidence=alt["confidence"],
            text=alt["transcript"],
        )
        if language == "multi" and "languages" in alt:
            sd.language = alt["languages"][0]  # TODO: handle multiple languages
        speech_data.append(sd)
    return speech_data
```
This function processes the transcription data and extracts the alternatives from the `data.channel.alternatives` field. You can adapt this approach to your specific use case. Let me know if you need further assistance!
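As a quick usage sketch, you could feed a payload like the abridged one shown earlier into that helper and act on each confidence value; the 0.6 threshold below is an arbitrary example:

```python
speech_data = live_transcription_to_speech_data("en", data)
for sd in speech_data:
    if sd.confidence < 0.6:  # arbitrary example threshold
        print(f"low confidence ({sd.confidence:.2f}): {sd.text}")
    else:
        print(f"ok ({sd.confidence:.2f}): {sd.text}")
```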