I am using the voice pipeline agent. But I already...
# ask-ai
m
I am using the voice pipeline agent. But I already have a system that gets the live transcript. I don't want to use LiveKit's STT. I want to be able to manually feed in the text since it's coming from my other system, but then still use the LiveKit agent for the LLM and TTS. Is this possible? @millions-hair-24956
g
Yes, it is possible to use your own system for live transcription and then feed the text into the LiveKit agent for further processing with the LLM and TTS. You can do this with the `before_llm_cb` callback in the `VoicePipelineAgent`, which lets you intercept and modify the chat context before it is sent to the LLM. According to the VoicePipelineAgent documentation, this is the point where you can insert the text from your system into the conversation flow. Would you like more details on how to implement this?

Sources:
- VoicePipelineAgent documentation
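As a rough illustration of that injection step, here is a minimal sketch of buffering text from an external transcription system and draining it into the chat context inside a callback. It deliberately avoids importing LiveKit: `StandInChatContext`, `TranscriptBuffer`, and `inject_external_transcript` are hypothetical names, and the stand-in context only mimics the `append(role=..., text=...)` call used in this thread. Treat it as a pattern under those assumptions, not the library's API.

```python
from collections import deque
from dataclasses import dataclass, field
from threading import Lock


@dataclass
class StandInChatContext:
    """Hypothetical stand-in for the agent's chat context (not the LiveKit class)."""

    messages: list = field(default_factory=list)

    def append(self, *, role: str, text: str) -> "StandInChatContext":
        self.messages.append({"role": role, "text": text})
        return self


class TranscriptBuffer:
    """Collects finalized transcript segments pushed by the external system."""

    def __init__(self) -> None:
        self._segments = deque()
        self._lock = Lock()

    def push(self, text: str) -> None:
        # Ignore empty or whitespace-only segments.
        text = text.strip()
        if text:
            with self._lock:
                self._segments.append(text)

    def drain(self) -> str:
        # Return everything buffered so far as one string and reset the buffer.
        with self._lock:
            joined = " ".join(self._segments)
            self._segments.clear()
        return joined


buffer = TranscriptBuffer()


def inject_external_transcript(agent, chat_ctx):
    """Callback with the (agent, chat_ctx) shape discussed in this thread."""
    text = buffer.drain()
    if text:
        chat_ctx.append(role="user", text=text)
    return chat_ctx
```

Your other system would call `buffer.push(...)` whenever it finalizes a segment, and the callback would add whatever has accumulated as a single user message. Draining on each call keeps the same text from being injected twice.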
m
But if I'm just throwing out the LiveKit transcript then isn't LiveKit doing all that transcription for nothing? It's duplicating the effort and wasting money transcribing when I'm just going to throw it out and use the transcription from my other system. @millions-hair-24956
g
It might be possible to disable the STT component in the `VoicePipelineAgent` to avoid unnecessary transcription and associated costs. You could potentially create a custom subclass of the `VoicePipelineAgent` that bypasses the STT process, or configure the agent to not use STT at all. According to a Slack discussion, one approach is to remove or replace the TTS component with a no-op (no operation) subclass, which suggests a similar approach could be applied to the STT component. This would allow you to use your own transcription system without incurring costs from LiveKit's STT. Would you like more detailed guidance on how to implement this?

Sources:
- Slack discussion
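To make the no-op idea concrete, here is a small sketch of the general pattern: a component that satisfies a transcriber interface but discards the audio and never produces text. The `Transcriber` protocol, `NoOpTranscriber`, and `handle_audio` are hypothetical and much simpler than LiveKit's real STT base class, whose abstract methods would need to be checked in the installed version before subclassing it.

```python
from typing import Optional, Protocol


class Transcriber(Protocol):
    """Hypothetical minimal STT interface; LiveKit's real base class differs."""

    def transcribe(self, audio: bytes) -> Optional[str]: ...


class NoOpTranscriber:
    """Accepts audio and discards it, so no transcription request is ever made."""

    def __init__(self) -> None:
        self.frames_dropped = 0

    def transcribe(self, audio: bytes) -> Optional[str]:
        # Count the frame for debugging, then drop it without calling any service.
        self.frames_dropped += 1
        return None


def handle_audio(stt: Transcriber, audio: bytes) -> Optional[str]:
    """Pipeline stage that tolerates an STT which never yields text."""
    return stt.transcribe(audio)
```

The design point is that the rest of the pipeline keeps its shape while the text comes from elsewhere. Whether the agent tolerates an STT that never emits a transcript is something to verify by testing.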
m
Yes please, I would like more detailed guidance on how to implement this @millions-hair-24956
g
To implement a solution where you bypass LiveKit's STT and use your own transcription system, you can customize the `VoicePipelineAgent` to exclude the STT component. Here's a step-by-step guide on how you might achieve this:

1. Subclass the VoicePipelineAgent: Create a custom subclass of `VoicePipelineAgent` that omits the STT component. This involves overriding the initialization to exclude STT-related setup.
2. Modify the Initialization: In your subclass, remove or bypass the STT initialization. You can set the STT parameter to `None` or a no-op equivalent if the framework allows.
3. Inject Custom Transcriptions: Use the `before_llm_cb` callback to inject your transcriptions into the chat context. This callback allows you to modify the chat context before it is sent to the LLM.
4. Example Code:
```python
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, openai, silero


class CustomVoicePipelineAgent(VoicePipelineAgent):
    def __init__(self, vad, llm, tts, chat_ctx=None, **kwargs):
        # Assumption: the framework accepts stt=None. Verify this against your
        # installed version, or pass a no-op STT instead.
        super().__init__(vad=vad, stt=None, llm=llm, tts=tts, chat_ctx=chat_ctx, **kwargs)


def custom_before_llm_cb(agent, chat_ctx):
    # Inject your transcription here
    chat_ctx.append(role="user", text="Your custom transcription")
    return chat_ctx


# Usage
agent = CustomVoicePipelineAgent(
    vad=silero.VAD.load(),
    llm=openai.LLM(),
    tts=cartesia.TTS(),
    before_llm_cb=custom_before_llm_cb,
)
```
5. Test and Validate: Ensure that your custom agent correctly processes the transcriptions from your system and interacts with the LLM and TTS as expected.

This approach allows you to leverage your existing transcription system while still utilizing LiveKit's LLM and TTS capabilities. If you encounter any specific issues during implementation, feel free to ask for further assistance.

Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1729597029.187099
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1729541882.418149