mammoth-smartphone-64668
02/28/2025, 3:57 PM

gray-monkey-83525
02/28/2025, 3:57 PM
You can use the before_llm_cb callback in the VoicePipelineAgent to modify the chat context before it is sent to the LLM. This allows you to inject your own transcriptions into the conversation flow.
According to the VoicePipelineAgent documentation, you can set up a before_llm_cb callback to intercept and modify the chat context. This way, you can insert the text from your system into the chat context before it reaches the LLM.
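For illustration, here is a minimal sketch of such a callback. It assumes the livekit-agents 0.x API (ChatContext.append with keyword-only role and text arguments); fetch_external_transcription is a hypothetical stand-in for your own transcription source:

```python
from livekit.agents.llm import ChatContext
from livekit.agents.pipeline import VoicePipelineAgent

def fetch_external_transcription() -> str:
    # Hypothetical stand-in for your own transcription system.
    return "text produced by your external STT"

def before_llm_cb(agent: VoicePipelineAgent, chat_ctx: ChatContext) -> ChatContext:
    # Append the externally produced transcription as a user message
    # before the context is forwarded to the LLM.
    chat_ctx.append(role="user", text=fetch_external_transcription())
    return chat_ctx
```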
Would you like more details on how to implement this?
Sources:
- VoicePipelineAgent documentation

mammoth-smartphone-64668
02/28/2025, 4:00 PM

gray-monkey-83525
02/28/2025, 4:00 PM
To avoid unnecessary transcription and the associated costs, you can modify the VoicePipelineAgent so it does not run its own STT. You could create a custom subclass of the VoicePipelineAgent that bypasses the STT process, or configure the agent to not use STT at all.
According to a Slack discussion, one approach is to remove or replace the TTS component with a no-op (no operation) subclass, which suggests a similar approach could be applied to the STT component. This would allow you to use your own transcription system without incurring costs from LiveKit's STT.
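As a rough, untested sketch of that idea, a no-op STT might look like the following. This assumes the livekit-agents 0.x stt interfaces (stt.STT, STTCapabilities, SpeechEvent, SpeechEventType, SpeechData); the abstract method name and signature, notably _recognize_impl, differ between releases, so verify against your installed version:

```python
from livekit.agents import stt, utils

class NoOpSTT(stt.STT):
    """STT stand-in that never produces transcriptions.
    Assumption: 0.x-era interface; check your version's stt.STT."""

    def __init__(self):
        super().__init__(
            capabilities=stt.STTCapabilities(streaming=False, interim_results=False)
        )

    async def _recognize_impl(self, buffer: utils.AudioBuffer, *, language=None):
        # Return an empty final transcript so the pipeline has nothing to act on.
        return stt.SpeechEvent(
            type=stt.SpeechEventType.FINAL_TRANSCRIPT,
            alternatives=[stt.SpeechData(language=language or "en", text="")],
        )
```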
Would you like more detailed guidance on how to implement this?
Sources:
- Slack discussion

mammoth-smartphone-64668
02/28/2025, 4:06 PM

gray-monkey-83525
02/28/2025, 4:06 PM
To use your own transcription system, you can subclass the VoicePipelineAgent to exclude the STT component. Here's a step-by-step guide on how you might achieve this:
1. Subclass the VoicePipelineAgent: Create a custom subclass of VoicePipelineAgent that omits the STT component. This involves overriding the initialization to exclude STT-related setup.
2. Modify the Initialization: In your subclass, remove or bypass the STT initialization. You can set the STT parameter to None or a no-op equivalent if the framework allows.
3. Inject Custom Transcriptions: Use the before_llm_cb callback to inject your transcriptions into the chat context. This callback allows you to modify the chat context before it is sent to the LLM.
4. Example Code:
```python
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import cartesia, openai, silero

class CustomVoicePipelineAgent(VoicePipelineAgent):
    def __init__(self, vad, llm, tts, chat_ctx=None, **kwargs):
        # Pass stt=None so the pipeline skips its own transcription step
        # (swap in a no-op STT if your version requires a valid instance).
        super().__init__(vad=vad, stt=None, llm=llm, tts=tts, chat_ctx=chat_ctx, **kwargs)

def custom_before_llm_cb(agent, chat_ctx):
    # Inject your transcription here
    chat_ctx.append(role="user", text="Your custom transcription")
    return chat_ctx

# Usage
agent = CustomVoicePipelineAgent(
    vad=silero.VAD.load(),
    llm=openai.LLM(),
    tts=cartesia.TTS(),
    before_llm_cb=custom_before_llm_cb,
)
```
5. Test and Validate: Ensure that your custom agent correctly processes the transcriptions from your system and interacts with the LLM and TTS as expected.
This approach allows you to leverage your existing transcription system while still utilizing LiveKit's LLM and TTS capabilities. If you encounter any specific issues during implementation, feel free to ask for further assistance.
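If it helps, here is a sketch of how the custom agent from the example above could be wired into a worker entrypoint. This follows the usual livekit-agents 0.x worker pattern (JobContext, agent.start, cli.run_app); adapt it to your deployment:

```python
from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli
from livekit.plugins import cartesia, openai, silero

async def entrypoint(ctx: JobContext):
    # Subscribe to audio only; video tracks are not needed here.
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
    participant = await ctx.wait_for_participant()

    # Reuses CustomVoicePipelineAgent and custom_before_llm_cb
    # from the example above.
    agent = CustomVoicePipelineAgent(
        vad=silero.VAD.load(),
        llm=openai.LLM(),
        tts=cartesia.TTS(),
        before_llm_cb=custom_before_llm_cb,
    )
    # With stt=None, the agent relies on the transcriptions injected in
    # before_llm_cb rather than on its own STT.
    agent.start(ctx.room, participant)

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```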
Sources:
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1729597029.187099
- https://livekit-users.slack.com/archives/C07FY8WHGPM/p1729541882.418149