For the testing and evaluation is there a way to have an OR LiveKit Community #ask-ai

Join Slack

For the testing and evaluation, is there a way to ...

# ask-ai

ancient-hospital-67205

08/05/2025, 6:21 PM

For the testing and evaluation, is there a way to have an OR assertion ? https://docs.livekit.io/agents/build/testing/

tall-airline-16756

08/05/2025, 6:21 PM

Thanks for reaching out, Cesar. I'm an AI assistant and I'm here to help. Give me a moment while I research this topic.

tall-airline-16756

08/05/2025, 6:21 PM

LiveKit does not have a direct OR assertion operator in its testing and evaluation framework. However, you can achieve similar functionality using the following approaches: • Use
contains
Methods: Combine methods like

contains_message()

and

contains_function_call()

to check for multiple possible events. • Indexed Access: Use event indexing to assert different conditions at specific positions. • Leverage `judge()`: Use the

judge()

method with an LLM to evaluate flexible conditions, including OR logic. • Chain Assertions: Write multiple assertions to cover different scenarios. For example, you can test for either a message greeting or a function call by chaining assertions or using the

judge()

method to describe the intent. If you need help implementing any of these approaches, let me know!

ancient-hospital-67205

08/05/2025, 6:22 PM

i wanna chain assertions, could you give me an exmaple to test for either a message or a function call?

tall-airline-16756

08/05/2025, 6:22 PM

Here’s a handy example showing how you can chain assertions to test for either a message or a function call:

Copy code

# Test for either a message or a function call
@pytest.mark.asyncio
async def test_agent_response():
    async with (
        openai.LLM(model="gpt-4o-mini") as llm,
        AgentSession(llm=llm) as session,
    ):
        await session.start(Assistant())
        result = await session.run(user_input="Hello")

        try:
            # First scenario: Check for a message
            result.expect.next_event().is_message(role="assistant")
        except AssertionError:
            try:
                # Second scenario: Check for a function call
                result.expect.next_event().is_function_call(name="greet")
            except AssertionError:
                # If neither scenario matches, fail the test
                raise AssertionError("Neither a message nor a function call was matched.")

This example first tries to see if the next event is a message from the assistant. If that doesn’t pan out, it checks for a function call named "greet." If neither fits, the test will let you know by failing. Feel free to ask if you want me to walk you through it more or if you'd like some extra examples!

ancient-hospital-67205

08/05/2025, 6:28 PM

doesnt that advances the cursor on the second scenario?

tall-airline-16756

08/05/2025, 6:28 PM

Hey there! So, when you use

next_event()

in chained assertions, it actually moves the cursor forward to the next event. That means if your first scenario doesn’t pass and you move on to the second one, the cursor might have already skipped the event you wanted to check. To get around this, you can use indexed access instead of

next_event()

. This way, you can directly point to specific events without moving the cursor. Here’s a little example:

Copy code

# Test for either a message or a function call without advancing the cursor
@pytest.mark.asyncio
async def test_agent_response():
    async with (
        openai.LLM(model="gpt-4o-mini") as llm,
        AgentSession(llm=llm) as session,
    ):
        await session.start(Assistant())
        result = await session.run(user_input="Hello")

        # Check the first event for either a message or a function call
        first_event = result.expect[0]
        try:
            first_event.is_message(role="assistant")
        except AssertionError:
            first_event.is_function_call(name="greet")

This way, you’re making sure you’re checking the same event for both conditions, without accidentally moving past it. If you want me to walk you through it more or need some extra examples, just let me know—I’m happy to help!

ancient-hospital-67205

08/05/2025, 9:25 PM

is there a way i can have tests run 10 times each and get statisics of how many times out of 10 each test failed?

tall-airline-16756

08/05/2025, 9:25 PM

Absolutely! You can run LiveKit tests as many times as you like and track the failure rates using the

tool_bench.py

script. It has a handy

-n

option where you can decide how many times each test runs (it defaults to 10). The script keeps track of both successes and failures, so you can easily see the failure rates across all those runs. If you want, I’m here to help you get it set up or explain anything else about using the script! Sources: tools/llama-bench/README.md

ancient-hospital-67205

08/05/2025, 10:07 PM

can we pass seeds to LLM ?

tall-airline-16756

08/05/2025, 10:08 PM

Absolutely! You can pass seeds to LLMs in LiveKit to get deterministic behavior. This is actually shown in the

test_completion.py

tests, where using the same seed helps ensure you get consistent results every time with the same inputs and settings. Plus, the Google Gemini LLM integration also supports a seed parameter to keep things predictable. If you'd like, I’m here to help you set this up—just let me know! Sources: tools/server/tests/unit/test_completion.py | livekit-plugins/livekit-plugins-anthropic/README.md

Open in Slack

Previous Next