ancient-hospital-67205
08/05/2025, 6:21 PM

tall-airline-16756
08/05/2025, 6:21 PM
• contains Methods: Combine methods like `contains_message()` and `contains_function_call()` to check for multiple possible events.
• Indexed Access: Use event indexing to assert different conditions at specific positions.
• Leverage `judge()`: Use the `judge()` method with an LLM to evaluate flexible conditions, including OR logic.
• Chain Assertions: Write multiple assertions to cover different scenarios.
For example, you can test for either a message greeting or a function call by chaining assertions or using the judge() method to describe the intent.
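To make the chaining idea concrete outside of any particular framework, here is a minimal, self-contained sketch of OR logic over assertions. Note that `expect_any` and the sample checks below are hypothetical helpers written for illustration, not part of the LiveKit testing API:

```python
# Hypothetical helper: pass if ANY of the given checks succeeds.
def expect_any(*checks):
    """Run each zero-argument check; return on the first success,
    otherwise raise an AssertionError combining all failures."""
    errors = []
    for check in checks:
        try:
            check()
            return  # one scenario matched, so the whole expectation passes
        except AssertionError as exc:
            errors.append(exc)
    # no scenario matched: surface every failure message at once
    raise AssertionError("; ".join(str(e) for e in errors))


# Example usage with plain assertions standing in for event checks
event = {"type": "message", "role": "assistant"}

def is_message():
    assert event["type"] == "message", "not a message"

def is_function_call():
    assert event["type"] == "function_call", "not a function call"

expect_any(is_message, is_function_call)  # passes: the first check matches
```

The same shape works whether the checks wrap `is_message()`, `is_function_call()`, or anything else that raises `AssertionError` on mismatch.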
If you need help implementing any of these approaches, let me know!

ancient-hospital-67205
08/05/2025, 6:22 PM

tall-airline-16756
08/05/2025, 6:22 PM
```python
import pytest

from livekit.agents import AgentSession
from livekit.plugins import openai


# Test for either a message or a function call
@pytest.mark.asyncio
async def test_agent_response():
    async with (
        openai.LLM(model="gpt-4o-mini") as llm,
        AgentSession(llm=llm) as session,
    ):
        # Assistant is your agent class, defined elsewhere
        await session.start(Assistant())
        result = await session.run(user_input="Hello")

        try:
            # First scenario: check for a message
            result.expect.next_event().is_message(role="assistant")
        except AssertionError:
            try:
                # Second scenario: check for a function call
                result.expect.next_event().is_function_call(name="greet")
            except AssertionError:
                # If neither scenario matches, fail the test
                raise AssertionError("Neither a message nor a function call was matched.")
```
This example first tries to see if the next event is a message from the assistant. If that doesn’t pan out, it checks for a function call named "greet." If neither fits, the test will let you know by failing.
Feel free to ask if you want me to walk you through it more or if you'd like some extra examples!

ancient-hospital-67205
08/05/2025, 6:28 PM

tall-airline-16756
08/05/2025, 6:28 PM
When you call `next_event()` in chained assertions, it actually moves the cursor forward to the next event. That means if your first scenario doesn't pass and you move on to the second one, the cursor may have already skipped past the event you wanted to check.
To get around this, you can use indexed access instead of `next_event()`. That way you can point directly at a specific event without moving the cursor. Here's a little example:
```python
# Test for either a message or a function call without advancing the cursor
@pytest.mark.asyncio
async def test_agent_response():
    async with (
        openai.LLM(model="gpt-4o-mini") as llm,
        AgentSession(llm=llm) as session,
    ):
        await session.start(Assistant())
        result = await session.run(user_input="Hello")

        # Check the first event for either a message or a function call
        first_event = result.expect[0]
        try:
            first_event.is_message(role="assistant")
        except AssertionError:
            first_event.is_function_call(name="greet")
```
This way, you’re making sure you’re checking the same event for both conditions, without accidentally moving past it.
If you want me to walk you through it more or need some extra examples, just let me know; I'm happy to help!

ancient-hospital-67205
08/05/2025, 9:25 PM

tall-airline-16756
08/05/2025, 9:25 PM
You can use the `tool_bench.py` script. It has a handy `-n` option that lets you decide how many times each test runs (it defaults to 10). The script keeps track of both successes and failures, so you can easily see the failure rates across all those runs.
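The tallying the script performs can be pictured with a tiny self-contained sketch. This is not `tool_bench.py` itself; `failure_rate` and `flaky_test` are hypothetical names used only to illustrate the repeated-run counting:

```python
import random

# Hypothetical sketch of repeated-run counting: run a test N times
# and report what fraction of runs failed.
def failure_rate(test_fn, n=10):
    failures = 0
    for _ in range(n):
        try:
            test_fn()
        except AssertionError:
            failures += 1
    return failures / n

# A deliberately flaky "test" for demonstration
rng = random.Random(0)

def flaky_test():
    assert rng.random() < 0.7  # fails roughly 30% of the time

rate = failure_rate(flaky_test, n=100)
print(f"failure rate over 100 runs: {rate:.0%}")
```

Running each test many times like this is what turns an occasional flaky failure into a measurable rate you can compare across changes.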
If you want, I’m here to help you get it set up or explain anything else about using the script!
Sources: tools/llama-bench/README.md

ancient-hospital-67205
08/05/2025, 10:07 PM

tall-airline-16756
08/05/2025, 10:08 PM
You can see this in the `test_completion.py` tests, where using the same seed helps ensure you get consistent results every time with the same inputs and settings. The Google Gemini LLM integration also supports a `seed` parameter to keep things predictable.
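The reproducibility principle itself is easy to demonstrate with Python's `random` module; this is a generic sketch of why fixed seeds make tests repeatable, not the LLM integration's code (`sample` is a hypothetical helper):

```python
import random

# Minimal illustration: the same seed yields the same "random" outputs,
# which is what makes seeded tests produce consistent results.
def sample(seed, n=5):
    rng = random.Random(seed)
    return [rng.randint(0, 99) for _ in range(n)]

run_a = sample(seed=42)
run_b = sample(seed=42)
run_c = sample(seed=7)

assert run_a == run_b  # identical seed -> identical results
assert run_a != run_c  # different seed -> (almost surely) different results
```

An LLM's seed parameter plays the same role: with the seed, inputs, and settings held fixed, repeated runs are expected to produce the same output.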
If you'd like, I’m here to help you set this up—just let me know!
Sources: tools/server/tests/unit/test_completion.py | livekit-plugins/livekit-plugins-anthropic/README.md