Configuration
Varun Pratap Bhardwaj edited this page Mar 6, 2026
AgentAssay uses YAML configuration files for agent definitions, test scenarios, and assay parameters.
| File Type | Purpose | Example |
|---|---|---|
| AssayConfig | Test run parameters (trials, alpha, power) | assay-config.yaml |
| AgentConfig | Agent metadata (id, framework, model, version) | agent-config.yaml |
| TestScenario | Scenario definition (input, expected properties) | scenario-booking.yaml |
AssayConfig defines test run parameters:
num_trials: 50
significance_level: 0.05
power: 0.80
budget_mode: adaptive
early_stopping: true

| Field | Type | Default | Description |
|---|---|---|---|
| num_trials | int | 30 | Number of trials per scenario |
| significance_level | float | 0.05 | Type I error rate (alpha) |
| power | float | 0.80 | Statistical power (1 - beta) |
| budget_mode | str | fixed | fixed or adaptive |
| early_stopping | bool | false | Enable SPRT early stopping |
| min_trials | int | 10 | Minimum trials (for adaptive/SPRT) |
| max_trials | int | 200 | Maximum trials (for adaptive/SPRT) |
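The early_stopping, min_trials, and max_trials fields suggest a sequential test that stops once the evidence is strong enough. As a rough illustration only, the sketch below applies Wald's SPRT to pass/fail outcomes. The function name sprt_decision and the hypothesized pass rates p0 and p1 are assumptions made for this example; this page does not document how AgentAssay configures its SPRT internally.

```python
import math
from typing import Sequence, Tuple


def sprt_decision(
    outcomes: Sequence[bool],
    p0: float,
    p1: float,
    alpha: float = 0.05,
    power: float = 0.80,
    min_trials: int = 10,
    max_trials: int = 200,
) -> Tuple[str, int]:
    """Hypothetical helper: Wald's SPRT for H0 (pass rate = p0) vs H1 (pass rate = p1 > p0)."""
    beta = 1.0 - power
    upper = math.log((1.0 - beta) / alpha)  # crossing this boundary favours H1
    lower = math.log(beta / (1.0 - alpha))  # crossing this boundary favours H0
    llr = 0.0  # running log-likelihood ratio
    n = 0
    for n, passed in enumerate(outcomes, start=1):
        if passed:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1.0 - p1) / (1.0 - p0))
        if n < min_trials:
            continue  # never decide before the minimum number of trials
        if llr >= upper:
            return "accept_h1", n
        if llr <= lower:
            return "accept_h0", n
        if n >= max_trials:
            return "inconclusive", n
    return "continue", n
```

The function returns a decision label and the number of trials consumed, so a caller can stop scheduling trials as soon as the label is anything other than "continue".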
Budget modes:
- fixed: Run exactly num_trials for every scenario
- adaptive: Compute minimum N based on calibration variance
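One textbook way to turn a variance estimate into a trial count is the normal-approximation sample-size formula. The sketch below is an assumption about what "minimum N based on calibration variance" could look like, not AgentAssay's documented formula. The function name and the effect_size parameter (the smallest pass-rate difference worth detecting) are hypothetical.

```python
import math
from statistics import NormalDist


def min_trials_for(
    variance: float,
    effect_size: float,
    alpha: float = 0.05,
    power: float = 0.80,
    min_trials: int = 10,
    max_trials: int = 200,
) -> int:
    """Hypothetical helper: one-sided normal-approximation sample size, clamped to the trial bounds."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # critical value for the Type I error rate
    z_beta = NormalDist().inv_cdf(power)         # critical value for the desired power
    n = math.ceil(variance * (z_alpha + z_beta) ** 2 / effect_size ** 2)
    return max(min_trials, min(max_trials, n))


# A calibration pass rate of 0.8 gives a Bernoulli variance of 0.8 * 0.2 = 0.16.
print(min_trials_for(variance=0.16, effect_size=0.10))
```

Lower variance or a larger effect size shrinks the estimate toward min_trials; noisy agents push it toward max_trials.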
AgentConfig defines agent metadata:
agent_id: booking-agent-v2
name: Booking Agent
description: Books flights, hotels, and car rentals
framework: crewai
model: gpt-4o
version: 2.1.0
metadata:
  team: booking-squad
  owner: alice@example.com

| Field | Type | Required | Description |
|---|---|---|---|
| agent_id | str | Yes | Unique identifier |
| name | str | Yes | Human-readable name |
| description | str | No | Agent description |
| framework | str | Yes | Framework name (langgraph, crewai, custom, etc.) |
| model | str | Yes | Primary model (gpt-4o, claude-opus-4-6, etc.) |
| version | str | No | Agent version |
| metadata | dict | No | Arbitrary key-value pairs |
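Before passing a parsed YAML mapping to AgentConfig, it can help to check the required fields listed in the table above. A minimal sketch on a plain dict; missing_agent_fields is a hypothetical helper and not part of AgentAssay.

```python
from typing import Any, Dict, List

# Fields marked "Yes" in the Required column of the AgentConfig table.
REQUIRED_AGENT_FIELDS = ("agent_id", "name", "framework", "model")


def missing_agent_fields(config: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return required fields that are absent or empty."""
    return [field for field in REQUIRED_AGENT_FIELDS if not config.get(field)]


example = {
    "agent_id": "booking-agent-v2",
    "name": "Booking Agent",
    "framework": "crewai",
    "model": "gpt-4o",
}
print(missing_agent_fields(example))
```

An empty list means every required field is present.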
TestScenario defines a test scenario:
scenario_id: book-flight-nyc
name: Book a flight to NYC
description: User requests to book a round-trip flight to New York City
input_data:
  prompt: "Book a round-trip flight to NYC departing March 15, returning March 22"
  user_id: "test-user-123"
expected_properties:
  max_steps: 15
  max_cost_usd: 0.50
  max_duration_ms: 5000
  required_tools:
    - search_flights
    - book_flight
    - confirm_booking
evaluation:
  threshold: 0.85
  confidence: 0.95

| Field | Type | Required | Description |
|---|---|---|---|
| scenario_id | str | Yes | Unique scenario identifier |
| name | str | Yes | Human-readable name |
| description | str | No | Scenario description |
| input_data | dict | Yes | Input to the agent |
| expected_properties | dict | No | Constraints (max_steps, max_cost_usd, etc.) |
| evaluation.threshold | float | No | Pass rate threshold (default: 0.80) |
| evaluation.confidence | float | No | Confidence level (default: 0.95) |
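This page does not say how evaluation.threshold and evaluation.confidence are combined. One common approach, shown here purely as an assumption, is to compare the lower bound of a Wilson score interval for the observed pass rate against the threshold. The names wilson_lower_bound and meets_threshold are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_lower_bound(passes: int, trials: int, confidence: float = 0.95) -> float:
    """Lower end of the two-sided Wilson score interval for a pass rate."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    p_hat = passes / trials
    denom = 1.0 + z * z / trials
    centre = (p_hat + z * z / (2.0 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / trials + z * z / (4.0 * trials * trials)) / denom
    )
    return centre - half_width


def meets_threshold(
    passes: int, trials: int, threshold: float = 0.80, confidence: float = 0.95
) -> bool:
    """Assumed rule: pass only if the interval's lower bound reaches the threshold."""
    return wilson_lower_bound(passes, trials, confidence) >= threshold
```

Under this rule, 45 passes out of 50 trials would not clear a 0.85 threshold at 95% confidence, because the interval's lower bound falls below 0.85 even though the raw pass rate is 0.90.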
Supported expected_properties:

| Property | Type | Description |
|---|---|---|
| max_steps | int | Maximum number of steps allowed |
| max_cost_usd | float | Maximum API cost allowed |
| max_duration_ms | float | Maximum execution time allowed |
| required_tools | list[str] | Tools that must be called |
| forbidden_tools | list[str] | Tools that must NOT be called |
| min_llm_calls | int | Minimum number of LLM calls |
| max_llm_calls | int | Maximum number of LLM calls |
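To make the semantics of these properties concrete, the sketch below checks one recorded trial against an expected_properties mapping. The trace layout (steps, cost_usd, duration_ms, llm_calls, tools_called) and the function check_properties are assumptions for illustration; AgentAssay's own trace format is not described on this page.

```python
from typing import Any, Dict, List

# Maps each upper-bound property to the (hypothetical) trace field it constrains.
UPPER_BOUNDS = (
    ("max_steps", "steps"),
    ("max_cost_usd", "cost_usd"),
    ("max_duration_ms", "duration_ms"),
    ("max_llm_calls", "llm_calls"),
)


def check_properties(trace: Dict[str, Any], expected: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return one message per violated property."""
    violations = []
    for key, field in UPPER_BOUNDS:
        if key in expected and trace[field] > expected[key]:
            violations.append(f"{key}: {trace[field]} > {expected[key]}")
    if "min_llm_calls" in expected and trace["llm_calls"] < expected["min_llm_calls"]:
        violations.append(f"min_llm_calls: {trace['llm_calls']} < {expected['min_llm_calls']}")
    called = set(trace["tools_called"])
    for tool in expected.get("required_tools", []):
        if tool not in called:
            violations.append(f"required_tools: {tool} was not called")
    for tool in expected.get("forbidden_tools", []):
        if tool in called:
            violations.append(f"forbidden_tools: {tool} was called")
    return violations


trace = {
    "steps": 12,
    "cost_usd": 0.62,
    "duration_ms": 3100,
    "llm_calls": 4,
    "tools_called": ["search_flights", "book_flight"],
}
expected = {
    "max_steps": 15,
    "max_cost_usd": 0.50,
    "max_duration_ms": 5000,
    "required_tools": ["search_flights", "book_flight", "confirm_booking"],
}
for message in check_properties(trace, expected):
    print(message)
```

A trial with no violations returns an empty list, which a caller could count as a pass when computing the pass rate.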
Define multiple scenarios in one file:
scenarios:
  - scenario_id: book-flight
    name: Book a flight
    input_data:
      prompt: "Book a flight to NYC on March 15"
    expected_properties:
      max_cost_usd: 0.50
    evaluation:
      threshold: 0.85
  - scenario_id: cancel-flight
    name: Cancel a flight
    input_data:
      prompt: "Cancel my flight to NYC"
    expected_properties:
      max_steps: 8
    evaluation:
      threshold: 0.90
  - scenario_id: change-flight
    name: Change flight date
    input_data:
      prompt: "Change my flight to March 20"
    expected_properties:
      max_cost_usd: 0.30
    evaluation:
      threshold: 0.80

Run the suite from the CLI:
agentassay run --config assay-config.yaml --scenario suite-booking.yaml

Load configs in Python:
from agentassay.core.models import AssayConfig, AgentConfig, TestScenario
import yaml
# Load AssayConfig
with open("assay-config.yaml") as f:
    assay_config = AssayConfig(**yaml.safe_load(f))
# Load AgentConfig
with open("agent-config.yaml") as f:
    agent_config = AgentConfig(**yaml.safe_load(f))
# Load TestScenario
with open("scenario-booking.yaml") as f:
    scenario = TestScenario(**yaml.safe_load(f))

Override config values via environment variables:
export AGENTASSAY_NUM_TRIALS=100
export AGENTASSAY_ALPHA=0.01
export AGENTASSAY_POWER=0.90
agentassay run --config assay-config.yaml --scenario scenarios.yaml

Precedence: CLI args > Environment > Config file > Defaults
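The precedence order can be pictured as layered dictionary updates, lowest priority first. This is a sketch of the idea rather than AgentAssay's actual resolver: resolve_config is a hypothetical function, and the mapping of AGENTASSAY_ALPHA to significance_level and the int/float casts are assumptions based on the field table.

```python
from typing import Any, Dict, Mapping, Optional

# Defaults taken from the AssayConfig field table.
DEFAULTS: Dict[str, Any] = {"num_trials": 30, "significance_level": 0.05, "power": 0.80}

# Assumed mapping from environment variable to (config field, type cast).
ENV_VARS = {
    "AGENTASSAY_NUM_TRIALS": ("num_trials", int),
    "AGENTASSAY_ALPHA": ("significance_level", float),
    "AGENTASSAY_POWER": ("power", float),
}


def resolve_config(
    file_config: Mapping[str, Any],
    environ: Mapping[str, str],
    cli_args: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical resolver: Defaults < Config file < Environment < CLI args."""
    resolved = dict(DEFAULTS)
    resolved.update(file_config)
    for var, (field, cast) in ENV_VARS.items():
        if var in environ:
            resolved[field] = cast(environ[var])  # env values arrive as strings
    for field, value in (cli_args or {}).items():
        if value is not None:  # None means the flag was not given
            resolved[field] = value
    return resolved


print(resolve_config(
    file_config={"num_trials": 50, "power": 0.90},
    environ={"AGENTASSAY_NUM_TRIALS": "100", "AGENTASSAY_ALPHA": "0.01"},
    cli_args={"num_trials": 20, "power": None},
))
```

In the usage example, num_trials is set at every layer and the CLI value wins, significance_level comes from the environment, and power falls back to the config file.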
tests/
  configs/
    assay-ci.yaml    # CI/CD config (strict)
    assay-dev.yaml   # Dev config (fast feedback)
  scenarios/
    booking/
      suite-booking.yaml
    cancellation/
      suite-cancel.yaml
  agents/
    booking-agent-v2.yaml
Create base configs and extend them:
Base: assay-base.yaml
significance_level: 0.05
power: 0.80
budget_mode: adaptive

CI: assay-ci.yaml
extends: assay-base.yaml
num_trials: 50
early_stopping: true

Add comments to scenario files:
expected_properties:
  max_steps: 15 # Typical: 8-12, max observed: 14
  max_cost_usd: 0.50 # Typical: $0.20-$0.30
  required_tools:
    - search_flights # Must search before booking
    - book_flight # Core action

Version config files:
config_version: 1.0.0
agent_id: booking-agent-v2
# ...

See also:
- CLI Reference: Use config files with CLI
- Pytest Plugin: Use config in tests
- CI/CD Integration: Config for deployment gates
Part of Qualixar | Author: Varun Pratap Bhardwaj