Configuration

Varun Pratap Bhardwaj edited this page Mar 6, 2026 · 1 revision

AgentAssay uses YAML configuration files for agent definitions, test scenarios, and assay parameters.

Config File Types

| File Type | Purpose | Example |
| --- | --- | --- |
| AssayConfig | Test run parameters (trials, alpha, power) | assay-config.yaml |
| AgentConfig | Agent metadata (id, framework, model, version) | agent-config.yaml |
| TestScenario | Scenario definition (input, expected properties) | scenario-booking.yaml |

AssayConfig

Defines test run parameters.

Example: assay-config.yaml

num_trials: 50
significance_level: 0.05
power: 0.80
budget_mode: adaptive
early_stopping: true

Fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| num_trials | int | 30 | Number of trials per scenario |
| significance_level | float | 0.05 | Type I error rate (alpha) |
| power | float | 0.80 | Statistical power (1 - beta) |
| budget_mode | str | fixed | fixed or adaptive |
| early_stopping | bool | false | Enable SPRT early stopping |
| min_trials | int | 10 | Minimum trials (for adaptive/SPRT) |
| max_trials | int | 200 | Maximum trials (for adaptive/SPRT) |
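
The early_stopping field refers to SPRT (the sequential probability ratio test). The page does not describe how AgentAssay implements it, so the following is only a minimal sketch of Wald's classic SPRT on pass/fail outcomes. The function name and the hypothesized pass rates p0 and p1 are assumptions made for illustration, not part of the AgentAssay API.

```python
import math


def sprt_decision(outcomes, p0=0.70, p1=0.85, alpha=0.05, beta=0.20):
    """Wald's SPRT on a sequence of pass/fail outcomes (hypothetical helper).

    H0: true pass rate is p0. H1: true pass rate is p1 (with p1 > p0).
    alpha is the Type I error rate; beta is 1 - power.
    Returns (decision, trials_used), where decision is one of
    "accept_h1", "accept_h0", or "continue".
    """
    upper = math.log((1 - beta) / alpha)  # crossing upward accepts H1
    lower = math.log(beta / (1 - alpha))  # crossing downward accepts H0
    llr = 0.0  # running log-likelihood ratio of H1 versus H0
    for n, passed in enumerate(outcomes, start=1):
        if passed:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept_h1", n
        if llr <= lower:
            return "accept_h0", n
    # Neither boundary was crossed: more trials are needed.
    return "continue", len(outcomes)
```

With these illustrative defaults, a consistently passing or consistently failing agent reaches a decision well before a fixed budget of 30 trials is spent. A real implementation would presumably also honor min_trials and max_trials.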

Budget Modes

  • fixed: Run exactly num_trials trials for every scenario
  • adaptive: Compute the minimum number of trials (N) from the variance measured during calibration, bounded by min_trials and max_trials
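
To make the adaptive mode concrete, here is a sketch of one common way to derive a minimum N from a variance estimate: the standard normal-approximation sample-size formula. The exact formula AgentAssay uses is not stated on this page; the function name and the effect_size parameter (the smallest pass-rate difference worth detecting) are assumptions.

```python
import math
from statistics import NormalDist


def adaptive_num_trials(variance, effect_size, alpha=0.05, power=0.80,
                        min_trials=10, max_trials=200):
    """Estimate the trials needed to detect `effect_size` (hypothetical helper).

    Uses n = (z_{1-alpha/2} + z_{power})^2 * variance / effect_size^2,
    then clamps the result to [min_trials, max_trials].
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    n = math.ceil((z_alpha + z_beta) ** 2 * variance / effect_size ** 2)
    return max(min_trials, min(max_trials, n))


# Calibration suggested a pass rate near 0.8, so variance is about 0.8 * 0.2.
trials = adaptive_num_trials(variance=0.16, effect_size=0.15)
```

The clamp mirrors the min_trials and max_trials fields: a very stable agent still gets at least min_trials, and a very noisy one never exceeds max_trials.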

AgentConfig

Defines agent metadata.

Example: agent-config.yaml

agent_id: booking-agent-v2
name: Booking Agent
description: Books flights, hotels, and car rentals
framework: crewai
model: gpt-4o
version: 2.1.0
metadata:
  team: booking-squad
  owner: alice@example.com

Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| agent_id | str | Yes | Unique identifier |
| name | str | Yes | Human-readable name |
| description | str | No | Agent description |
| framework | str | Yes | Framework name (langgraph, crewai, custom, etc.) |
| model | str | Yes | Primary model (gpt-4o, claude-opus-4-6, etc.) |
| version | str | No | Agent version |
| metadata | dict | No | Arbitrary key-value pairs |

TestScenario

Defines a test scenario.

Example: scenario-booking.yaml

scenario_id: book-flight-nyc
name: Book a flight to NYC
description: User requests to book a round-trip flight to New York City
input_data:
  prompt: "Book a round-trip flight to NYC departing March 15, returning March 22"
  user_id: "test-user-123"
expected_properties:
  max_steps: 15
  max_cost_usd: 0.50
  max_duration_ms: 5000
  required_tools:
    - search_flights
    - book_flight
    - confirm_booking
evaluation:
  threshold: 0.85
  confidence: 0.95

Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| scenario_id | str | Yes | Unique scenario identifier |
| name | str | Yes | Human-readable name |
| description | str | No | Scenario description |
| input_data | dict | Yes | Input to the agent |
| expected_properties | dict | No | Constraints (max_steps, max_cost_usd, etc.) |
| evaluation.threshold | float | No | Pass rate threshold (default: 0.80) |
| evaluation.confidence | float | No | Confidence level (default: 0.95) |
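
The page does not say how evaluation.threshold and evaluation.confidence are combined into a verdict. One plausible reading, sketched below as an assumption, is that a scenario passes when the lower bound of a confidence interval on the observed pass rate clears the threshold. The sketch uses the Wilson score interval; both function names are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_lower_bound(passes, trials, confidence=0.95):
    """Lower end of the two-sided Wilson score interval for a pass rate."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = passes / trials
    denom = 1 + z * z / trials
    centre = p_hat + z * z / (2 * trials)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / trials
                           + z * z / (4 * trials * trials))
    return (centre - margin) / denom


def scenario_passes(passes, trials, threshold=0.80, confidence=0.95):
    """True when the pass rate is credibly at or above `threshold`."""
    return wilson_lower_bound(passes, trials, confidence) >= threshold
```

Under this reading, a raw pass rate above the threshold is not sufficient on its own: with few trials the interval is wide, so the lower bound can still fall short of the threshold.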

Expected Properties

| Property | Type | Description |
| --- | --- | --- |
| max_steps | int | Maximum number of steps allowed |
| max_cost_usd | float | Maximum API cost allowed, in USD |
| max_duration_ms | float | Maximum execution time allowed, in milliseconds |
| required_tools | list[str] | Tools that must be called |
| forbidden_tools | list[str] | Tools that must NOT be called |
| min_llm_calls | int | Minimum number of LLM calls |
| max_llm_calls | int | Maximum number of LLM calls |
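
As an illustration of what these constraints mean for a single trial, the sketch below checks a trial summary against an expected_properties dict. The shape of the trace dict (steps, cost_usd, duration_ms, llm_calls, tools_called) and the function name are assumptions; AgentAssay's real trace model may differ.

```python
def check_expected_properties(trace, props):
    """Return a list of violation messages; an empty list means all hold.

    `trace` is a hypothetical per-trial summary with the keys
    steps, cost_usd, duration_ms, llm_calls, and tools_called.
    """
    violations = []
    # Upper-bound properties mapped to the trace key they constrain.
    upper_bounds = {
        "max_steps": "steps",
        "max_cost_usd": "cost_usd",
        "max_duration_ms": "duration_ms",
        "max_llm_calls": "llm_calls",
    }
    for prop, key in upper_bounds.items():
        if prop in props and trace[key] > props[prop]:
            violations.append(f"{prop}: {trace[key]} > {props[prop]}")
    if "min_llm_calls" in props and trace["llm_calls"] < props["min_llm_calls"]:
        violations.append(
            f"min_llm_calls: {trace['llm_calls']} < {props['min_llm_calls']}")
    called = set(trace["tools_called"])
    for tool in props.get("required_tools", []):
        if tool not in called:
            violations.append(f"required_tools: missing {tool}")
    for tool in props.get("forbidden_tools", []):
        if tool in called:
            violations.append(f"forbidden_tools: called {tool}")
    return violations
```

Properties that are absent from the dict are simply not checked, which matches expected_properties being optional.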

Multi-Scenario Suite

Define multiple scenarios in one file:

Example: suite-booking.yaml

scenarios:
  - scenario_id: book-flight
    name: Book a flight
    input_data:
      prompt: "Book a flight to NYC on March 15"
    expected_properties:
      max_cost_usd: 0.50
    evaluation:
      threshold: 0.85

  - scenario_id: cancel-flight
    name: Cancel a flight
    input_data:
      prompt: "Cancel my flight to NYC"
    expected_properties:
      max_steps: 8
    evaluation:
      threshold: 0.90

  - scenario_id: change-flight
    name: Change flight date
    input_data:
      prompt: "Change my flight to March 20"
    expected_properties:
      max_cost_usd: 0.30
    evaluation:
      threshold: 0.80
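
Before running a suite, it can help to check it for simple mistakes. The sketch below works on the already-parsed suite (for example, the dict returned by yaml.safe_load). It checks the fields the TestScenario table marks as required and flags duplicate scenario_id values. The helper name is hypothetical and not part of AgentAssay.

```python
REQUIRED_FIELDS = ("scenario_id", "name", "input_data")


def validate_suite(suite):
    """Return a list of problems found in a parsed suite dict."""
    errors = []
    seen_ids = set()
    for index, scenario in enumerate(suite.get("scenarios", [])):
        # Fall back to the list position when the id itself is missing.
        label = scenario.get("scenario_id", f"#{index}")
        for field in REQUIRED_FIELDS:
            if field not in scenario:
                errors.append(
                    f"scenario {label}: missing required field '{field}'")
        scenario_id = scenario.get("scenario_id")
        if scenario_id is not None:
            if scenario_id in seen_ids:
                errors.append(f"duplicate scenario_id '{scenario_id}'")
            seen_ids.add(scenario_id)
    return errors
```

An empty list means the suite has the required fields and unique identifiers; it says nothing about whether the constraints themselves are sensible.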

Loading Config Files

From CLI

agentassay run --config assay-config.yaml --scenario suite-booking.yaml

From Python

from agentassay.core.models import AssayConfig, AgentConfig, TestScenario
import yaml

# Load AssayConfig
with open("assay-config.yaml") as f:
    assay_config = AssayConfig(**yaml.safe_load(f))

# Load AgentConfig
with open("agent-config.yaml") as f:
    agent_config = AgentConfig(**yaml.safe_load(f))

# Load TestScenario
with open("scenario-booking.yaml") as f:
    scenario = TestScenario(**yaml.safe_load(f))

Environment Variables

Override config values with environment variables:

export AGENTASSAY_NUM_TRIALS=100
export AGENTASSAY_ALPHA=0.01
export AGENTASSAY_POWER=0.90

agentassay run --config assay-config.yaml --scenario scenarios.yaml

Precedence (highest to lowest): CLI args > Environment > Config file > Defaults
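
The precedence order can be sketched as a layered merge, applied from lowest to highest priority. This illustrates the rule only and is not AgentAssay's actual loader. In particular, mapping AGENTASSAY_ALPHA to significance_level is an assumption based on the field description "Type I error rate (alpha)", and the function name is hypothetical.

```python
import os

DEFAULTS = {"num_trials": 30, "significance_level": 0.05, "power": 0.80}

# Field -> (environment variable, parser). The ALPHA mapping is assumed.
ENV_VARS = {
    "num_trials": ("AGENTASSAY_NUM_TRIALS", int),
    "significance_level": ("AGENTASSAY_ALPHA", float),
    "power": ("AGENTASSAY_POWER", float),
}


def resolve_settings(file_config, cli_args, environ=None):
    """Merge settings so that CLI > environment > config file > defaults."""
    environ = os.environ if environ is None else environ
    resolved = dict(DEFAULTS)      # lowest priority
    resolved.update(file_config)   # config file overrides defaults
    for field, (var, parse) in ENV_VARS.items():
        if var in environ:         # environment overrides the file
            resolved[field] = parse(environ[var])
    # CLI flags that were not supplied arrive as None and are skipped.
    resolved.update({k: v for k, v in cli_args.items() if v is not None})
    return resolved
```

Passing environ explicitly keeps the function easy to test without touching the real process environment.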

Best Practices

1. Store Configs in Repo

tests/
  configs/
    assay-ci.yaml          # CI/CD config (strict)
    assay-dev.yaml         # Dev config (fast feedback)
  scenarios/
    booking/
      suite-booking.yaml
    cancellation/
      suite-cancel.yaml
  agents/
    booking-agent-v2.yaml

2. Use Templates

Create base configs and extend them:

Base: assay-base.yaml

significance_level: 0.05
power: 0.80
budget_mode: adaptive

CI: assay-ci.yaml

extends: assay-base.yaml
num_trials: 50
early_stopping: true
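
The page does not document how the extends key is resolved. The sketch below assumes the simplest behavior: load the parent first, then let the child's keys override it with a shallow merge. The function name is hypothetical, and load stands for any callable that turns a file name into a dict (for example, a wrapper around yaml.safe_load).

```python
def resolve_extends(name, load, _seen=()):
    """Resolve a config and its `extends` chain into one flat dict.

    `load` maps a config name to its parsed dict. Child keys override
    parent keys (shallow merge); cyclic chains raise ValueError.
    """
    if name in _seen:
        raise ValueError(f"cyclic extends chain at {name!r}")
    config = dict(load(name))  # copy so the loaded dict is not mutated
    parent = config.pop("extends", None)
    if parent is None:
        return config
    merged = resolve_extends(parent, load, _seen + (name,))
    merged.update(config)      # child values win over parent values
    return merged


# In-memory stand-ins for the two files shown above.
files = {
    "assay-base.yaml": {"significance_level": 0.05, "power": 0.80,
                        "budget_mode": "adaptive"},
    "assay-ci.yaml": {"extends": "assay-base.yaml", "num_trials": 50,
                      "early_stopping": True},
}
ci_config = resolve_extends("assay-ci.yaml", files.__getitem__)
```

The resolved result no longer contains the extends key, so it can be passed on like any flat config dict.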

3. Document Expected Properties

Add comments to scenario files:

expected_properties:
  max_steps: 15          # Typical: 8-12, max observed: 14
  max_cost_usd: 0.50     # Typical: $0.20-$0.30
  required_tools:
    - search_flights     # Must search before booking
    - book_flight        # Core action

4. Version Your Configs

config_version: 1.0.0
agent_id: booking-agent-v2
# ...


Part of Qualixar | Author: Varun Pratap Bhardwaj
