feat(testing): add structured comparison output for test-dsl #1566

Open
AdityaShome wants to merge 17 commits into mofa-org:main from AdityaShome:testing-comparison-output

Summary

This PR is rebased on #1558, #1556, #1555 and #1447. It adds machine-readable comparison output and a CI gate flag for DSL baseline diffs.

mofa test-dsl can now emit a JSON comparison artifact via --comparison-out, enabling CI and tooling to consume the baseline diff in a stable, structured form.

Context

Baseline comparison already exists and can report match/mismatch in CLI output, but there was no structured output for CI gates or downstream tooling.

What Changed

This PR closes that gap with the following changes:

  • Added AgentRunArtifactComparison (serializable comparison output type)
  • Added --comparison-out <file> to mofa test-dsl
  • Added --fail-on-diff to return a non-zero exit on baseline mismatches
  • Wrote comparison JSON to disk when --comparison-out is provided
  • Added CLI parse coverage for the new flag

Files Changed

Core Files

tests/src/artifact.rs

  • Added AgentRunArtifactComparison.
  • Added a constructor to build comparison output from baseline + candidate artifacts.
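The shape of the comparison type can be sketched as follows. This is a minimal, dependency-free illustration and not the code from tests/src/artifact.rs: the names ArtifactSketch, FieldDifference, ComparisonSketch and from_artifacts are hypothetical, the fields are inferred from the example JSON in this PR, and only output_text is compared. The real AgentRunArtifactComparison is serializable and is built from the existing diff output, both of which this sketch leaves out.

```rust
// Hypothetical sketch: field names mirror the example comparison JSON in this PR.
// The actual types in tests/src/artifact.rs may differ.

/// Simplified stand-in for a run artifact (assumed fields).
#[derive(Debug, Clone, PartialEq)]
pub struct ArtifactSketch {
    pub execution_id: String,
    pub case_name: String,
    pub output_text: String,
}

/// One differing field between baseline and candidate.
#[derive(Debug, Clone, PartialEq)]
pub struct FieldDifference {
    pub field: String,
    pub expected: String,
    pub actual: String,
}

/// Stand-in for the comparison output type.
#[derive(Debug, Clone, PartialEq)]
pub struct ComparisonSketch {
    pub case_name: String,
    pub matches: bool,
    pub differences: Vec<FieldDifference>,
    pub baseline_execution_id: String,
    pub candidate_execution_id: String,
}

impl ComparisonSketch {
    /// Build a comparison from a baseline and a candidate artifact.
    /// Execution IDs are recorded but not compared, since they differ per run.
    pub fn from_artifacts(baseline: &ArtifactSketch, candidate: &ArtifactSketch) -> Self {
        let mut differences = Vec::new();
        if baseline.output_text != candidate.output_text {
            differences.push(FieldDifference {
                field: "output_text".to_string(),
                expected: baseline.output_text.clone(),
                actual: candidate.output_text.clone(),
            });
        }
        ComparisonSketch {
            case_name: candidate.case_name.clone(),
            matches: differences.is_empty(),
            differences,
            baseline_execution_id: baseline.execution_id.clone(),
            candidate_execution_id: candidate.execution_id.clone(),
        }
    }
}
```

In this sketch, matches is derived from the difference list rather than stored independently, so the two cannot disagree.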

tests/src/lib.rs

  • Re-exported the comparison output type.

crates/mofa-cli/src/cli.rs

  • Added --comparison-out flag to test-dsl.
  • Added clap parse test for the new flag.

crates/mofa-cli/src/commands/test_dsl.rs

  • Loads baseline artifact when provided.
  • Builds AgentRunArtifactComparison using existing diff output.
  • Writes comparison JSON to --comparison-out.
  • Errors if --comparison-out is used without --baseline-in.
  • Returns a non-zero exit when --fail-on-diff is set and the baseline mismatches.
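The flag validation and exit behavior listed above can be sketched roughly as below. GateFlags, validate_flags and exit_code are hypothetical names introduced for illustration and are not taken from test_dsl.rs. The exit code 1 follows the flowchart in this PR; the description itself only promises a non-zero exit.

```rust
/// Hypothetical view of the relevant test-dsl flags.
pub struct GateFlags<'a> {
    pub baseline_in: Option<&'a str>,
    pub comparison_out: Option<&'a str>,
    pub fail_on_diff: bool,
}

/// Reject --comparison-out when no baseline was supplied,
/// because there is nothing to compare against.
pub fn validate_flags(flags: &GateFlags) -> Result<(), String> {
    if flags.comparison_out.is_some() && flags.baseline_in.is_none() {
        return Err("--comparison-out requires --baseline-in".to_string());
    }
    Ok(())
}

/// Decide the process exit code once the comparison result is known.
/// A mismatch fails the run only when --fail-on-diff is set.
pub fn exit_code(flags: &GateFlags, matches: bool) -> i32 {
    if flags.fail_on_diff && !matches {
        1
    } else {
        0
    }
}
```

Keeping the exit decision separate from the comparison means a mismatch without --fail-on-diff still produces the comparison file but does not fail the run, which matches the statement that diff semantics are unchanged.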

crates/mofa-cli/src/main.rs

  • Wired comparison_out through CLI dispatch.

CLI Usage

mofa test-dsl tests/examples/simple_agent.toml \
  --baseline-in dsl-baseline.json \
  --comparison-out dsl-comparison.json

mofa test-dsl tests/examples/simple_agent.toml \
  --baseline-in dsl-baseline.json \
  --fail-on-diff

Example Comparison Output

{
  "case_name": "simple_agent_run",
  "matches": false,
  "differences": [
    {
      "field": "output_text",
      "expected": "forced mismatch",
      "actual": "hello from DSL"
    }
  ],
  "baseline_execution_id": "019d4e06-f888-7a43-9ef3-edc0603b5dcd",
  "candidate_execution_id": "019d4e0b-806c-7393-b3f6-53f0b5a9efc9"
}

Execution Flow

 flowchart TD
      A[TOML DSL file] --> B[mofa test-dsl]
      B --> C[Parse TestCaseDsl]
      C --> D[Execute case]
      D --> E[Build AgentRunArtifact]
      E --> F[Write artifact<br/>--artifact-out]
      E --> G[Write baseline<br/>--baseline-out]
      E --> H[Load baseline<br/>--baseline-in]
      H --> I[Compare artifacts]
      I --> J[Write comparison JSON<br/>--comparison-out]
      E --> K[Build report]
      K --> L[Write report<br/>--report-out]
      I --> M{--fail-on-diff?}
      M -->|matches| N[Exit 0]
      M -->|mismatch| O[Exit 1]

Tests

  • cargo test -p mofa-cli test_dsl

Notes

This is the CI-oriented, machine-readable layer on top of the existing baseline diff model. It does not change diff semantics, only the output shape and CLI wiring.
