
feat(testing): add replay backed provider support for DSL runs #1581

Open
AdityaShome wants to merge 20 commits into mofa-org:main from AdityaShome:testing-replay-support


Summary

This PR is rebased on #1566, #1558, #1556, #1555, and #1447, and adds replay-backed provider support for DSL test runs.

The core change is that mofa test-dsl can now:

  • call a live OpenAI-compatible provider (including hosted Llama models served through the same API shape)
  • record the interaction to a tape file
  • replay that tape deterministically in later runs

This closes the replay part of the DSL / runner / artifact / replay loop.

Context

Previous work established:

  • a DSL entrypoint through mofa test-dsl
  • canonical run artifacts
  • baseline comparison / comparison output
  • CI oriented mismatch handling

What was still missing was a provider path that could:

  • record real LLM responses from a live provider
  • persist them in a reusable format
  • replay them without network access

This PR adds that missing layer.

What Changed

  • Added a live OpenAI-compatible provider implementation for DSL-backed recording runs
  • Added replay tape types and replay provider plumbing
  • Added recording provider plumbing that writes tape interactions to disk
  • Integrated recording and replay provider selection into the DSL runner
  • Added replay-oriented tests
  • Added example DSL cases for hosted recording and tape replay

Files Changed

Main Files

tests/src/live_llm.rs

  • Added OpenAiCompatProvider
  • Converts kernel requests into OpenAI-compatible chat payloads
  • Executes live chat-completions requests
  • Converts responses back into kernel response types
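To illustrate the request conversion step, here is a minimal sketch of mapping a request onto the chat-completions body shape (`model`, `messages` with `role`/`content`, optional `temperature`). The types `Role`, `KernelMessage`, and `KernelRequest` and the function `to_chat_payload` are hypothetical stand-ins, not the actual kernel types or the code in tests/src/live_llm.rs. The JSON is built by hand only to keep the sketch dependency-free; a real implementation would more likely use a serialization crate.

```rust
// Hypothetical stand-ins for the kernel request types (assumption, not the real API).
#[derive(Clone, Copy)]
pub enum Role {
    System,
    User,
    Assistant,
}

pub struct KernelMessage {
    pub role: Role,
    pub content: String,
}

pub struct KernelRequest {
    pub model: String,
    pub messages: Vec<KernelMessage>,
    pub temperature: Option<f32>,
}

/// Escape a string so it can be embedded inside a JSON string literal.
fn json_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a chat-completions style JSON body from the hypothetical request.
pub fn to_chat_payload(req: &KernelRequest) -> String {
    let messages: Vec<String> = req
        .messages
        .iter()
        .map(|m| {
            let role = match m.role {
                Role::System => "system",
                Role::User => "user",
                Role::Assistant => "assistant",
            };
            format!(
                "{{\"role\":\"{}\",\"content\":\"{}\"}}",
                role,
                json_escape(&m.content)
            )
        })
        .collect();

    let mut body = format!(
        "{{\"model\":\"{}\",\"messages\":[{}]",
        json_escape(&req.model),
        messages.join(",")
    );
    // Only emit temperature when the caller set it.
    if let Some(t) = req.temperature {
        body.push_str(&format!(",\"temperature\":{}", t));
    }
    body.push('}');
    body
}
```

The reverse direction (response back into kernel types) would follow the same pattern: read the first choice's message content and map it onto the kernel response struct.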

tests/src/replay.rs

  • Added tape models for recorded interactions
  • Added RecordingLLMProvider
  • Added ReplayLLMProvider
  • Added tape read / write helpers for deterministic replay
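The record/replay split can be sketched roughly as below. This is an in-memory illustration only: `LlmProvider`, `Interaction`, `Tape`, `RecordingProvider`, `ReplayProvider`, and `EchoProvider` are hypothetical names, not the real `RecordingLLMProvider` / `ReplayLLMProvider` types, and strict in-order prompt matching is an assumption about how determinism could be enforced. Writing the tape to disk is omitted.

```rust
use std::collections::VecDeque;

/// One recorded request/response pair (hypothetical shape).
#[derive(Clone, Debug, PartialEq)]
pub struct Interaction {
    pub prompt: String,
    pub response: String,
}

/// An ordered list of recorded interactions.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct Tape {
    pub interactions: Vec<Interaction>,
}

/// Minimal provider abstraction used only by this sketch.
pub trait LlmProvider {
    fn complete(&mut self, prompt: &str) -> Result<String, String>;
}

/// Wraps any provider and appends each successful call to a tape.
pub struct RecordingProvider<P: LlmProvider> {
    inner: P,
    tape: Tape,
}

impl<P: LlmProvider> RecordingProvider<P> {
    pub fn new(inner: P) -> Self {
        Self { inner, tape: Tape::default() }
    }

    /// Consume the recorder and hand back what was captured.
    pub fn into_tape(self) -> Tape {
        self.tape
    }
}

impl<P: LlmProvider> LlmProvider for RecordingProvider<P> {
    fn complete(&mut self, prompt: &str) -> Result<String, String> {
        let response = self.inner.complete(prompt)?;
        self.tape.interactions.push(Interaction {
            prompt: prompt.to_string(),
            response: response.clone(),
        });
        Ok(response)
    }
}

/// Serves responses from a tape in recorded order, without any network access.
pub struct ReplayProvider {
    remaining: VecDeque<Interaction>,
}

impl ReplayProvider {
    pub fn new(tape: Tape) -> Self {
        Self { remaining: tape.interactions.into() }
    }
}

impl LlmProvider for ReplayProvider {
    fn complete(&mut self, prompt: &str) -> Result<String, String> {
        let next = self
            .remaining
            .pop_front()
            .ok_or_else(|| "tape exhausted".to_string())?;
        // Fail loudly if the run diverges from what was recorded.
        if next.prompt != prompt {
            return Err(format!(
                "tape mismatch: expected {:?}, got {:?}",
                next.prompt, prompt
            ));
        }
        Ok(next.response)
    }
}

/// Stand-in for a live provider so the sketch needs no network.
pub struct EchoProvider;

impl LlmProvider for EchoProvider {
    fn complete(&mut self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {}", prompt))
    }
}
```

Wrapping `EchoProvider` in `RecordingProvider`, calling `complete`, and passing `into_tape()` to `ReplayProvider::new` reproduces the same response on replay; a second call on the one-entry tape returns an error rather than silently inventing output.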

tests/src/dsl.rs

  • Extended DSL parsing and runner construction to support:
    • live OpenAI-compatible providers
    • tape recording
    • tape replay
  • Added provider config loading and API key resolution
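As a rough illustration of provider selection and API key resolution, the sketch below picks a mode from a parsed config and resolves a key from an explicit value or an environment lookup. `ProviderConfig`, `ProviderMode`, `select_mode`, `resolve_api_key`, and the field names are all hypothetical; the actual DSL schema and resolution order in tests/src/dsl.rs may differ.

```rust
/// Hypothetical subset of a parsed DSL provider section.
pub struct ProviderConfig {
    pub base_url: Option<String>,
    pub tape_path: Option<String>,
    pub record: bool,
}

/// Which provider path the runner should construct.
#[derive(Debug, PartialEq)]
pub enum ProviderMode {
    Live { base_url: String },
    Record { base_url: String, tape_path: String },
    Replay { tape_path: String },
}

/// Decide the mode from the config, rejecting inconsistent combinations.
pub fn select_mode(cfg: &ProviderConfig) -> Result<ProviderMode, String> {
    match (&cfg.base_url, &cfg.tape_path, cfg.record) {
        (Some(url), Some(tape), true) => Ok(ProviderMode::Record {
            base_url: url.clone(),
            tape_path: tape.clone(),
        }),
        (Some(_), None, true) => Err("recording requires a tape path".to_string()),
        // Assumption: a live endpoint without `record` means a plain live run.
        (Some(url), _, false) => Ok(ProviderMode::Live { base_url: url.clone() }),
        (None, Some(tape), false) => Ok(ProviderMode::Replay { tape_path: tape.clone() }),
        (None, _, true) => Err("recording requires a live provider".to_string()),
        (None, None, false) => Err("no provider or tape configured".to_string()),
    }
}

/// Resolve an API key: an explicit value wins, otherwise consult `lookup`
/// (in real use, something like `|name| std::env::var(name).ok()`).
pub fn resolve_api_key<F>(
    explicit: Option<&str>,
    env_var: &str,
    lookup: F,
) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    if let Some(key) = explicit.filter(|k| !k.trim().is_empty()) {
        return Ok(key.to_string());
    }
    match lookup(env_var) {
        Some(key) if !key.trim().is_empty() => Ok(key),
        _ => Err(format!("no API key: set {} or pass one explicitly", env_var)),
    }
}
```

Passing the lookup as a closure keeps the function testable without touching the process environment. In this sketch, replay mode never calls `resolve_api_key`, which matches the goal of replaying without network access or secrets.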

tests/src/agent_runner.rs

  • Added runner constructors for:
    • live recording providers
    • replay-backed providers
  • Wired recording / replay handles into the existing harness

crates/mofa-cli/tests/test_dsl_integration_tests.rs

  • Added CLI-level coverage around the replay-enabled DSL execution path

Supporting Files

  • crates/mofa-cli/Cargo.toml: Added the testing crate dependency needed for replay-backed DSL execution
  • tests/Cargo.toml: Added dependencies required for live-provider requests and replay support
  • tests/src/lib.rs: Exported the new replay/live-provider pieces from the testing crate
  • tests/tests/dsl_tests.rs: Added DSL tests covering live-provider and replay behavior
  • tests/examples/simple_agent_tape.toml: Added a replay-only DSL case that reads a recorded tape
  • tests/examples/simple_agent_llama_record.toml: Added a hosted llama recording example through the OpenAI-compatible path
  • tests/examples/record_case_gpt_oss_120b.toml: Added a GPT-OSS 120B recording example through the same path
  • tests/fixtures/simple_agent_recorded.tape.json: Added a recorded tape fixture used by replay flows

Execution Flow

[Image: execution flow diagram]

Example Commands

Record a live run:

cargo run -p mofa-cli -- test-dsl tests/examples/simple_agent_llama_record.toml --artifact-out /tmp/simple_agent_llama_record_artifact.json --output json

Replay a recorded run:

cargo run -p mofa-cli -- test-dsl tests/examples/simple_agent_tape.toml

Screenshot

Screenshot from 2026-04-04 23-50-20

Tests

  • cargo test -p mofa-testing --test dsl_tests
  • cargo test -p mofa-cli --test test_dsl_integration_tests
  • Manual recording run against an OpenAI-compatible provider
  • Manual replay run using tests/examples/simple_agent_tape.toml

Notes

  • The main value of this PR is replay support
  • The examples exist to exercise and demonstrate the new recording/replay path
  • Provider secrets remain local and are not part of the committed changes
