Skip to content

vivekkrishna/agentic-validation-skills

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Agentic Validation Skills

Guidelines and skills for agentic AI test automation, based on the CIGE standard — a structured test case format designed for self-healing, validation-driven agentic test automation.

A CLAUDE.md file and Claude Code skill that improve how AI agents author, execute, and recover from agentic test cases.

The Problem

Traditional test formats (BDD, POM, AAA) were designed for deterministic systems with rigid step sequences. When an AI agent runs these tests:

  • Intent collapses into procedures — when the UI changes, the whole test breaks even if the goal hasn't changed
  • No guardrails — agents optimize for test completion, not safe test completion
  • Failure is binary — pass/fail with no classification, so agents can't decide whether to self-heal, escalate, or retry
  • Context is assumed — agents explore blindly instead of operating within a defined decision space
  • Token waste — loading the full test upfront dilutes attention and increases cost

The Solution: CIGE

The CIGE standard separates four concerns that traditional test formats conflate:

Component Role Stability
Context System under test, environment, tools, preconditions Changes with environment
Intent The single outcome the test must confirm Stable — survives system changes
Guardrails Explicit constraints on agent behavior Stable — safety boundaries
Execution Adaptive guidance for reaching the goal Mutable — self-heals with the system

Every agentic test case is expressed as:

{
  "Context": "E-commerce checkout service, staging environment, user pre-authenticated with test account, payment gateway mocked",
  "Intent": "User can successfully place an order and receive an order confirmation",
  "Guardrails": [
    "Do not submit orders with real payment credentials",
    "Do not mutate production order records",
    "Halt if checkout flow exits the staging domain"
  ],
  "Execution": [
    "Navigate to cart with at least one item",
    "Proceed through checkout flow to order summary",
    "Submit order using test payment credentials",
    "Verify: order confirmation page is displayed with a valid order ID"
  ]
}

The Four Principles

1. Establish Context Before Testing

Narrow the agent's decision space. Never test into the void.

Declare system, environment, tools, and state before writing a single execution step. Missing context causes wasted exploration and flaky results.

2. Anchor Every Test to Intent

Intent is the stable core. Never collapse it into steps.

Write intent as one outcome-focused sentence. It must survive UI changes, infrastructure migrations, and workflow refactors. The question is always: "Did the agent safely achieve the intended outcome?" — not "Did it follow every step?"

3. Define Guardrails Before Executing

Bounded autonomy is safe autonomy. No guardrails = no execution.

Declare what the agent must never do before any test runs. Guardrails are constraints, not suggestions. An agent that violates a guardrail to complete a test has not passed the test.

4. Treat Execution as Adaptive Guidance

Steps are starting points. Goal fidelity is the finish line.

Execution steps are guidance, not scripts. When the system changes, execution self-heals. But only after classifying the failure:

Failure Type Recovery
Product Defect Escalate — preserve intent
Outdated Test Logic Update execution only — intent must not change
Infrastructure Failure Restore context — do not alter test definition

Never rewrite intent to make a failing test pass.

Reported Impact

From production use of the CIGE standard:

  • ~90% reduction in test maintenance effort
  • ~38% reduction in false positives
  • ~62% reduction in average test execution time

Install

Option A: Claude Code Plugin (recommended)

/plugin marketplace add vivekkrishna/agentic-validation-skills
/plugin install agentic-validation-skills

Once installed, invoke the skill with:

/agentic-validation-skills:cige-guidelines

Option B: CLAUDE.md (per-project)

New project:

curl -o CLAUDE.md https://raw.githubusercontent.com/vivekkrishna/agentic-validation-skills/main/CLAUDE.md

Existing project (append):

echo "" >> CLAUDE.md
curl https://raw.githubusercontent.com/vivekkrishna/agentic-validation-skills/main/CLAUDE.md >> CLAUDE.md

How to Know It's Working

These guidelines are working if you see:

  • Tests survive system changes — execution updates but intent never rewrites
  • Failures are classified, not just reported — agents know whether to self-heal, escalate, or retry
  • Guardrail violations surface before execution — not discovered mid-run
  • Context is always explicit — no blind exploration, no assumed state
  • Shorter, more focused test runs — progressive context disclosure reduces token waste

Customization

Merge CLAUDE.md with your project-specific test guidelines:

## Project-Specific Test Context

- Staging URL: https://staging.example.com
- Test account credentials are in Vault at secret/test-accounts
- Payment gateway mock is enabled by default in staging
- Do not test against the orders-v1 endpoint — deprecated, use orders-v2

Background

CIGE was proposed as an alternative to free-form agentic test definitions. The full rationale is in the original article: CIGE: An Agentic AI Test Case Standard Proposed.

The core insight: modern AI agents operate contextually, not mechanically — just like human testers. Test definitions should reflect that. Rigid step sequences served deterministic automation. Structured, intent-driven formats serve agentic automation.

License

MIT

About

Agentic test automation skills for test case authoring, context, intent, self healing, failure classification, guardrails and progressive context disclosure

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors