- `spec` is the living contract.
- `session` is the durable evidence ledger.
- `handoff` is transport, not source of truth.
- `review` is the adversarial completion gate.
You execute autonomously inside the contract. You do not close the task unchallenged.
scafld init
scafld plan <task-id> --title "Title" --size small --risk low
scafld harden <task-id>
scafld harden <task-id> --mark-passed
scafld validate <task-id>
scafld approve <task-id>
scafld build <task-id>
scafld review <task-id>
scafld complete <task-id>
scafld status <task-id>
scafld list
scafld report
scafld handoff <task-id>
scafld update
For real review: scafld review <task-id> --provider {codex|claude|command}.
--provider local is smoke-test only and cannot satisfy complete.
Only an operator may use scafld review <task-id> --human-reviewed --reason ....
Inside the scafld repo, use ./bin/scafld or go run ./cmd/scafld. Do not use
a copied compiled binary; stale binaries can report old lifecycle state.
plan -> harden -> approve -> build -> review -> complete
Hardening attacks the draft. Review attacks the result.
Build opens one phase at a time. After implementing the opened phase, run
scafld build <task-id> again to record evidence and advance.
Do not:
- Edit outside declared scope, objectives, or invariants.
- Reconstruct lifecycle state by scraping Markdown. Use `status --json`.
- Mutate `.scafld/core/` by hand. Use `scafld update`.
- Run `--provider local` for real review.
- Cite files, commands, or review findings you have not verified.
.scafld/prompts/* overrides .scafld/core/prompts/* overrides built-ins.
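The precedence rule above can be read as a first-match lookup. The following is a minimal sketch of that reading, not scafld's actual resolver: `resolve_prompt` and `PROMPT_DIRS` are hypothetical names, and returning `None` to mean "use the built-in prompt" is an assumption.

```python
from pathlib import Path
from typing import Optional

# Search order taken from the rule above: workspace override first, then core.
PROMPT_DIRS = (".scafld/prompts", ".scafld/core/prompts")


def resolve_prompt(root: Path, name: str) -> Optional[Path]:
    """Return the highest-precedence prompt file, or None for the built-in.

    Hypothetical helper for illustration; it is not part of the scafld CLI.
    """
    for rel_dir in PROMPT_DIRS:
        candidate = root / rel_dir / name
        if candidate.is_file():
            return candidate
    return None  # caller falls back to the built-in prompt
```

Under this reading, adding `.scafld/prompts/plan.md` shadows the core copy without touching `.scafld/core/`.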
Canonical reference for AI coding agents working in the runx OSS workspace.
This repo uses scafld for non-trivial work, but the architecture rules are the
runx rules in CONVENTIONS.md, docs/rust-kernel-architecture.md, and
docs/trusted-kernel-package-truth.md.
Key files:
- `.scafld/config.yaml` - Validation rules, rubric weights, safety controls, profiles
- `.scafld/prompts/plan.md` - Planning mode prompt
- `.scafld/prompts/exec.md` - Execution mode prompt
- `.scafld/core/schemas/spec.json` - Spec validation schema
- `CONVENTIONS.md` - Coding standards and patterns
Spec-driven development: every non-trivial task becomes a machine-readable markdown specification before any code changes happen.
- Plan - Analyze task, explore codebase, generate spec in `.scafld/specs/drafts/`
- Review - Human reviews and approves the spec
- Build - Agent executes approved spec with validation
- Complete - Completed specs are marked through the scafld lifecycle
The spec is the contract. Operate autonomously within its bounds; pause for approval on deviations.
For detailed planning instructions, read .scafld/prompts/plan.md. For execution, read .scafld/prompts/exec.md.
draft → approved → review → completed
↓ ↓ ↓
(edit) failed cancelled
Valid transitions:
- `draft` → `approved` → `review` → `completed`
- active work can move to `failed` or `cancelled`
- blocked work must be recorded in the spec state and handoff
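As a rough illustration of the transitions above, the sketch below encodes them as a lookup. It is not how the scafld CLI implements them, and it must not replace the CLI for real lifecycle changes. The set of "active" states is an assumption (here `approved` and `review`), and `can_transition` is a hypothetical name.

```python
# Forward path taken from the transition list above.
FORWARD = {"draft": "approved", "approved": "review", "review": "completed"}

# Assumption: "active work" means a spec that is approved or under review.
ACTIVE = {"approved", "review"}
EXIT_STATES = {"failed", "cancelled"}


def can_transition(current: str, target: str) -> bool:
    """Return True if moving from `current` to `target` is allowed."""
    if FORWARD.get(current) == target:
        return True
    return current in ACTIVE and target in EXIT_STATES
```

Skipping a step, such as `draft` straight to `review`, is rejected because only the next state on the forward path is accepted.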
These rules must not be violated. See config.yaml for the canonical invariant list.
Rust owns trusted local execution, receipt sealing, runtime policy, harness replay, MCP, payment gates, and sandbox planning. TypeScript packages may wrap or present those paths, but must not reintroduce local execution fallback logic.
Pure crates and packages stay pure. runx-core, runx-contracts,
runx-parser, and runx-receipts must not import filesystem, network,
subprocess, CLI, adapter, or runtime concerns.
Public contract changes require a clean cutover through Rust-owned schemas and
fixtures. Do not add compatibility aliases, .v2 ids, or dual-read runtime
shims for governed wire shapes.
Official skills that drive stateful hosted apps emit generic effect transition
packets. Put product identity in effect_family and the runner/action in
operation; do not add product-specific AuthorityResourceFamily variants or
runx.<product>.* packet namespaces. Stateful app memory belongs in the hosted
stateful-effect substrate and its declared reducers/views, not in OSS core
enums or bespoke runtime branches.
No dual-reads, dual-writes, or runtime fallbacks. When changing schemas or identifiers, adopt the new scheme immediately. Use one-off migration scripts, not runtime code.
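A one-off migration in this spirit might look like the sketch below. Everything specific here is hypothetical: the field names `task_ref` and `task_id`, the JSON-array file layout, and the helper names are invented for illustration. The point is that the rename happens once, out of band, and is safe to re-run, so runtime code only ever reads the new field.

```python
import json
from pathlib import Path

OLD_KEY = "task_ref"  # hypothetical legacy identifier field
NEW_KEY = "task_id"   # hypothetical replacement field


def migrate_record(record: dict) -> dict:
    """Return a copy of `record` that uses only the new key.

    Applying it twice gives the same result, which makes the script idempotent.
    """
    migrated = dict(record)
    if OLD_KEY in migrated:
        legacy = migrated.pop(OLD_KEY)
        migrated.setdefault(NEW_KEY, legacy)
    return migrated


def migrate_file(path: Path) -> int:
    """Rewrite a JSON array file in place; return how many records changed."""
    records = json.loads(path.read_text())
    migrated = [migrate_record(record) for record in records]
    changed = sum(1 for old, new in zip(records, migrated) if old != new)
    if changed:
        path.write_text(json.dumps(migrated, indent=2))
    return changed
```

A second run reports zero changed records and leaves the file untouched.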
Long-running agent workflows are loops over governed turns, not resident kernel
loops. The loop host lives in an app, hosted service, local script, or external
orchestrator. It owns scheduling, durable loop state, wakeups, projections, and
stop policy. A runx turn is one skill or graph run with explicit inputs,
authority, allowed_tools, optional context_skills, bounded model/tool
rounds, approval gates, and one sealed receipt.
Handoffs are receipt-backed artifacts or tool-shaped results. Prior receipts and
skill context are untrusted data for the next turn, not new authority. Do not add
loop-specific authority families, packet namespaces, product branches, or
schedulers to runx-core; build residency outside the kernel over ordinary runx
submissions.
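The split between loop host and governed turn can be sketched as follows. This is an assumed shape, not the runx API: `TurnRequest`, `Receipt`, `run_loop`, the `submit` callable, and the `done` output key are all hypothetical stand-ins. The sketch shows the host owning the stop policy and passing prior output forward only as input data, with authority fixed by the host.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TurnRequest:
    """One governed turn. Field names follow the prose; the shape is assumed."""
    skill: str
    inputs: dict
    authority: tuple[str, ...]
    allowed_tools: tuple[str, ...]
    max_rounds: int


@dataclass(frozen=True)
class Receipt:
    """Stand-in for a sealed receipt; real sealing happens in the kernel."""
    receipt_id: str
    output: dict


# Hypothetical submission function: one request in, one receipt out.
SubmitTurn = Callable[[TurnRequest], Receipt]


def run_loop(submit: SubmitTurn, skill: str, authority: tuple[str, ...],
             allowed_tools: tuple[str, ...], max_turns: int,
             max_rounds: int = 4) -> list[Receipt]:
    """Host-side loop: scheduling, loop state, and stop policy live here."""
    receipts: list[Receipt] = []
    prior: dict = {}
    for _ in range(max_turns):  # stop policy is owned by the host
        request = TurnRequest(
            skill=skill,
            inputs={"prior": prior},   # prior output is untrusted data only
            authority=authority,       # never widened from a receipt
            allowed_tools=allowed_tools,
            max_rounds=max_rounds,
        )
        receipt = submit(request)
        receipts.append(receipt)
        if receipt.output.get("done"):  # hypothetical completion signal
            break
        prior = receipt.output
    return receipts
```

Nothing in the loop lives inside the kernel: each iteration is an ordinary submission that yields one receipt.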
Configuration from environment or secrets management, never hardcoded. No secrets in code, logs, or diffs.
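One way to follow this rule is to read every required setting from the environment and fail loudly when it is absent. The sketch below assumes that approach; `require_env` and `MissingConfigError` are hypothetical names, not part of runx or scafld.

```python
import os


class MissingConfigError(RuntimeError):
    """Raised when a required setting is absent from the environment."""


def require_env(name: str) -> str:
    """Read a required setting from the environment, never from source."""
    value = os.environ.get(name)
    if not value:
        # Report the variable name only; never echo secret values.
        raise MissingConfigError(f"missing required environment variable: {name}")
    return value
```

The error message names the variable but never includes a value, so it is safe to log.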
No test fixtures, mocks, or conditional test-only logic in production code. Test utilities stay in dedicated test helper modules.
Always use the scafld CLI for spec lifecycle management. Never manually move, copy, or rename spec files between directories. Never manually change the status field. The CLI enforces validation, state transitions, and the review gate — bypassing it breaks the audit trail.
- When: Starting a new task, exploring requirements
- Actions: Search, read, analyze (NO code changes outside `.scafld/specs/`)
- Output: Markdown spec in `.scafld/specs/drafts/` with status `draft`
- Prompt: Read `.scafld/prompts/plan.md` before entering this mode
- When: Spec has status `approved`
- Actions: Apply changes, run acceptance criteria, record `scafld build` evidence
- Output: Code changes, validation results, updated spec
- Prompt: Read `.scafld/prompts/exec.md` before entering this mode
- Autonomy: Execute all phases without pausing unless blocked, deviating from spec, or hitting a destructive action not covered by spec
For trivial changes (typos, single-line fixes), skip the spec workflow and work directly.
- When: Build has passed and status is `review`
- Actions: Run `scafld review`, then `scafld complete` only after the native review gate passes
- Output: Review verdict recorded in the spec and available through `scafld status` / `scafld handoff`
- Prompt: Read `.scafld/prompts/review.md` before entering this mode
- Mandate: Find problems, not confirm success. A review that finds zero issues still needs grounded evidence from the changed files, validation commands, and spec scope.
Validation profiles (light, standard, strict) and their check pipelines are defined in config.yaml. Agents select a profile based on task.acceptance.validation_profile or derive from task.risk_level (low→light, medium→standard, high→strict).
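The selection rule can be sketched as a small function. It assumes the task is available as a dictionary mirroring `task.acceptance.validation_profile` and `task.risk_level`. The name `select_profile` and the choice to raise on unknown values are assumptions, not documented scafld behavior.

```python
# Mapping stated in the text: low→light, medium→standard, high→strict.
RISK_TO_PROFILE = {"low": "light", "medium": "standard", "high": "strict"}
KNOWN_PROFILES = set(RISK_TO_PROFILE.values())


def select_profile(task: dict) -> str:
    """Prefer the explicit profile, else derive one from the risk level."""
    explicit = task.get("acceptance", {}).get("validation_profile")
    if explicit is not None:
        if explicit not in KNOWN_PROFILES:
            raise ValueError(f"unknown validation profile: {explicit}")
        return explicit
    risk = task.get("risk_level")
    if risk not in RISK_TO_PROFILE:
        raise ValueError(f"cannot derive profile from risk level: {risk!r}")
    return RISK_TO_PROFILE[risk]
```

An explicit profile wins over the derived one, so a low-risk task can still opt into `strict`.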
Per-phase: Run configured checks after each phase completes.
Pre-commit: Run full validation pipeline before marking task complete.
Self-evaluation: Score work on the rubric (defined in config.yaml). The threshold is 7/10; perform a second pass if the score falls below it.
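The threshold check can be illustrated with a weighted average. The real criteria and weights live in config.yaml and are not reproduced here, so the criterion names in the usage note are hypothetical. Treating the rubric as a weighted mean on a 0 to 10 scale is also an assumption.

```python
THRESHOLD = 7.0  # from the text: 7/10


def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-criterion scores, each on a 0-10 scale."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("rubric weights must sum to a positive number")
    weighted_sum = sum(scores[name] * weight for name, weight in weights.items())
    return weighted_sum / total_weight


def needs_second_pass(scores: dict[str, float], weights: dict[str, float]) -> bool:
    """True when the score is strictly below the threshold."""
    return weighted_score(scores, weights) < THRESHOLD
```

With hypothetical weights `{"correctness": 2, "clarity": 1}`, scores of 8 and 5 average to exactly 7.0, which meets the threshold, while scores of 6 and 8 average to about 6.67 and trigger a second pass.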
Defined in config.yaml under safety. Key rules:
Require approval for: Schema migrations, public API changes, data deletion, production deployments.
Automatically prevent: Hardcoded secrets, unbounded queries, SQL injection, XSS vulnerabilities.
See CONVENTIONS.md for full coding standards. Key points:
- Match existing code style; keep diffs focused
- Prefer existing helpers; keep code DRY
- Explicit named imports, no confusing aliases
- Clear module ownership; split mixed responsibility files when boundaries are already visible in the code
- Idempotent one-off migrations executed out of band, never hidden runtime compatibility paths
Only commit when explicitly asked by the user.
Format: type(scope): title (conventional commits)
Types: feat, fix, refactor, docs, test, chore, perf, style
Rules:
- One logical change per commit
- Title under 72 characters
- Include what changed and why in the body
- No unrelated edits bundled together
- Pre-commit: code builds, tests pass, no secrets in diff, no debug code
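The format and type rules above lend themselves to a quick header check. This sketch is illustrative rather than a hook the repo ships. `check_header` is a hypothetical name, a scope is assumed to be required, and the 72-character limit is applied conservatively to the whole header line.

```python
import re

# Types listed in the text.
TYPES = ("feat", "fix", "refactor", "docs", "test", "chore", "perf", "style")
HEADER_RE = re.compile(r"^(?P<type>[a-z]+)\((?P<scope>[^()\s]+)\): (?P<title>\S.*)$")
MAX_HEADER_LEN = 72  # assumption: the limit covers the whole first line


def check_header(header: str) -> list[str]:
    """Return a list of problems; an empty list means the header passes."""
    match = HEADER_RE.match(header)
    if match is None:
        return ["header does not match type(scope): title"]
    problems = []
    if match["type"] not in TYPES:
        problems.append(f"unknown type: {match['type']}")
    if len(header) >= MAX_HEADER_LEN:
        problems.append("header is not under 72 characters")
    return problems
```

For example, `check_header("fix(cli): report stale lifecycle state")` returns an empty list, while a header with an unlisted type such as `wip` is reported.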
Progress updates: Report phase completion, acceptance criteria pass/fail counts, next action. Keep it concise - no verbose preambles.
When blocked: State what's blocked, brief error, one recommendation, resolution options.
Final summary: Phases completed, acceptance results, self-evaluation score, deviations, files changed.
| Path | Purpose |
|---|---|
| `.scafld/config.yaml` | Validation, rubric, safety, profiles |
| `.scafld/prompts/plan.md` | Planning mode instructions |
| `.scafld/prompts/exec.md` | Execution mode instructions |
| `.scafld/prompts/review.md` | Adversarial review mode instructions |
| `.scafld/core/schemas/spec.json` | Spec JSON schema |
| `.scafld/specs/` | Task specs by lifecycle status |
| `.scafld/runs/` | Session ledger, diagnostics, and handoffs |
| `CONVENTIONS.md` | Coding standards |
# CLI (manages status, validation, file moves)
scafld plan <task-id> # scaffold a markdown spec in drafts/
scafld list # show all specs
scafld status <task-id> # show details + phase progress
scafld validate <task-id> # check against schema
scafld approve <task-id> # approve the draft spec
scafld build <task-id> # run validation and move to review when checks pass
scafld exec <task-id> # execute configured task actions when used
scafld review <task-id> # run native review provider
scafld complete <task-id> # record review verdict and complete the task
scafld handoff <task-id> # render markdown handoff
scafld fail <task-id> # mark failed
scafld cancel <task-id> # mark cancelled
scafld report # aggregate stats across all specs