TraceCore v1.0 is the normative specification for deterministic agent execution. It supersedes v0.1 and promotes all provisional language to normative requirements. It defines the contracts that any conforming runtime, harness, or tooling MUST satisfy in order to emit TraceCore-compliant artifacts. This document is intentionally formal; all explanatory or marketing content lives outside /spec/.
- Establish a deterministic execution standard for autonomous agent systems.
- Enable independent teams to build interoperable runtimes and tooling that exchange TraceCore artifacts without loss of fidelity.
- Provide auditors with a portable evidence format that can be validated offline.
- Guarantee that any artifact claiming TraceCore compliance carries sufficient metadata for independent replay and budget verification.
- Episode: A bounded execution of an agent against a task, parameterized by `agent_ref`, `task_ref`, `seed`, `budgets`, and `runtime_identity`.
- Task: A deterministic harness defined by `setup.py`, `actions.py`, `validate.py`, and `task.toml`, including sandbox allowlists and budgets.
- Budget: The set of quantitative limits (`steps`, `tool_calls`, optional `wall_clock_seconds`) that guarantee termination.
- Validator: Deterministic code that inspects task state and emits binary verdicts and structured payloads used to classify termination.
- Artifact: The canonical JSON record of an episode, conforming to `/spec/artifact-schema-v1.0.json`.
- Batch: A collection of episodes executed under a bounded parallel worker pool with shared timeout and spec enforcement policies.
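
As an illustration only, the episode parameters defined above could be modeled with frozen dataclasses so they cannot change once an episode starts. The class names `Budget` and `EpisodeInputs`, the field types, and every example value are assumptions made for this sketch; the definitions above fix only the field names.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)
class Budget:
    """Quantitative limits that guarantee termination."""
    steps: int
    tool_calls: int
    wall_clock_seconds: Optional[float] = None  # optional per the Budget definition


@dataclass(frozen=True)
class EpisodeInputs:
    """Hypothetical container for the parameters of one episode."""
    agent_ref: str
    task_ref: str
    seed: int  # assumption: the seed is an integer
    budgets: Budget
    runtime_identity: str


# All values below are placeholders, not identifiers defined by the spec.
inputs = EpisodeInputs(
    agent_ref="agents/example.py",
    task_ref="tasks/example",
    seed=42,
    budgets=Budget(steps=10, tool_calls=5),
    runtime_identity="example-runtime",
)
```

Assigning to any field of `inputs` after construction raises `FrozenInstanceError`, which is one way to make accidental mutation during an episode fail loudly.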
Every TraceCore episode MUST execute the following ordered stages. Deviations are non-compliant.
- Resolve identities: Freeze `agent_ref`, `task_ref`, `task_version`, `seed`, `budgets`, and `spec_version`. These inputs are immutable for the duration of the episode.
- Load deterministic task: Materialize the task harness described by `task_ref`. Hashes of task files MUST be recorded in the artifact (`task_hash`).
- Initialize agent: Call the reference interface with the immutable task specification and budgets.
- Interact: Repeat the observe → act → execute loop while budgets remain. Observations MUST include remaining budget deltas.
- Validate: Invoke the validator after each action (inline) or at declared checkpoints. Validators MUST emit `{ "ok": bool, "terminal": bool, "details": object }`.
- Terminate: Emit `termination_reason` and `failure_type` drawn from the canonical taxonomy defined in `docs/core.md`.
- Compute timing: Record `wall_clock_elapsed_s` as the duration from episode start to artifact finalization.
- Persist artifact: Write the artifact atomically and compute its content hash (`artifact_hash`) for integrity verification.
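
The Interact, Validate, and Terminate stages can be sketched roughly as follows. This is a minimal illustration, not the reference interface: `run_episode`, its callable parameters, and the toy task are hypothetical, only the `steps` budget is tracked, and the `termination_reason` and `failure_type` strings are placeholders because the canonical taxonomy lives in `docs/core.md`.

```python
from typing import Callable, Dict, List


def run_episode(
    agent: Callable[[dict], str],
    execute: Callable[[str], str],
    validate: Callable[[], dict],
    budgets: Dict[str, int],
) -> dict:
    """Run observe -> act -> execute until a terminal verdict or budget exhaustion."""
    remaining = dict(budgets)  # copy so the caller's frozen budgets stay untouched
    trace: List[dict] = []
    verdict = {"ok": False, "terminal": False, "details": {}}
    last_result = None

    while remaining["steps"] > 0:  # no action runs once the budget reaches zero
        observation = {"last_result": last_result, "remaining": dict(remaining)}
        action = agent(observation)
        last_result = execute(action)
        remaining["steps"] -= 1
        trace.append({"observation": observation, "action": action, "result": last_result})
        verdict = validate()  # inline validation after each action
        if verdict["terminal"]:
            break

    # Placeholder labels; real values come from the taxonomy in docs/core.md.
    if verdict["terminal"] and verdict["ok"]:
        reason, failure = "success", None
    elif verdict["terminal"]:
        reason, failure = "validator_failed", "validation"
    else:
        reason, failure = "budget_exhausted", "budget"
    return {"termination_reason": reason, "failure_type": failure, "action_trace": trace}


# Toy task: the validator reports success once the counter reaches 3.
state = {"count": 0}


def toy_execute(action: str) -> str:
    state["count"] += 1
    return f"count={state['count']}"


def toy_validate() -> dict:
    done = state["count"] >= 3
    return {"ok": done, "terminal": done, "details": {"count": state["count"]}}


result = run_episode(lambda obs: "increment", toy_execute, toy_validate, {"steps": 5})
```

With a budget of five steps the toy episode stops after the third action because the validator returns a terminal verdict; with a budget of two it would end on budget exhaustion instead.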
- Seeded determinism: Given identical inputs (agent code, task version, seed, budgets, runtime identity) the runtime MUST emit the same sequence of actions, observations, and validator decisions.
- Replay determinism: Implementations MUST provide a means to rerun an artifact (`--replay`, bundle verification) and produce identical outcomes or a structured incompatibility reason.
- Tool mocking: Any external IO (network, filesystem outside allowlists) MUST be mocked or blocked so traces remain reproducible.
- Model version pinning: LLM or model dependencies MUST record provider, model identifier, and shim version in the artifact. Non-pinned models are non-compliant.
- Clock control: Timestamps MUST be UTC ISO8601. `wall_clock_elapsed_s` MUST be computed as the difference between `completed_at` and `started_at`.
- Non-compliance: Detectable nondeterminism (e.g., diverging traces across replays) MUST be surfaced as `failure_type=non_deterministic` or equivalent rejection.
- Parallel isolation: When running episodes in batch mode, each episode MUST execute in an isolated process context; state MUST NOT leak between workers.
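
To make the seeded and replay requirements concrete, here is a small sketch of a divergence check between an original trace and its replay. The function names and the shape of the returned report are hypothetical, and a toy list of integers stands in for a real action trace.

```python
import random
from typing import List, Optional


def run_trace(seed: int, steps: int) -> List[int]:
    """Toy episode: every random choice comes from one locally seeded generator."""
    rng = random.Random(seed)  # local generator; global random state is never touched
    return [rng.randint(0, 9) for _ in range(steps)]


def first_divergence(original: List[int], replay: List[int]) -> Optional[int]:
    """Return the index of the first differing entry, or None if the traces match."""
    for index, (a, b) in enumerate(zip(original, replay)):
        if a != b:
            return index
    if len(original) != len(replay):
        return min(len(original), len(replay))  # one trace is a strict prefix
    return None


def check_replay(original: List[int], replay: List[int]) -> dict:
    """Report a match, or surface the divergence as non_deterministic."""
    index = first_divergence(original, replay)
    if index is None:
        return {"ok": True}
    return {"ok": False, "failure_type": "non_deterministic", "diverged_at": index}


same = check_replay(run_trace(7, 5), run_trace(7, 5))
different = check_replay([1, 2, 3], [1, 9, 3])
```

Here `same` reports a match because both traces use the same seed, while `different` reports a divergence at index 1.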
- Artifacts MUST conform to `/spec/artifact-schema-v1.0.json`.
- Every artifact MUST include `spec_version`, `task_hash`, `agent_ref`, `runtime_identity`, `wall_clock_elapsed_s`, and the full `action_trace`.
- Each trace entry MUST record observation, action, result, IO audit, and budget deltas.
- Validator payloads MUST be embedded verbatim.
- Artifacts MUST embed a deterministic `artifact_hash` computed as SHA-256 over the stable (non-volatile) serialized payload.
- Artifacts MUST be immutable once written; updates require a new artifact with a distinct `run_id`.
- `wall_clock_elapsed_s` MUST be a non-negative float representing total episode wall time in seconds.
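
One possible way to compute `artifact_hash` and write the artifact atomically is sketched below. Two things here are assumptions rather than requirements stated above: the set of fields treated as volatile, and the canonical serialization (sorted keys, compact separators). The function names are hypothetical.

```python
import hashlib
import json
import os
import tempfile

# Assumption: the list of volatile fields is illustrative, not taken from the schema.
VOLATILE_FIELDS = {"artifact_hash", "started_at", "completed_at", "wall_clock_elapsed_s"}


def compute_artifact_hash(artifact: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the non-volatile fields."""
    stable = {k: v for k, v in artifact.items() if k not in VOLATILE_FIELDS}
    payload = json.dumps(stable, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def write_artifact_atomically(artifact: dict, path: str) -> str:
    """Write to a temp file in the target directory, then rename it into place."""
    record = dict(artifact, artifact_hash=compute_artifact_hash(artifact))
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(record, handle, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
    except BaseException:
        os.unlink(tmp_path)  # never leave a partial file behind
        raise
    return record["artifact_hash"]
```

Under these assumptions, two artifacts that differ only in timing fields or key order hash to the same value, and a reader never observes a half-written file at `path`.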
- Batch runs MUST accept a configurable `--workers` bound (default: min(cpu_count, 8)).
- Each worker process MUST be isolated (spawn context; no shared state from parent).
- Per-job timeout enforcement MUST produce `failure_type=timeout` artifacts rather than silent drops.
- Batch results MUST include aggregate statistics: total, passed, failed, P50/P95 wall-clock.
- `--strict-spec` MUST propagate to all workers in a batch run.
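
The batch requirements might be approached along these lines using Python's `multiprocessing` spawn context. This is a sketch under stated assumptions: the function names are hypothetical, a job counts as passed when its `failure_type` is `None`, percentiles use the nearest-rank definition, and the timeout is measured from the moment each result is awaited, which only approximates a true per-job deadline.

```python
import math
import multiprocessing as mp
import os
from typing import Callable, List, Optional


def default_workers() -> int:
    """Default worker bound: min(cpu_count, 8)."""
    return min(os.cpu_count() or 1, 8)


def percentile(values: List[float], pct: float) -> Optional[float]:
    """Nearest-rank percentile; returns None for an empty batch."""
    if not values:
        return None
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def aggregate(results: List[dict]) -> dict:
    """Summarize a batch: counts plus P50/P95 of wall_clock_elapsed_s."""
    times = [r["wall_clock_elapsed_s"] for r in results]
    passed = sum(1 for r in results if r.get("failure_type") is None)
    return {
        "total": len(results),
        "passed": passed,
        "failed": len(results) - passed,
        "p50_s": percentile(times, 50),
        "p95_s": percentile(times, 95),
    }


def run_batch(job: Callable[[dict], dict], specs: List[dict],
              workers: Optional[int] = None, timeout_s: float = 60.0) -> dict:
    """Run each job in a spawned process; a timed-out job becomes a timeout record."""
    ctx = mp.get_context("spawn")  # fresh interpreter, no state inherited from parent
    results = []
    # maxtasksperchild=1 gives every job its own worker process.
    with ctx.Pool(processes=workers or default_workers(), maxtasksperchild=1) as pool:
        pending = [pool.apply_async(job, (spec,)) for spec in specs]
        for handle in pending:
            try:
                results.append(handle.get(timeout=timeout_s))
            except mp.TimeoutError:
                # Record the timeout instead of silently dropping the job.
                results.append({"failure_type": "timeout", "wall_clock_elapsed_s": timeout_s})
    return aggregate(results)
```

Because the spawn start method re-imports the main module in each child, `job` has to be a picklable top-level function and `run_batch` should be called from under an `if __name__ == "__main__":` guard. Leaving the `with` block terminates the pool, which also stops any worker still running a timed-out job.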
An implementation is TraceCore v1.0 compliant if and only if it:
- Executes episodes according to the lifecycle defined above.
- Enforces budgets strictly: no action may execute once any budget reaches zero.
- Emits artifacts that validate against `artifact-schema-v1.0.json`.
- Records deterministic identifiers (`task_hash`, `agent_ref`, `spec_version`, `wall_clock_elapsed_s`).
- Declares determinism guarantees per `/spec/determinism.md`.
- Provides a compliance flag (`--strict-spec`) that validates artifacts before reporting success.
- Documents any optional extensions; extensions MUST NOT alter required fields or semantics.
- Supports batch parallel execution with process isolation and aggregate reporting.
See /spec/compliance-checklist-v0.1.md for an auditable checklist; it is scheduled to be updated for v1.0 in the next minor release.
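
As a rough picture of what a strict pre-report check might cover, the sketch below tests only the required fields named in this document. It is not a substitute for validating against the JSON schema; the function name, the error strings, the acceptance of integers as JSON numbers, and the assumption that `action_trace` is a list are all hypothetical.

```python
from typing import List

# Required fields as listed in the artifact requirements of this document.
REQUIRED_FIELDS = (
    "spec_version", "task_hash", "agent_ref",
    "runtime_identity", "wall_clock_elapsed_s", "action_trace",
)


def strict_spec_errors(artifact: dict) -> List[str]:
    """Return human-readable problems; an empty list means these checks passed."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in artifact]
    elapsed = artifact.get("wall_clock_elapsed_s")
    if elapsed is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(elapsed, bool) or not isinstance(elapsed, (int, float)):
            errors.append("wall_clock_elapsed_s must be a number")
        elif elapsed < 0:
            errors.append("wall_clock_elapsed_s must be non-negative")
    # Assumption: the action trace is serialized as a JSON array.
    if "action_trace" in artifact and not isinstance(artifact["action_trace"], list):
        errors.append("action_trace must be a list")
    return errors


example = {
    "spec_version": "tracecore-spec-v1.0",
    "task_hash": "placeholder",
    "agent_ref": "placeholder",
    "runtime_identity": "placeholder",
    "wall_clock_elapsed_s": 0.5,
    "action_trace": [],
}
problems = strict_spec_errors(example)
```

A runtime following this approach would report success only when `problems` is empty and full schema validation has also passed.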
- Defining agent cognition strategies, prompt templates, or reasoning stacks.
- Mandating a particular programming language or framework for runtimes.
- Providing probabilistic or "best effort" determinism grades.
- Allowing mutable or partial artifacts.
- The spec uses independent semantic versioning (`MAJOR.MINOR.PATCH`).
  - `MAJOR` increments for breaking changes to artifacts, lifecycle, or compliance rules.
  - `MINOR` increments for additive requirements or optional extensions.
  - `PATCH` increments for errata that do not change behavior.
- Runtime/package versions (e.g., `tracecore 0.9.x`) MUST declare the spec version they implement (e.g., `tracecore-spec 1.0`). These numbers are intentionally decoupled.
- Any runtime claiming compliance MUST embed `spec_version` in every artifact and expose it via CLI (`tracecore version`) and API metadata.
- New spec versions MUST ship updated schema, checklist, and determinism documents within `/spec/`.
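
The increment rules above can be illustrated with a small helper that classifies the difference between two spec versions. Both function names are hypothetical, and the sketch assumes plain numeric `MAJOR.MINOR.PATCH` strings without prefixes or pre-release tags.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    parts = text.split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"not MAJOR.MINOR.PATCH: {text!r}")
    major, minor, patch = (int(part) for part in parts)
    return major, minor, patch


def bump_kind(old: str, new: str) -> str:
    """Name the most significant component that changed between two versions."""
    before, after = parse_version(old), parse_version(new)
    if after <= before:
        raise ValueError("new version must be greater than the old one")
    if after[0] != before[0]:
        return "MAJOR"  # breaking change to artifacts, lifecycle, or compliance rules
    if after[1] != before[1]:
        return "MINOR"  # additive requirement or optional extension
    return "PATCH"      # errata with no behavior change


kinds = [bump_kind("1.0.0", "2.0.0"), bump_kind("1.0.0", "1.1.0"), bump_kind("1.0.0", "1.0.1")]
```

Tooling could use a check like this to decide whether an artifact produced under an older spec version needs re-validation.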
- Added `wall_clock_elapsed_s` as a REQUIRED artifact field (was absent in v0.1).
- Promoted all "SHALL" and "SHOULD" to "MUST" throughout (was provisional in v0.1).
- Added Section 6: Batch Execution Requirements.
- Added Section 10: Changelog.
- Schema promoted to `artifact-schema-v1.0.json`.
- `spec_version` pattern updated to match `tracecore-spec-v1.0`.