docs: VALIDATION.md — the Charter's own falsification protocol #9

Merged: hartsock merged 1 commit into main from docs/validation-protocol on Jun 17, 2026
@hartsock

Member

A doctrine that can't be falsified is theology with a build system. Adds the empirical protocol, designed so the Charter can lose.

  • Two-sided bar: ablation (removal must measurably degrade the targeted failure) and net-positive (the tax — false refusals, lost completions — stays bounded). Null hypothesis per invariant; keep it only if H₀ is rejected and the cost is bounded.
  • Apparatus (labeled trap suite, ablation arms, N runs, blind, independent novice) + benefit/tax metric table.
  • Experiments E1 refusal/injection (E1a deterministic leash ablation — runnable now; E1b LLM-in-the-loop), E2 scar learning-curve, E3 novice defect-catch, E4 tether; reach deferred. Each with hypothesis, method, prediction, and an explicit falsification condition.
  • Threats to validity incl. the tautology guard (false-refusal + task-success are what make E1a honest).

The Charter submits to its own invariants — refusal, novice, scar — and earns each of its seven only by the cost of its absence. risk: low (docs).
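
To make the two-sided bar concrete, here is a minimal sketch of the keep/drop decision for one invariant. It is an illustration under stated assumptions, not the procedure in VALIDATION.md: the names `ArmResult`, `fisher_one_sided`, and `keep_invariant` are hypothetical, the choice of a one-sided Fisher exact test is mine, and the thresholds `alpha=0.05` and `max_tax=0.05` are placeholders rather than values from the protocol.

```python
from dataclasses import dataclass
from math import comb


@dataclass(frozen=True)
class ArmResult:
    """Counts from one arm of a hypothetical ablation experiment."""
    trap_runs: int       # runs on labeled trap tasks
    trap_failures: int   # trap runs where the targeted failure occurred
    benign_runs: int     # runs on benign tasks
    false_refusals: int  # benign runs that were wrongly refused


def fisher_one_sided(fail_ablated: int, n_ablated: int,
                     fail_control: int, n_control: int) -> float:
    """One-sided Fisher exact p-value.

    Probability, under H0 (removing the invariant changes nothing), that the
    ablated arm shows at least `fail_ablated` failures given the pooled total.
    """
    total_fail = fail_ablated + fail_control
    total = n_ablated + n_control
    upper = min(total_fail, n_ablated)
    tail = sum(
        comb(total_fail, k) * comb(total - total_fail, n_ablated - k)
        for k in range(fail_ablated, upper + 1)
    )
    return tail / comb(total, n_ablated)


def keep_invariant(control: ArmResult, ablated: ArmResult,
                   alpha: float = 0.05, max_tax: float = 0.05) -> bool:
    """Keep the invariant only if H0 is rejected AND the tax stays bounded.

    `alpha` and `max_tax` are assumed placeholder thresholds.
    """
    # Ablation side: does removal measurably increase the targeted failure?
    p_value = fisher_one_sided(ablated.trap_failures, ablated.trap_runs,
                               control.trap_failures, control.trap_runs)
    # Net-positive side: extra false refusals the invariant causes.
    tax = (control.false_refusals / control.benign_runs
           - ablated.false_refusals / ablated.benign_runs)
    return p_value < alpha and tax <= max_tax


# Made-up counts, for illustration only.
with_invariant = ArmResult(trap_runs=50, trap_failures=2,
                           benign_runs=50, false_refusals=1)
without_invariant = ArmResult(trap_runs=50, trap_failures=20,
                              benign_runs=50, false_refusals=0)
decision = keep_invariant(with_invariant, without_invariant)
```

Both conditions are required: an invariant that blocks every trap but also refuses a large share of benign tasks fails the second check, which is the role the false-refusal metric plays in keeping an ablation like E1a from being a tautology.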

A doctrine that can't be falsified is theology with a build system. Add the
empirical protocol, designed so the Charter can LOSE:

- The two-sided bar: ablation (removal must measurably degrade the targeted
  failure) AND net-positive (the tax — false refusals, lost completions — must
  stay bounded). Null hypothesis per invariant; keep it only if H0 is rejected
  and the cost is bounded.
- Apparatus (labeled trap suite, ablation arms, N runs, blind, independent
  novice) and the benefit/tax metric table.
- Experiments E1 refusal/injection (E1a deterministic leash ablation — runnable
  now; E1b LLM-in-the-loop), E2 scar learning-curve, E3 novice defect-catch,
  E4 tether; reach deferred. Each with hypothesis, method, prediction, and
  explicit falsification condition.
- Threats to validity incl. the tautology guard (false-refusal + task-success are
  what make E1a honest).

The Charter submits to its own invariants: refusal (it can be declined), novice
(challenged), scar (results recorded, wins and losses). It earns each of its
seven only by the cost of its absence. README points at it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@hartsock merged commit 1a85c40 into main on Jun 17, 2026
1 check passed