Skip to content

fix(agents): Explore agent type is *specifically* not to be used for code review#2098

Open
ben-alkov wants to merge 2 commits into
fullsend-ai:mainfrom
ben-alkov:feat/review-agent-parallel-specialized-subagents
Open

fix(agents): Explore agent type is *specifically* not to be used for code review#2098
ben-alkov wants to merge 2 commits into
fullsend-ai:mainfrom
ben-alkov:feat/review-agent-parallel-specialized-subagents

Conversation

@ben-alkov

@ben-alkov ben-alkov commented Jun 9, 2026

Copy link
Copy Markdown
Member

2nd EDIT:

In working this PR, I had to carefully review the Claude Code docs. This commit is an outcome of that review.

  • There is no such thing as a "subagent_type" - it's an hallucination. The only existing fields are listed in https://code.claude.com/docs/en/sub-agents#supported-frontmatter-fields
    • Agent definitions are not composable
    • The builtin agents (e.g. Explore) are defined in exactly the same way as custom agents
    • There is no mechanism for "subclassing" agents
  • The recommended way to restrict subagents is using one or both of tools/disallowedTools
  • permissionMode: dontAsk (unintuitively) forbids Claude Code from executing any action which would normally require user approval
  • There is no such thing as "run_in_background" - another hallucination
  • prompt is under-specified: it does not exist in the list of frontmatter fields mentioned above, but it can be used when invoking the --agents command line flag.
    initialPrompt exists, but its description is a little fuzzy. Going conservative here.

EDIT: This change also requires a supporting mechanism for blocking writes to "/sandbox/workspace/target-repo".

After much discussion, we've settled on a harness step which runs chmod -R a-w immediately after the copy takes place (obviously an agent could chmod it back, but at least we'd be able to analyze the session trace(s) to know why).


The review subagents are being invoked incorrectly; subagent_type doesn't exist. It's a complete hallucination

@ben-alkov ben-alkov force-pushed the feat/review-agent-parallel-specialized-subagents branch from 61695e7 to 878d86c Compare June 9, 2026 20:28
@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown

Site preview

Preview: https://cd93ba24-site.fullsend-ai.workers.dev

Commit: c1d2b494eb6a1cfd8993d6768e781bed14344cf4

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 9, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 8:30 PM UTC · Completed 8:38 PM UTC
Commit: 27b1874 · View workflow run →

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 9, 2026

Copy link
Copy Markdown

Review

Findings

Medium

  • [edge-case] internal/cli/run.go:693 — The chmod command chmod -R a-w <repo> && chmod -R u+w <repo>/.git removes write for all users then restores owner-write on .git/. The policy sets run_as_user: sandbox and run_as_group: sandbox. Files uploaded via sandbox.UploadDir are likely owned by root. The u+w flag restores write for the file owner (root), not for the sandbox user. Git operations needing write to .git/ (e.g., index.lock) will fail for the agent process running as sandbox.
    Remediation: Use chmod -R a+w on .git/ instead of u+w, or chown -R sandbox:sandbox the repo before chmod.

  • [fail-open] internal/scaffold/fullsend-repo/policies/review.yaml:13 — Setting include_workdir: false is intended to deny write access to the working directory, but line 13 grants read_write: [/sandbox, /tmp, /dev/null]. The target repo resides at /sandbox/workspace/<repo-name>. The /sandbox read_write grant independently permits writes to the entire workspace tree, potentially neutralizing the include_workdir: false change. Whether this is an actual bypass depends on how the sandbox runtime resolves conflicting grants.
    Remediation: Narrow the read_write grant to specific subdirectories needed rather than the entire /sandbox tree, or add the target repo path to read_only.

Low

  • [architectural-conflict] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The pr-review skill explicitly acknowledges its departure from ADR-0018 (lines 13–19) and calls for a superseding ADR. This PR does not deepen the departure — it fixes the sub-agent dispatch mechanism (replacing hallucinated subagent_type with real Agent tool parameters). The architectural tension predates this PR.

  • [sandbox-escape] internal/cli/run.go:693 — The chmod compensating control is trivially reversible by the file owner. However, sub-agents are restricted to tools: Read, Grep, Glob (no Bash, Write, Edit) via the Agent tool's tools parameter, and permissionMode: dontAsk auto-denies unlisted tools. See also: [fail-open] finding at this location.

  • [error-handling] internal/cli/run.go:695 — When the chmod command fails, the code logs a warning and continues. Since readonly_repo: true is an explicit security opt-in, silently proceeding with a writable repo may violate the user's security intent.

Info

  • [permission-manifest-change] internal/harness/harness.go:218 — New ReadonlyRepo field defaults to false (Go zero value). Fail-closed — opt-in only.

  • [permission-manifest-change] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:362 — Sub-agent dispatch changed from hallucinated subagent_type: Explore to tools: Read, Grep, Glob with permissionMode: dontAsk. Net improvement for defense-in-depth.

  • [design-smell] internal/cli/run.go — The readonly_repo implementation uses chmod at the harness execution layer (steering) rather than the sandbox policy layer (enforcement).

  • [design-smell] internal/scaffold/fullsend-repo/policies/review.yamlinclude_workdir: false and readonly_repo: true create two overlapping mechanisms. The policy change is enforcement; chmod is supplemental steering.

Previous run

Review

Findings

Medium

  • [logic-error] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:522 — After the PR inserts a new step 2 in the challenger dispatch ("Extract model and name from frontmatter"), the subsequent steps shift: old step 3→4, old step 4→5, old step 5→6. However, step 5 ("Consume the challenger's output") still contains "fall back per step 5 below" — which is now a self-reference instead of pointing to step 6 (the fallback handler). This stale cross-reference could cause the orchestrator LLM to look for fallback instructions in the wrong step.
    Remediation: Update the cross-reference from "fall back per step 5 below" to "fall back per step 6 below".

  • [fail-open] internal/scaffold/fullsend-repo/policies/review.yaml:11 — Setting include_workdir: false is intended to deny write access to the working directory, but line 13 grants read_write: [/sandbox, /tmp, /dev/null]. Since the target repo lives under /sandbox/workspace/target-repo, the /sandbox read_write grant may independently permit writes to the entire workspace, potentially neutralizing the include_workdir: false change. The interaction between these two fields when include_workdir is false depends on the external sandbox enforcement layer's precedence rules, which are not documented here.
    Remediation: Narrow the read_write list to exclude /sandbox/workspace/target-repo, or confirm with the sandbox enforcement layer that include_workdir: false takes precedence over a broader read_write grant.

  • [architectural-conflict] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The pr-review skill explicitly acknowledges its departure from ADR-0018 (lines 13–19) and notes that a superseding ADR is needed. This PR does not introduce the conflict — it already exists on main. However, the PR makes the orchestrator's sub-agent dispatch functional (replacing hallucinated parameters with working ones), which deepens reliance on the LLM-based orchestration pattern that ADR-0018 prohibits.
    Remediation: Write a superseding ADR that formally authorizes LLM-based sub-agent orchestration with documented mitigations.

Low

  • [missing-authorization] — This PR makes non-trivial changes (7 files, structural changes to harness, policy, and agent scaffolding) without a linked issue. The PR body explains the rationale but a linked issue would provide formal authorization.

  • [scope-exceeded] internal/scaffold/fullsend-repo/harness/review.yaml — The PR title claims a bug fix for the hallucinated subagent_type but the scope includes adding a new harness-level feature (readonly_repo) with chmod-based filesystem enforcement. The two changes are causally linked but could be separated for clearer review.

  • [edge-case] internal/cli/run.go:693 — The chmod command chmod -R a-w <dir> && chmod -R u+w <dir>/.git removes write permissions for all users, then restores owner-write on .git/. If the sandbox user (run_as_user: sandbox) is not the file owner, u+w restores write for the owner (not the sandbox user), which could break git operations like git log that need index.lock.

  • [sandbox-escape] internal/cli/run.go:693 — The chmod compensating control is trivially reversible by the file owner. With Bash removed from sub-agent tools (tools: Read, Grep, Glob), sub-agents cannot invoke chmod directly. The main review agent still has Bash access. The PR body acknowledges this as steering rather than enforcement.

Info

  • [permission-expansion] internal/harness/harness.go:218 — New ReadonlyRepo field defaults to false (Go zero value). This is fail-closed — the feature is opt-in only.

  • [naming-alignment] internal/harness/harness.go:218ReadonlyRepo uses "Readonly" as one word, while policy files use read_only (two words). Consistent with Go compound naming conventions.

  • [pattern-consistency] internal/scaffold/fullsend-repo/policies/review.yaml:11 — The review policy now sets include_workdir: false, making it the only policy file with this value. Intentional deviation for the read-only review agent.

  • [design-smell] internal/cli/run.go — The readonly_repo implementation uses chmod at the harness execution layer (steering) rather than in the sandbox policy layer (enforcement). The PR body acknowledges this tradeoff.

  • [new_feature_undocumented] internal/harness/harness.go — The PR adds the readonly_repo harness field but no documentation files reference it.

Previous run (2)

Review

Findings

High

  • [logic-error] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:495 — The challenger dispatch in step 6d (lines 495–498) still uses the old parameters subagent_type: Explore and prompt, while the dimension sub-agent dispatch in step 4 has been corrected to use name, tools, permissionMode, background, and initialPrompt. The PR's stated purpose is to fix the hallucinated subagent_type parameter, yet the challenger dispatch site was not updated. The name extraction step added to step 4 ("Extract model and name from frontmatter") is also missing from the challenger dispatch instructions. This will cause the challenger sub-agent to be invoked with non-existent parameters at runtime.
    Remediation: Apply the same parameter corrections from step 4 to the challenger dispatch in step 6d — replace subagent_type: Explore with the corrected parameters (name, tools, model, permissionMode, background, initialPrompt) and add the name extraction instruction.

Medium

  • [privilege-escalation] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — The diff replaces subagent_type: Explore with tools: Read, Grep, Glob, Bash and permissionMode: dontAsk. While the PR correctly identifies that the old parameter was non-functional, the replacement grants sub-agents the Bash tool, which can execute arbitrary shell commands (chmod u+w, echo > file, etc.). The chmod -R a-w compensating control is bypassable because the sandbox user retains file ownership and can restore write permissions. The prior subagent_type: Explore was non-functional (providing zero enforcement), so this PR actually adds a compensating control where none existed before — but the Bash tool remains a broader grant than necessary for read-only review sub-agents.

  • [fail-open] internal/scaffold/fullsend-repo/policies/review.yaml:11 — Setting include_workdir: false is intended to deny write access to the working directory, but the policy at line 13 explicitly grants read_write: [/sandbox, /tmp, /dev/null]. Since the working directory is under /sandbox/workspace/, the read_write entry independently grants write access to the entire workspace. The include_workdir: false change may be neutralized by this broader grant.

Low

  • [edge-case] internal/cli/run.go:693 — If ReadonlyRepo is true, chmod -R a-w removes write permissions from the entire repo tree (excluding .git/). If a sub-agent runs a Bash command that attempts to write to the repo, it will get a permission denied error with no agent-specific diagnostic. This is an unlikely scenario for correctly-behaving read-only sub-agents.

  • [stale-reference] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:314 — Step 3 says "Compose the spawn prompt from three parts:" but the actual composition lists five parts (Parts 1–5). This is a pre-existing inconsistency not introduced by this PR.

  • [sandbox-escape] internal/cli/run.go:693 — The chmod -R a-w compensating control is trivially reversible. The sandbox process runs as user sandbox, and files uploaded by the harness are likely owned by that user, meaning chmod u+w restores write permissions. The chmod failure path is non-fatal (StepWarn only). The PR body acknowledges this limitation.

Info

  • [naming-alignment] internal/harness/harness.go:218ReadonlyRepo uses "Readonly" as one word, while the policy files use read_only (two words) for the analogous concept. Consider ReadOnlyRepo / read_only_repo for consistency.

  • [pattern-inconsistency] internal/scaffold/fullsend-repo/policies/review.yaml:11 — All other policy files (code.yaml, triage.yaml, fix.yaml, prioritize.yaml, retro.yaml) use include_workdir: true. The review policy is now the only exception. This is intentional for the read-only review agent but worth noting.

Previous run (3)

Review

Findings

Medium

  • [edge-case] internal/cli/run.go:644 — The chmod -R a-w (step 8-ro) runs before step 8a (AGENTS.md injection) and step 8a-2 (excludeAgentWorkingDirs). When ReadonlyRepo is true and the target repo lacks an AGENTS.md, sandbox.UploadFile at step 8a will fail because the target directory is read-only. Similarly, excludeAgentWorkingDirs writes to .git/info/exclude and will also fail because .git/ metadata is read-only.
    Remediation: Move the chmod -R a-w block to after step 8a-2, so AGENTS.md injection and git-exclude writes succeed before the filesystem is locked down.

  • [privilege-escalation] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — Changing subagent_type from Explore to general-purpose removes the tool-level read-only constraint on review sub-agents. Explore restricts to read-only tools; general-purpose grants access to file-writing tools (Edit, Write, Bash with write). The two compensating controls are insufficient: (1) chmod -R a-w is reversible by the sandbox user who retains file ownership; (2) include_workdir: false is neutralized by the explicit read_write: [/sandbox] entry which grants write access to the entire /sandbox tree.
    Remediation: Either keep Explore and address its limitation through other means, or if general-purpose is required, narrow /sandbox from read_write to exclude the target repo path.

Low

  • [fail-open] internal/scaffold/fullsend-repo/policies/review.yaml:11 — Setting include_workdir: false is intended to prevent the agent from writing to its working directory, but the policy's explicit read_write: [/sandbox, /tmp, /dev/null] at line 13 independently grants write access to the entire /sandbox prefix. Since the target repository resides under /sandbox/workspace/<repo>, the include_workdir: false setting may provide no effective protection.
    Remediation: Narrow the read_write entry from /sandbox to specific subdirectories that the agent legitimately needs write access to.

  • [sandbox-escape] internal/cli/run.go:644 — The chmod -R a-w compensating control is detective at best and trivially bypassable. The sandbox process runs as user sandbox, and since the files are uploaded by the setup process, the sandbox user may retain ownership and can chmod u+w.

  • [missing-test] internal/harness/harness.go:218 — The new ReadonlyRepo field has no test coverage. The interaction with subsequent steps (8a, 8a-2) represents a code path that would benefit from testing.

  • [documentation-accuracy] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — The parenthetical comment says "Explore only reads excerpts - not whole files" as the rationale. The PR title says Explore is "specifically not to be used for code review", which is a different reason. If Explore reads whole files (just with write restrictions), this rationale is inaccurate.

  • [naming-convention] internal/cli/run.go:642 — Step comment uses 8-ro which breaks the established letter-suffix step numbering convention (8a, 8b, 8a-2, 8c).

Info

  • [pattern-inconsistency] internal/scaffold/fullsend-repo/policies/review.yaml — All other policy files (code.yaml, triage.yaml, fix.yaml, retro.yaml, prioritize.yaml) use include_workdir: true. The review policy is now the only exception.

  • [doc-style] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — Changed from (read-only — sub-agents do not write) to (Explore only reads excerpts - not whole files). The new explanation documents a claimed limitation of Explore rather than the properties of general-purpose.

Previous run (4)

Review

Findings

Medium

  • [edge-case] internal/cli/run.go:644 — The chmod -R a-w (step 8-ro) runs before step 8a (AGENTS.md injection) and step 8a-2 (excludeAgentWorkingDirs). When ReadonlyRepo is true and the target repo lacks an AGENTS.md, sandbox.UploadFile at step 8a will fail because the target directory is read-only. Similarly, excludeAgentWorkingDirs writes to .git/info/exclude and will also fail because .git/ metadata is read-only.
    Remediation: Move the chmod -R a-w block to after step 8a-2, so AGENTS.md injection and git-exclude writes succeed before the filesystem is locked down.

  • [privilege-escalation] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — Changing subagent_type from Explore to general-purpose removes the tool-level read-only constraint on review sub-agents. Explore restricts to read-only tools; general-purpose grants access to file-writing tools (Edit, Write, Bash with write). The two compensating controls are insufficient: (1) chmod -R a-w is reversible by the sandbox user who retains file ownership; (2) include_workdir: false is neutralized by the explicit read_write: [/sandbox] entry which grants write access to the entire /sandbox tree.
    Remediation: Either keep Explore and address its limitation through other means, or if general-purpose is required, narrow /sandbox from read_write to exclude the target repo path.

Low

  • [fail-open] internal/scaffold/fullsend-repo/policies/review.yaml:11 — Setting include_workdir: false is intended to prevent the agent from writing to its working directory, but the policy's explicit read_write: [/sandbox, /tmp, /dev/null] at line 13 independently grants write access to the entire /sandbox prefix. Since the target repository resides under /sandbox/workspace/<repo>, the include_workdir: false setting may provide no effective protection.
    Remediation: Narrow the read_write entry from /sandbox to specific subdirectories that the agent legitimately needs write access to.

  • [sandbox-escape] internal/cli/run.go:644 — The chmod -R a-w compensating control is detective at best and trivially bypassable. The sandbox process runs as user sandbox, and since the files are uploaded by the setup process, the sandbox user may retain ownership and can chmod u+w.

  • [missing-test] internal/harness/harness.go:218 — The new ReadonlyRepo field has no test coverage. The interaction with subsequent steps (8a, 8a-2) represents a code path that would benefit from testing.

  • [documentation-accuracy] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — The parenthetical comment says "Explore only reads excerpts - not whole files" as the rationale. The PR title says Explore is "specifically not to be used for code review", which is a different reason. If Explore reads whole files (just with write restrictions), this rationale is inaccurate.

  • [naming-convention] internal/cli/run.go:642 — Step comment uses 8-ro which breaks the established letter-suffix step numbering convention (8a, 8b, 8a-2, 8c).

Info

  • [pattern-inconsistency] internal/scaffold/fullsend-repo/policies/review.yaml — All other policy files (code.yaml, triage.yaml, fix.yaml, retro.yaml, prioritize.yaml) use include_workdir: true. The review policy is now the only exception.

  • [doc-style] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — Changed from (read-only — sub-agents do not write) to (Explore only reads excerpts - not whole files). The new explanation documents a claimed limitation of Explore rather than the properties of general-purpose.

Previous run (5)

Review

Findings

Medium

  • [edge-case] internal/cli/run.go:644 — The chmod -R a-w operation at step 8-ro runs before step 8a (AGENTS.md injection) and step 8a-2 (excludeAgentWorkingDirs). If ReadonlyRepo is true AND the target repo has no AGENTS.md, the UploadFile at step 8a will fail because the target directory is now read-only. Similarly, excludeAgentWorkingDirs at step 8a-2 writes to .git/info/exclude and will also fail. Additionally, chmod -R a-w makes .git/ metadata read-only, which may break git operations that need .git/index.lock (e.g., git status).
    Remediation: Move the read-only chmod step (8-ro) to after all repo-directory write operations are complete (after step 8a-2 at minimum, and after step 8b if agent-input files target the repo directory). Consider excluding .git/ from the chmod to avoid breaking git operations.

  • [privilege-escalation] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — Changing subagent_type from Explore to general-purpose removes tool-level read-only enforcement on review sub-agents. The PR adds two compensating controls, but neither is fully effective: (1) chmod -R a-w is trivially reversible — the sandbox user retains ownership and can chmod +w it back; (2) include_workdir: false is likely inert because the explicit read_write: [/sandbox] entry in policies/review.yaml still grants write access to the entire /sandbox tree where the target repo resides at /sandbox/workspace/<repo>. The meta-prompt instruction "Do not write any files" is the remaining soft guard.
    Remediation: Either (a) investigate whether the Agent tool's dispatch accepts a disallowedTools parameter to block Write/Edit/NotebookEdit for sub-agents, (b) narrow the read_write policy entry from /sandbox to specific subdirectories that exclude the target repo path, or (c) add the target repo path to the read_only list if the policy engine supports precedence over read_write.

  • [fail-open] internal/scaffold/fullsend-repo/policies/review.yaml:11 — The include_workdir: false change is intended to remove write access to the working directory, but the policy's explicit read_write: [/sandbox, /tmp, /dev/null] at line 13 still grants write access to the entire /sandbox prefix, which includes the target repo at /sandbox/workspace/<repo-name>. This means one of the two claimed compensating controls for the Explore→general-purpose change may provide no effective write restriction, making it inert against the threat it is meant to address.
    Remediation: Narrow the read_write entry from /sandbox to specific subdirectories that exclude the target repo path (e.g., /sandbox/workspace/.env.d, /sandbox/workspace/output), or verify with the sandbox policy engine that include_workdir: false takes precedence over the broader read_write entry for the working directory path.

Low

  • [sandbox-escape] internal/cli/run.go:644 — The chmod compensating control is detective, not preventive — an agent can chmod it back since the sandbox user retains file ownership (as the PR author acknowledges). See also: [privilege-escalation] and [fail-open] findings above.

  • [missing-test] internal/harness/harness.go:218 — The new ReadonlyRepo field has no test coverage. The existing TestLoad_ValidHarness test does not include readonly_repo in its test YAML.

  • [missing-authorization] internal/cli/run.go:644 — This PR modifies security-relevant agent infrastructure across 6 files with no linked issue documenting the functional limitation, security tradeoff analysis, or testing validation. The PR body provides rationale and the requires-manual-review label indicates the author recognizes this needs scrutiny.

  • [naming-convention] internal/cli/run.go:642 — Step comment uses 8-ro which breaks the established letter-suffix step numbering convention used throughout run.go (e.g., 8a, 8b, 8a-2).

Previous run (6)

Review

Findings

Medium

  • [edge-case] internal/cli/run.go:644 — The chmod -R a-w operation at step 8-ro runs before step 8a (AGENTS.md injection) and step 8a-2 (excludeAgentWorkingDirs). If ReadonlyRepo is true AND the target repo has no AGENTS.md, the UploadFile at step 8a will fail because the target directory is now read-only. Similarly, excludeAgentWorkingDirs at step 8a-2 writes to .git/info/exclude and will also fail. Additionally, chmod -R a-w makes .git/ metadata read-only, which may break git operations that need .git/index.lock (e.g., git status).
    Remediation: Move the read-only chmod step (8-ro) to after all repo-directory write operations are complete (after step 8a-2 at minimum, and after step 8b if agent-input files target the repo directory). Consider excluding .git/ from the chmod to avoid breaking git operations.

  • [privilege-escalation] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — Changing subagent_type from Explore to general-purpose removes tool-level read-only enforcement on review sub-agents. The PR adds two compensating controls, but neither is fully effective: (1) chmod -R a-w is trivially reversible — the sandbox user retains ownership and can chmod +w it back; (2) include_workdir: false is likely inert because the explicit read_write: [/sandbox] entry in policies/review.yaml still grants write access to the entire /sandbox tree where the target repo resides at /sandbox/workspace/<repo>. The meta-prompt instruction "Do not write any files" is the remaining soft guard.
    Remediation: Either (a) investigate whether the Agent tool's dispatch accepts a disallowedTools parameter to block Write/Edit/NotebookEdit for sub-agents, (b) narrow the read_write policy entry from /sandbox to specific subdirectories that exclude the target repo path, or (c) add the target repo path to the read_only list if the policy engine supports precedence over read_write.

  • [fail-open] internal/scaffold/fullsend-repo/policies/review.yaml:11 — The include_workdir: false change is intended to remove write access to the working directory, but the policy's explicit read_write: [/sandbox, /tmp, /dev/null] at line 13 still grants write access to the entire /sandbox prefix, which includes the target repo at /sandbox/workspace/<repo-name>. This means one of the two claimed compensating controls for the Explore→general-purpose change may provide no effective write restriction, making it inert against the threat it is meant to address.
    Remediation: Narrow the read_write entry from /sandbox to specific subdirectories that exclude the target repo path (e.g., /sandbox/workspace/.env.d, /sandbox/workspace/output), or verify with the sandbox policy engine that include_workdir: false takes precedence over the broader read_write entry for the working directory path.

Low

  • [sandbox-escape] internal/cli/run.go:644 — The chmod compensating control is detective, not preventive — an agent can chmod it back since the sandbox user retains file ownership (as the PR author acknowledges). See also: [privilege-escalation] and [fail-open] findings above.

  • [missing-test] internal/harness/harness.go:218 — The new ReadonlyRepo field has no test coverage. The existing TestLoad_ValidHarness test does not include readonly_repo in its test YAML.

  • [missing-authorization] internal/cli/run.go:644 — This PR modifies security-relevant agent infrastructure across 6 files with no linked issue documenting the functional limitation, security tradeoff analysis, or testing validation. The PR body provides rationale and the requires-manual-review label indicates the author recognizes this needs scrutiny.

  • [naming-convention] internal/cli/run.go:642 — Step comment uses 8-ro which breaks the established letter-suffix step numbering convention used throughout run.go (e.g., 8a, 8b, 8a-2).

Previous run (7)

Review

Findings

Medium

  • [privilege-escalation] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — Switching subagent_type from Explore to general-purpose removes tool-level read-only enforcement on review sub-agents. The Explore type restricts sub-agents to read-only tools (Read, Grep, Glob, read-only Bash); general-purpose grants access to all tools including Write, Edit, and NotebookEdit. The parent review agent's disallowedTools: Write, Edit, NotebookEdit (review.md lines 9–10) applies only to the parent agent itself — sub-agents dispatched via the Agent tool receive independent tool access determined by their subagent_type. The only remaining write guard is the meta-prompt instruction at meta-prompt.md line 35 ("Do not write any files"), which is a soft prompt instruction, not a tool-level restriction. The PR author's rationale (Explore reads excerpts, not whole files) is valid for the functional concern but does not address the security trade-off of granting write capabilities to sub-agents that process untrusted PR diffs.
    Remediation: Investigate whether a subagent type exists that provides full file reading without granting write tools, or whether sub-agent dispatch can include disallowedTools to restrict Write/Edit/NotebookEdit while using general-purpose for full read access. If neither option exists, document the trade-off explicitly.

Low

  • [privilege-escalation] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:497 — Same subagent_type change for the challenger sub-agent dispatch. Secondary instance of the same issue captured at line 363.

  • [doc-style] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — Inconsistent parenthetical explanations across the 3 change locations. SKILL.md lines 363 and 497 include (Explore only reads excerpts - not whole files) explaining why the old value was wrong, but review.md line 97 has no parenthetical at all. The original comment pattern described the chosen value's properties (e.g., (read-only — sub-agents do not write)); the new comments describe why the old value was wrong without documenting the new value's behavior.

  • [unverifiable-claim] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — The parenthetical (Explore only reads excerpts - not whole files) references Explore agent behavior documented in Claude Code's external documentation, not in this repository. Repo maintainers cannot verify this claim from in-repo sources.

  • [design-smell] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — No in-repo evidence that general-purpose is a valid subagent_type value accepted by the Agent tool. The sub-agent roster table documents model values but not subagent_type.

Previous run (8)

Review

Findings

Medium

  • [privilege-escalation] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — Switching subagent_type from Explore to general-purpose removes tool-level read-only enforcement on review sub-agents. The Explore type restricts sub-agents to read-only tools (Read, Grep, Glob, read-only Bash); general-purpose grants access to all tools including Write, Edit, and NotebookEdit. The parent review agent's disallowedTools: Write, Edit, NotebookEdit (review.md lines 9–10) applies only to the parent agent itself — sub-agents dispatched via the Agent tool receive independent tool access determined by their subagent_type. The only remaining write guard is the meta-prompt instruction at meta-prompt.md line 35 ("Do not write any files"), which is a soft prompt instruction, not a tool-level restriction. The PR author's rationale (Explore reads excerpts, not whole files) is valid for the functional concern but does not address the security trade-off of granting write capabilities to sub-agents that process untrusted PR diffs.
    Remediation: Investigate whether a subagent type exists that provides full file reading without granting write tools, or whether sub-agent dispatch can include disallowedTools to restrict Write/Edit/NotebookEdit while using general-purpose for full read access. If neither option exists, document the trade-off explicitly.

Low

  • [privilege-escalation] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:497 — Same subagent_type change for the challenger sub-agent dispatch. Secondary instance of the same issue captured at line 363.

  • [doc-style] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — Inconsistent parenthetical explanations across the 3 change locations. SKILL.md lines 363 and 497 include (Explore only reads excerpts - not whole files) explaining why the old value was wrong, but review.md line 97 has no parenthetical at all. The original comment pattern described the chosen value's properties (e.g., (read-only — sub-agents do not write)); the new comments describe why the old value was wrong without documenting the new value's behavior.

  • [unverifiable-claim] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — The parenthetical (Explore only reads excerpts - not whole files) references Explore agent behavior documented in Claude Code's external documentation, not in this repository. Repo maintainers cannot verify this claim from in-repo sources.

  • [design-smell] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — No in-repo evidence that general-purpose is a valid subagent_type value accepted by the Agent tool. The sub-agent roster table documents model values but not subagent_type.

Previous run (9)

Review

Findings

Medium

  • [privilege-escalation] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — Changing subagent_type from Explore to general-purpose removes a technical guardrail that enforced read-only access for review sub-agents. The Explore type restricts sub-agents to read-only tools (Read, Grep, Glob, Bash), while general-purpose grants access to ALL tools including Write, Edit, and NotebookEdit. The parent agent's disallowedTools (Write, Edit, NotebookEdit at agents/review.md:9-10) does not propagate to sub-agents. With general-purpose, the only barrier to file writes is the soft prompt instruction "Do not write any files" in meta-prompt.md:35, which is bypassable by adversarial input in PR diffs. The PR's functional motivation is legitimate — sub-agents need full-file reads that Explore apparently cannot provide — but the defense-in-depth regression is real.
    Remediation: Investigate whether a subagent type exists that provides full-file reading without granting write tools. If not, document the security tradeoff and consider whether the parent agent's disallowedTools can be propagated to sub-agents as a compensating control.

  • [privilege-escalation] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:497 — Same privilege escalation applied to the challenger sub-agent dispatch. The challenger processes untrusted findings and diff content; granting it write tool access via general-purpose removes the technical enforcement that Explore provided.
    Remediation: Same as above — retain Explore or introduce a read-only subagent type with full file reading capability.

  • [sandbox-escape] internal/scaffold/fullsend-repo/agents/review.md:97 — The review agent's frontmatter declares disallowedTools: Write, Edit, NotebookEdit (line 9-10) but sub-agents dispatched as general-purpose would have access to those tools. This creates a parent-child permission inconsistency where sub-agents have broader capabilities than the agent that spawned them.
    Remediation: Ensure sub-agents inherit the parent agent's disallowedTools restrictions, or keep sub-agents on a read-only subagent type.

  • [missing-authorization] No linked issue. This PR modifies a security-relevant parameter (subagent_type) in the review orchestrator without a linked issue documenting the functional limitation, security tradeoff analysis, or testing results.
    Remediation: Create an issue documenting the Explore limitation that motivates this change, the security tradeoff analysis, and testing results. Link that issue to this PR.

Low

  • [design-smell] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — The parenthetical comment (Explore only reads excerpts - not whole files) is an unverifiable claim within this repository — no internal documentation defines what Explore vs general-purpose subagent types actually do. The PR body quotes external documentation but that source is not referenced in the repo.

  • [scope-creep] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — No evidence of testing. The PR provides no evidence of failures with the current Explore setting or successful runs with general-purpose. A change affecting how all 7 review sub-agents are spawned warrants validation.

Previous run

Review

Findings

High

  • [stale-reference] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:497 — The diff changes subagent_type from Explore to Default for the parallel sub-agent dispatch (line 363) but does not update the identical subagent_type: Explore instruction for the challenger sub-agent dispatch at line 497 (step 6d). The challenger sub-agent has the same read-full-files requirement as the other sub-agents and would suffer the same truncation problem the PR aims to fix.
    Remediation: Change line 497 from subagent_type: Explore (read-only) to subagent_type: Default, matching the fix applied at line 363.

Low

  • [stale-reference] internal/scaffold/fullsend-repo/agents/review.md:97 — The agent definition file agents/review.md line 97 still says subagent_type: Explore in the skill routing description for pr-review. This contradicts the updated SKILL.md instruction. The line is in a descriptive prose section rather than an executable dispatch instruction, so the risk of runtime impact is low, but it should be kept consistent.
  • [edge-case] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — The new value Default is not wrapped in backtick formatting, unlike all other parameter values in the same list (model, run_in_background, prompt). The original Explore was wrapped in backticks. The lack of backticks means the orchestrator LLM sees Default as prose rather than a literal value, which marginally increases the chance of misinterpretation.
  • [doc-style] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — The explanatory comment (read-only — sub-agents do not write) was removed along with the Explore value. The parenthetical clarified the intent of the subagent_type choice. Since Default presumably does not have the same read-only constraint as Explore, the old comment would be misleading if preserved verbatim, but a replacement comment explaining the rationale for Default would aid future readers.
Previous run (10)

Review

Findings

Medium

  • [privilege-escalation] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — Changing subagent_type from Explore to general-purpose removes a technical guardrail that enforced read-only access for review sub-agents. The Explore type restricts sub-agents to read-only tools (Read, Grep, Glob, Bash), while general-purpose grants access to ALL tools including Write, Edit, and NotebookEdit. The parent agent's disallowedTools (Write, Edit, NotebookEdit at agents/review.md:9-10) does not propagate to sub-agents. With general-purpose, the only barrier to file writes is the soft prompt instruction "Do not write any files" in meta-prompt.md:35, which is bypassable by adversarial input in PR diffs. The PR's functional motivation is legitimate — sub-agents need full-file reads that Explore apparently cannot provide — but the defense-in-depth regression is real.
    Remediation: Investigate whether a subagent type exists that provides full-file reading without granting write tools. If not, document the security tradeoff and consider whether the parent agent's disallowedTools can be propagated to sub-agents as a compensating control.

  • [privilege-escalation] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:497 — Same privilege escalation applied to the challenger sub-agent dispatch. The challenger processes untrusted findings and diff content; granting it write tool access via general-purpose removes the technical enforcement that Explore provided.
    Remediation: Same as above — retain Explore or introduce a read-only subagent type with full file reading capability.

  • [sandbox-escape] internal/scaffold/fullsend-repo/agents/review.md:97 — The review agent's frontmatter declares disallowedTools: Write, Edit, NotebookEdit (line 9-10) but sub-agents dispatched as general-purpose would have access to those tools. This creates a parent-child permission inconsistency where sub-agents have broader capabilities than the agent that spawned them.
    Remediation: Ensure sub-agents inherit the parent agent's disallowedTools restrictions, or keep sub-agents on a read-only subagent type.

  • [missing-authorization] No linked issue. This PR modifies a security-relevant parameter (subagent_type) in the review orchestrator without a linked issue documenting the functional limitation, security tradeoff analysis, or testing results.
    Remediation: Create an issue documenting the Explore limitation that motivates this change, the security tradeoff analysis, and testing results. Link that issue to this PR.

Low

  • [design-smell] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — The parenthetical comment (Explore only reads excerpts - not whole files) is an unverifiable claim within this repository — no internal documentation defines what Explore vs general-purpose subagent types actually do. The PR body quotes external documentation but that source is not referenced in the repo.

  • [scope-creep] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — No evidence of testing. The PR provides no evidence of failures with the current Explore setting or successful runs with general-purpose. A change affecting how all 7 review sub-agents are spawned warrants validation.

Previous run (11)

Review

Findings

High

  • [stale-reference] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:497 — The diff changes subagent_type from Explore to Default for the parallel sub-agent dispatch (line 363) but does not update the identical subagent_type: Explore instruction for the challenger sub-agent dispatch at line 497 (step 6d). The challenger sub-agent has the same read-full-files requirement as the other sub-agents and would suffer the same truncation problem the PR aims to fix.
    Remediation: Change line 497 from subagent_type: Explore (read-only) to subagent_type: Default, matching the fix applied at line 363.

Low

  • [stale-reference] internal/scaffold/fullsend-repo/agents/review.md:97 — The agent definition file agents/review.md line 97 still says subagent_type: Explore in the skill routing description for pr-review. This contradicts the updated SKILL.md instruction. The line is in a descriptive prose section rather than an executable dispatch instruction, so the risk of runtime impact is low, but it should be kept consistent.
  • [edge-case] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — The new value Default is not wrapped in backtick formatting, unlike all other parameter values in the same list (model, run_in_background, prompt). The original Explore was wrapped in backticks. The lack of backticks means the orchestrator LLM sees Default as prose rather than a literal value, which marginally increases the chance of misinterpretation.
  • [doc-style] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:363 — The explanatory comment (read-only — sub-agents do not write) was removed along with the Explore value. The parenthetical clarified the intent of the subagent_type choice. Since Default presumably does not have the same read-only constraint as Explore, the old comment would be misleading if preserved verbatim, but a replacement comment explaining the rationale for Default would aid future readers.

@fullsend-ai-review fullsend-ai-review Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the review comment for full details.

Comment thread internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md Outdated
Comment thread internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md Outdated
@ben-alkov ben-alkov changed the title fix(agents): Explore agent is *specifically* not to be used for code review... fix(agents): Explore agent type is *specifically* not to be used for code review... Jun 9, 2026
@ben-alkov ben-alkov force-pushed the feat/review-agent-parallel-specialized-subagents branch 2 times, most recently from a6bfcd5 to b4d01d4 Compare June 9, 2026 20:57
@fullsend-ai-review

fullsend-ai-review Bot commented Jun 9, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 8:59 PM UTC · Completed 9:11 PM UTC
Commit: 27b1874 · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the review comment for full details.

Comment thread internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md Outdated
Comment thread internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md Outdated
Comment thread internal/scaffold/fullsend-repo/agents/review.md Outdated
Comment thread internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md Outdated
Comment thread internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md Outdated
@rh-hemartin

Copy link
Copy Markdown
Member

Good catch!

@ben-alkov ben-alkov requested a review from ralphbean June 10, 2026 12:46
@ben-alkov

Copy link
Copy Markdown
Member Author

@rh-hemartin, @rbean - FWIW, FYI; The Explore only reads excerpts - not whole files[1] does not show up anywhere in Claude Code's docs - I hit a lucky break with Duckduckgo and found the exact reference.

In my mind, the likelihood of Claude Code discovering this randomly, during a code review, is miniscule, so, by Occam's Chisel (not as nice/precise as the Razor 😁) it must be part of the system prompt[2][3].

TL;DR - it's not an hallucination.

Having said all of that: the security implications are real, and severe. If I can get an answer in PR 192, and the answer is "diffs are the way", then we can stick with Explore type and moot this PR.

[1] Summarizing, not quoting verbatim.
[2] Although I don't know why it wouldn't be in user-facing docs...
[3] ...and the review agent in this PR has never heard of it.

@ralphbean

Copy link
Copy Markdown
Member

Ok, thinking. Explore has the right permission profile (no write tools) but the wrong behavioral disposition (reads excerpts, not whole files). Switching to general-purpose fixes the behavioral problem but opens a privilege escalation surface because disallowedTools on the parent agent doesn't propagate to sub-agents.

Can we solve this at the sandbox layer instead of the agent type layer?

Right now policies/review.yaml has read_write: [/sandbox, /tmp, /dev/null], and the repo checkout lives at /sandbox/workspace/target-repo/ — so the agent can write to it. If landlock supports hierarchical restrictions, we could add the repo path to read_only:

filesystem_policy:
  include_workdir: true
  read_only: [/usr, /lib, /proc, /dev/urandom, /app, /etc, /var/log, /sandbox/workspace/target-repo]
  read_write: [/sandbox, /tmp, /dev/null]

With that, it wouldn't matter what subagent_type we use — the kernel blocks writes to the repo regardless. A prompt injection in a PR diff could convince a sub-agent to try Edit, but the syscall would fail. The rest of /sandbox stays writable for output files (agent-result.json, etc).

Then, we could make sub-agents full "general-purpose" agents.

Does usage of read_only like that actually work?

@ben-alkov

ben-alkov commented Jun 10, 2026

Copy link
Copy Markdown
Member Author

EDIT: Experiment at fullsend-ai/experiments/pull/31

tl;dr - No.


Does usage of read_only like that actually work?

🤷

@maruiz93? 🤣

@ralphbean

Copy link
Copy Markdown
Member

Try it! :D

@ben-alkov ben-alkov force-pushed the feat/review-agent-parallel-specialized-subagents branch from b4d01d4 to bb7b589 Compare June 15, 2026 19:33
@ben-alkov ben-alkov marked this pull request as draft June 15, 2026 19:34
@fullsend-ai-review

fullsend-ai-review Bot commented Jun 15, 2026

Copy link
Copy Markdown

🤖 Review · ❌ Terminated · Started 7:36 PM UTC · Ended 7:49 PM UTC
Commit: 4e21a60 · View workflow run →

@codecov

codecov Bot commented Jun 15, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 6 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
internal/cli/run.go 0.00% 6 Missing ⚠️

📢 Thoughts on this report? Let us know!

@fullsend-ai-review fullsend-ai-review Bot added the requires-manual-review Review requires human judgment label Jun 15, 2026
@fullsend-ai-review

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 7:36 PM UTC · Completed 7:49 PM UTC
Commit: bb7b589 · View workflow run →

@ben-alkov ben-alkov force-pushed the feat/review-agent-parallel-specialized-subagents branch from bb7b589 to 617f74d Compare June 17, 2026 13:17
@ben-alkov ben-alkov marked this pull request as ready for review June 17, 2026 13:18
@fullsend-ai-review

fullsend-ai-review Bot commented Jun 17, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 1:22 PM UTC · Completed 1:39 PM UTC
Commit: 617f74d · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the review comment for full details.

Comment thread internal/cli/run.go Outdated
Comment thread internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md Outdated
Comment thread internal/scaffold/fullsend-repo/policies/review.yaml
@fullsend-ai-review fullsend-ai-review Bot removed the requires-manual-review Review requires human judgment label Jun 17, 2026
@ben-alkov ben-alkov changed the title fix(agents): Explore agent type is *specifically* not to be used for code review... fix(agents): Explore agent type is *specifically* not to be used for code review Jun 17, 2026
@ben-alkov ben-alkov force-pushed the feat/review-agent-parallel-specialized-subagents branch from 617f74d to f4dca6d Compare June 17, 2026 16:17
@fullsend-ai-review

fullsend-ai-review Bot commented Jun 17, 2026

Copy link
Copy Markdown

🤖 Review · ❌ Terminated · Started 4:21 PM UTC · Ended 4:35 PM UTC
Commit: 4e21a60 · View workflow run →

@ben-alkov ben-alkov self-assigned this Jun 17, 2026

@fullsend-ai-review fullsend-ai-review Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the review comment for full details.

Comment thread internal/cli/run.go Outdated
Comment thread internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md Outdated
Comment thread internal/scaffold/fullsend-repo/policies/review.yaml
Comment thread internal/cli/run.go Outdated
Comment thread internal/harness/harness.go
Comment thread internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md Outdated
Comment thread internal/cli/run.go Outdated
@fullsend-ai-review

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 4:21 PM UTC · Completed 4:35 PM UTC
Commit: f4dca6d · View workflow run →

@ben-alkov ben-alkov force-pushed the feat/review-agent-parallel-specialized-subagents branch from f4dca6d to 4f25a42 Compare June 17, 2026 17:49
@fullsend-ai-review

fullsend-ai-review Bot commented Jun 17, 2026

Copy link
Copy Markdown

🤖 Review · ⚠️ Cancelled · Started 5:53 PM UTC · Ended 5:56 PM UTC
Commit: 4e21a60 · View workflow run →

@ben-alkov ben-alkov force-pushed the feat/review-agent-parallel-specialized-subagents branch from 4f25a42 to 222769c Compare June 17, 2026 17:55
@fullsend-ai-review

fullsend-ai-review Bot commented Jun 17, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 6:00 PM UTC · Completed 6:19 PM UTC
Commit: 222769c · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the review comment for full details.

Comment thread internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md
Comment thread internal/scaffold/fullsend-repo/policies/review.yaml
Comment thread internal/cli/run.go
Comment thread internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md Outdated
Comment thread internal/cli/run.go
Comment thread internal/harness/harness.go
Comment thread internal/scaffold/fullsend-repo/policies/review.yaml

@ralphbean ralphbean left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo at review.md:98 — pre-review should be pr-review. The challenger dispatch at step 6d (SKILL.md ~495) still uses the old params too. And I'm not confident the new parameter names are right — see inline.

- `run_in_background`: `true`
- `prompt`: composed from parts 1–5
the Agent tool's `model` parameter)
- `permissionMode`: `dontAsk`

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[important] Are these parameter names verified against a live Agent tool call? The schema I can see has prompt, run_in_background, subagent_type, description — the new ones (name, tools, permissionMode, background, initialPrompt) don't appear. If we're swapping one set of unverified params for another, we should confirm they actually work before merging.

Also — the challenger dispatch at step 6d (~line 495) still uses the old subagent_type: Explore / prompt params. Worth updating to match.

@ben-alkov ben-alkov Jun 17, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified in the Anthropic docs https://code.claude.com/docs/en/sub-agents#supported-frontmatter-fields

EDIT: I keep forgetting that you guys don't necessarily read commit messages 😁

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which schema?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated challenger

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Agent tool's own JSON schema — you can see it in any Claude Code session by looking at what parameters the Agent tool accepts. It takes description, prompt, subagent_type, run_in_background, resume, and isolation.

The page you linked (code.claude.com/docs/en/sub-agents#supported-frontmatter-fields) lists frontmatter fields for agent definition .md files — name, tools, permissionMode, initialPrompt, etc. These go in the .md file's YAML header, not in the Agent tool call.

So there are two interfaces here, and I think the PR is crossing them:

  • Agent tool call: the orchestrator calls the Agent tool with prompt, subagent_type, run_in_background
  • Sub-agent definition frontmatter: the .md file declares name, tools, permissionMode, etc.

The old SKILL.md used valid Agent tool params (subagent_type: Explore, run_in_background: true, prompt). The new version uses frontmatter fields as if they were Agent tool params.

ADR 0027 in this repo covers the distinction — tools in frontmatter restricts what a subagent can access, but it's a property of the definition file, not something you pass to the Agent tool.

If the goal is to restrict sub-agent tools to Read, Grep, Glob (which I think is a good idea), we could add tools: [Read, Grep, Glob] to the sub-agent .md frontmatter and keep using valid Agent tool params for the invocation.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the goal is to restrict sub-agent tools to Read, Grep, Glob (which I think is a good idea), we could add tools: [Read, Grep, Glob] to the sub-agent .md frontmatter

Agreed, but I'd take it a step further: IMO, all of that argument marshalling in the skill is wasted tokens anyway; Claude Code knows how to invoke subagents (using the Agent tool, which presumably Claude Code knows how to populate from the subagent definitions) - all we need to do is instruct it to build up the prompt the way we want.

Comment thread internal/scaffold/fullsend-repo/agents/review.md Outdated
@ben-alkov ben-alkov force-pushed the feat/review-agent-parallel-specialized-subagents branch from 222769c to e17c1a8 Compare June 17, 2026 19:58
@fullsend-ai-review

fullsend-ai-review Bot commented Jun 17, 2026

Copy link
Copy Markdown

🤖 Review · ⚠️ Cancelled · Started 8:02 PM UTC · Ended 8:03 PM UTC
Commit: 4e21a60 · View workflow run →

@ben-alkov ben-alkov force-pushed the feat/review-agent-parallel-specialized-subagents branch 2 times, most recently from d8c8385 to 533d579 Compare June 17, 2026 20:04
@fullsend-ai-review

fullsend-ai-review Bot commented Jun 17, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 8:08 PM UTC · Completed 8:25 PM UTC
Commit: 533d579 · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the review comment for full details.

Comment thread internal/scaffold/fullsend-repo/policies/review.yaml
Comment thread internal/scaffold/fullsend-repo/harness/review.yaml
Comment thread internal/cli/run.go
Comment thread internal/cli/run.go
Comment thread internal/harness/harness.go
Comment thread internal/harness/harness.go
Comment thread internal/scaffold/fullsend-repo/policies/review.yaml
Comment thread internal/cli/run.go
Comment thread internal/harness/harness.go
In working this PR, I had to *carefully* review the Claude Code docs. *This* commit is an outcome of that review.

- There is no such thing as a "subagent_type" - it's an hallucination. The only existing fields are listed in https://code.claude.com/docs/en/sub-agents#supported-frontmatter-fields
  - Agent definitions *are not composable*
  - The builtin agents (e.g. Explore) are defined in *exactly* the same way as custom agents
  - There is no mechanism for "subclassing" agents
- The recommended way to restrict subagents is using one or both of `tools`/`disallowedTools`
- `permissionMode`: `dontAsk` (unintuitively) forbids Claude Code from executing any action *which would normally require user approval*
- There is no such thing as "run_in_background" - another hallucination
- `prompt` is under-specified: it *does not* exist in the list of frontmatter fields mentioned above, but it *can* be used when invoking the `--agents` command line flag. `initialPrompt` exists, but its description is a little fuzzy. Going conservative here.

Signed-off-by: Ben Alkov <ben.alkov@redhat.com>
Review agents should not modify the target repository

- add harness field `readonly_repo` (boolean) which enforces `chmod -R a-w` on the cloned repo, after copy
- tighten the review agent's sandbox policy to exclude workdir writes

Assisted-by: Claude Code (Opus 4.6)
Signed-off-by: Ben Alkov <ben.alkov@redhat.com>
@ben-alkov ben-alkov force-pushed the feat/review-agent-parallel-specialized-subagents branch from 533d579 to c1d2b49 Compare June 18, 2026 14:29
@fullsend-ai-review

fullsend-ai-review Bot commented Jun 18, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 2:33 PM UTC · Completed 2:48 PM UTC
Commit: c1d2b49 · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot added the requires-manual-review Review requires human judgment label Jun 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

requires-manual-review Review requires human judgment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants