Skip to content

docs(flue): update implementation doc + AGENTS.md + PR description fo…

ec41145
Select commit
Loading
Failed to load commit list.
Draft

feat(flue): port skill-drift system to Flue framework (side-by-side) #127

docs(flue): update implementation doc + AGENTS.md + PR description fo…
ec41145
Select commit
Loading
Failed to load commit list.
@sentry/warden / warden: security-review completed May 20, 2026 in 20m 38s

2 issues

security-review: Found 2 issues (2 medium)

Medium

Prompt injection via PR-controlled SDK checkout lets manipulated LLM agent write to sentry-for-ai skills - `.flue/agents/skill-drift-detector.ts:50-62`

The detector agent runs with sandbox: 'local' and receives the SDK repo checkout path in its prompt, so adversarial content in a merged PR's files (source comments, README, AGENTS.md, etc.) can manipulate the LLM to emit a crafted create_pr patch that the actuate job then applies to getsentry/sentry-for-ai using a privileged GitHub App token. Move agent execution to an isolated sandbox without access to the PR checkout, or run the detector only against the PR diff (via gh pr diff) without providing a local filesystem path.

Also found at:

  • .flue/roles/detector.md:43-46
  • .github/workflows/flue-skill-drift-detector-reusable.yml:213-220
  • docs/agent-port/04-flue-implementation.md:31-32
  • docs/agent-port/sdk-repo-wrappers/sentry-android.yml:19
LLM-generated patches that create new files bypass the `skills/` path allowlist - `.github/workflows/flue-skill-drift-detector-reusable.yml:211`

The path-allowlist guard uses git diff --name-only (working-tree vs. index), which does not list untracked files created by git apply (run without --index). A patch that adds a new file outside skills/ — e.g., .github/workflows/evil.yml — produces an empty touched variable, passes the allowlist loop, and then gets staged and committed by the subsequent git add -A, opening a path to committing arbitrary content into sentry-for-ai.


⏱ 18m 12s · 2.1M in / 239.4k out · $7.99

Annotations

Check warning on line 62 in .flue/agents/skill-drift-detector.ts

See this annotation in the file changed.

@sentry-warden sentry-warden / warden: security-review

Prompt injection via PR-controlled SDK checkout lets manipulated LLM agent write to sentry-for-ai skills

The detector agent runs with `sandbox: 'local'` and receives the SDK repo checkout path in its prompt, so adversarial content in a merged PR's files (source comments, README, `AGENTS.md`, etc.) can manipulate the LLM to emit a crafted `create_pr` patch that the actuate job then applies to `getsentry/sentry-for-ai` using a privileged GitHub App token. Move agent execution to an isolated sandbox without access to the PR checkout, or run the detector only against the PR diff (via `gh pr diff`) without providing a local filesystem path.

Check warning on line 46 in .flue/roles/detector.md

See this annotation in the file changed.

@sentry-warden sentry-warden / warden: security-review

[L28-7CY] Prompt injection via PR-controlled SDK checkout lets manipulated LLM agent write to sentry-for-ai skills (additional location)

The detector agent runs with `sandbox: 'local'` and receives the SDK repo checkout path in its prompt, so adversarial content in a merged PR's files (source comments, README, `AGENTS.md`, etc.) can manipulate the LLM to emit a crafted `create_pr` patch that the actuate job then applies to `getsentry/sentry-for-ai` using a privileged GitHub App token. Move agent execution to an isolated sandbox without access to the PR checkout, or run the detector only against the PR diff (via `gh pr diff`) without providing a local filesystem path.

Check warning on line 220 in .github/workflows/flue-skill-drift-detector-reusable.yml

See this annotation in the file changed.

@sentry-warden sentry-warden / warden: security-review

[L28-7CY] Prompt injection via PR-controlled SDK checkout lets manipulated LLM agent write to sentry-for-ai skills (additional location)

The detector agent runs with `sandbox: 'local'` and receives the SDK repo checkout path in its prompt, so adversarial content in a merged PR's files (source comments, README, `AGENTS.md`, etc.) can manipulate the LLM to emit a crafted `create_pr` patch that the actuate job then applies to `getsentry/sentry-for-ai` using a privileged GitHub App token. Move agent execution to an isolated sandbox without access to the PR checkout, or run the detector only against the PR diff (via `gh pr diff`) without providing a local filesystem path.

Check warning on line 32 in docs/agent-port/04-flue-implementation.md

See this annotation in the file changed.

@sentry-warden sentry-warden / warden: security-review

[L28-7CY] Prompt injection via PR-controlled SDK checkout lets manipulated LLM agent write to sentry-for-ai skills (additional location)

The detector agent runs with `sandbox: 'local'` and receives the SDK repo checkout path in its prompt, so adversarial content in a merged PR's files (source comments, README, `AGENTS.md`, etc.) can manipulate the LLM to emit a crafted `create_pr` patch that the actuate job then applies to `getsentry/sentry-for-ai` using a privileged GitHub App token. Move agent execution to an isolated sandbox without access to the PR checkout, or run the detector only against the PR diff (via `gh pr diff`) without providing a local filesystem path.

Check warning on line 19 in docs/agent-port/sdk-repo-wrappers/sentry-android.yml

See this annotation in the file changed.

@sentry-warden sentry-warden / warden: security-review

[L28-7CY] Prompt injection via PR-controlled SDK checkout lets manipulated LLM agent write to sentry-for-ai skills (additional location)

The detector agent runs with `sandbox: 'local'` and receives the SDK repo checkout path in its prompt, so adversarial content in a merged PR's files (source comments, README, `AGENTS.md`, etc.) can manipulate the LLM to emit a crafted `create_pr` patch that the actuate job then applies to `getsentry/sentry-for-ai` using a privileged GitHub App token. Move agent execution to an isolated sandbox without access to the PR checkout, or run the detector only against the PR diff (via `gh pr diff`) without providing a local filesystem path.

Check warning on line 211 in .github/workflows/flue-skill-drift-detector-reusable.yml

See this annotation in the file changed.

@sentry-warden sentry-warden / warden: security-review

LLM-generated patches that create new files bypass the `skills/` path allowlist

The path-allowlist guard uses `git diff --name-only` (working-tree vs. index), which does not list untracked files created by `git apply` (run without `--index`). A patch that adds a new file outside `skills/` — e.g., `.github/workflows/evil.yml` — produces an empty `touched` variable, passes the allowlist loop, and then gets staged and committed by the subsequent `git add -A`, opening a path to committing arbitrary content into `sentry-for-ai`.