feat(flue): port skill-drift system to Flue framework (side-by-side) by HazAT · Pull Request #127 · getsentry/sentry-for-ai

HazAT · 2026-05-12T15:29:12Z

Caution

This PR still depends on the @mistralai/mistralai supply-chain mitigation work; keep the PR in Draft until the advisory requirement is fully satisfied.

Summary

Ports Flue skill-drift from a centralized weekly scheduler in this repo to an inverted architecture: each SDK repo runs its own per-PR detector trigger and invokes a shared reusable workflow in sentry-for-ai.

What's in this PR

Reusable Detector workflow in this repo: .github/workflows/flue-skill-drift-detector-reusable.yml
19 SDK-repo wrapper templates in docs/agent-port/sdk-repo-wrappers/
Updater + Creator as local CLI tools only:
- ./scripts/test-flue-updater.sh
- ./scripts/test-flue-creator.sh
Full Flue project scaffold and supply-chain mitigation (@mistralai/mistralai) remain intact

Architecture

┌───────────────────────────────────────────────────────────────┐
│ Per-SDK-repo workflow (e.g. getsentry/sentry-android)         │
│ on: pull_request: types: [closed]                             │
│ if: pull_request.merged == true                               │
│ uses: getsentry/sentry-for-ai/.github/workflows/                │
│   flue-skill-drift-detector-reusable.yml@main                 │
└───────────────────┬───────────────────────────────────────────┘
                    │ workflow_call with skill_name, sdk_repo,
                    │ pr_number, pr_url
                    ▼
┌───────────────────────────────────────────────────────────────┐
│ Reusable workflow in getsentry/sentry-for-ai                  │
│ • detect job: checkout SDK + skills repo, run Flue agent,     │
│   output JSON actions array                                    │
│ • actuate job: apply patches, open PRs/issues in              │
│   getsentry/sentry-for-ai via GitHub App token                │
└───────────────────────────────────────────────────────────────┘
                    │ (skill-drift labeled PR opens)
                    ▼
┌───────────────────────────────────────────────────────────────┐
│ skill-drift-assign-reviewers.yml (unchanged)                  │
│ Routes the PR to the right SDK team based on changed paths    │
└───────────────────────────────────────────────────────────────┘

Separately (local-only, no CI trigger):
┌───────────────────────────────────────────────────────────────┐
│ Updater & Skill Creator                                        │
│ Invoked via ./scripts/test-flue-updater.sh and                │
│ ./scripts/test-flue-creator.sh                                │
│ Edits files locally; human reviews and opens PR manually       │
└───────────────────────────────────────────────────────────────┘

How to test locally

Updater: ./scripts/test-flue-updater.sh [--issue <N>|--fixture]
Creator: ./scripts/test-flue-creator.sh <platform> [prompt]
Detector:
- ./scripts/test-flue-detector.sh <skill_name> <sdk_repo> <pr_number> [sdk_repo_path]
- or run a normal SDK PR merge when wrappers are onboarded

Pending follow-ups

Complete per-repo detector rollout by adding the wrapper workflow to the 19 SDK repos and validating output.
Remove old gh-aw-specific docs and assets only after all per-repo detectors are stable.

Review findings

Type	Commit	Notes
commit	`3c0d201`	removed Updater + Creator GH-Actions workflows; kept tools local only
commit	`e47071b`	redesigned Detector for single-PR, single-skill invocation
commit	`27367df`	converted Detector to reusable workflow + added per-repo wrappers

Scaffolds the Node.js/Flue project structure at repo root as the first step of porting the skill-drift agents from GitHub Agentic Workflows to Flue. See .pi/plans/2026-05-12-flue-skill-drift/plan.md for the full plan.

Dependencies (pinned to exact versions):
- @flue/sdk 0.5.3
- @flue/cli 0.5.3
- valibot 1.4.0

Files created:
- package.json: type:module, Node 22 engine, pinned deps
- tsconfig.json: ES2022 target, NodeNext module resolution
- flue.config.ts: defineConfig({ target: 'node' })
- .flue/agents/: placeholder for agent handlers (T02-T04)
- .flue/roles/: placeholder for role markdown (T02-T04)
- .gitignore: added node_modules/ and dist/

Verification:
- npm ci: clean install, 361 packages, no errors
- npx tsc --noEmit: passes (flue.config.ts compiles cleanly)
- npx flue --help: CLI responds correctly

.agents/skills/ auto-discovery check: Flue discovers .agents/skills/ at runtime only when agent code explicitly calls session.skill(). The 30 SDK skills in this repo are NOT auto-injected into agent context; they are only loaded on-demand by name. The disable-model-invocation frontmatter flag is irrelevant to Flue's runtime (Flue does not parse it). No mitigation needed in T02-T04; the skills won't interfere with the skill-drift agents unless explicitly invoked.

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

Ports the SDK Skill Drift Detector agent from the existing gh-aw workflow (.github/workflows/skill-drift-check.md) to the Flue harness.
- .flue/roles/detector.md: carries the full ported prompt verbatim, including the SDK-to-repo-to-team mapping table, Steps 1-5 (gather PRs, filter, compare, decide, return), and the decision rules for create_pr vs create_issue vs skip. Instructs the agent to use the `gh` CLI for GitHub access (no MCP) and to never run git write commands; patches are computed as unified diffs and returned in the `patch` field.
- .flue/agents/skill-drift-detector.ts: thin handler (~50 lines). Accepts an optional `{ since?: string }` payload for overriding the 7-day window. Initialises a local sandbox session with `anthropic/claude-opus-4-6`, delegates to the detector role, and returns Valibot-typed JSON: `{ actions: Action[], summary: string }` where Action is `create_pr | create_issue | skip`. The output schema is the contract for T05 (actuator).

The handler itself does no GitHub writes; it only computes and returns the action list. This runs side-by-side with the existing gh-aw detector; no changes to .github/workflows/skill-drift-check.md or its lock file.

Plan: .pi/plans/2026-05-12-flue-skill-drift/plan.md

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

Mirror of T02 pattern: role carries the full prompt, handler is a thin orchestration shim (52 lines).
- `.flue/roles/updater.md`: ported from `.github/agents/skill-updater.agent.md` with the 8-step drift-fix flow, 5-file knowledge-base loading instruction, targeted-updates guardrail, and verification block (`./scripts/build-skill-tree.sh --check`)
- `.flue/agents/skill-drift-updater.ts`: accepts issue payload, runs with `anthropic/claude-opus-4-6`, returns structured UpdaterOutput metadata only
- Output schema is the contract for T06 (actuator): skill, summary, files_changed, sdk_pr_references, optional skipped
- Knowledge base loaded at runtime via the agent's read tool, not inlined
- No git operations in handler or role; actuator handles commit/push/PR
- Runs side-by-side with the existing Copilot custom-agent (gh-aw path)

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

Mirrors the T02/T03 pattern (detector + updater):
- .flue/agents/skill-creator.ts: workflow_dispatch handler with platform + prompt inputs; validates output with valibot CreatorOutput schema
- .flue/roles/creator.md: 6-phase creator workflow ported from .github/agents/skill-creator.agent.md

Output schema returns metadata only (files_created, files_modified, router_updated, skill, platform, summary, skipped). No git operations in the handler or role; the actuator (T07) handles commit/push/PR.

Key behaviours in the role:
- Existence check first (skips if SDK or skill already exists)
- Loads 5 knowledge-base files at the start of every run
- Requires updating the sentry-sdk-setup router table before validation
- ./scripts/build-skill-tree.sh --check must pass; failure sets skipped
- SDK-to-repo mapping table carried over from the Copilot agent

Parallel run: .github/agents/skill-creator.agent.md stays untouched.

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

Adds .github/workflows/flue-skill-drift-detector.yml, a two-job GitHub Actions workflow that runs the Flue skill-drift-detector agent and applies its output actions (create_pr / create_issue / skip).

Key design points:
- Cron trigger: Monday 22:42 UTC (42 22 * * 1), matching gh-aw cadence, plus workflow_dispatch with optional `since` date input
- Two-job split: `detect` is read-only (contents: read) and runs the agent; `actuate` has write permissions (contents, pull-requests, issues) to apply the results
- Protected-files enforcement: if a proposed patch touches package.json, lockfiles, tsconfig.json, flue.config.ts, AGENTS.md, CLAUDE.md, SKILL_TREE.md, scripts/build-skill-tree.sh, .github/, .agents/, or .flue/, the actuator downgrades the action from PR to issue
- Patch apply uses `git apply --check` first; failures are counted and logged without aborting the run
- Runs side-by-side with existing gh-aw detector (different name + concurrency group: flue-skill-drift-detector)
- Does not touch skill-drift-check.md/.lock.yml or skill-drift-assign-reviewers.yml (which already handles label-based reviewer assignment for any PR with skill-drift label)

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

Adds .github/workflows/flue-skill-drift-updater.yml, the GitHub Actions workflow that processes skill-drift issues opened by the Detector.

Key design points:
- Triggers on issues.labeled/opened with 'skill-drift' label, replacing the gh-aw 'assignees: [copilot]' mechanism; also supports workflow_dispatch
- Two-job split: read-only 'update' job runs the Flue agent and captures a git patch artifact; write-permissioned 'actuate' job applies it and opens a PR
- Patch-based artifact handoff (git diff --cached > changes.patch) between jobs, mirroring T05's detector pattern
- Protected-files gate in actuator blocks commits to lock files, config, scripts, and .github/** (same regex as flue-skill-drift-detector.yml)
- Skill-tree validator: regenerates SKILL_TREE.md then runs --check; bails with an issue comment on real validation failures
- Commit message includes 'Closes #N' for auto-close on PR merge
- Skipped agent results post a comment on the originating issue
- Concurrency keyed on issue number so parallel issues don't race

Runs side-by-side with the existing Copilot custom-agent workflow.

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

workflow_dispatch trigger (manual-only) with platform + prompt inputs.

Two-job split mirroring Detector/Updater:
- `create` job (read-only, 90min timeout): runs the skill-creator Flue agent, captures result.json + changes.patch as artifacts
- `actuate` job (write, 15min): downloads artifacts, applies patch, runs protected-files gate and skill-tree validator, commits and opens PR

Key design decisions:
- Concurrency group keyed on platform to prevent parallel runs for the same platform
- Protected-files violations open an issue instead of silently failing
- Skill-tree validator failure also opens an issue with stderr output
- `skill-drift` label applied so the reviewer-assign workflow fires
- PR title uses feat(<scope>) (no [skill-drift] prefix; creator action)
- Branch sanitizes platform input (lowercase, non-alnum -> dash)

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
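The branch sanitization rule in this commit (lowercase, non-alphanumeric to dash) might look roughly like the sketch below. The function name `sanitize_platform` is hypothetical, and squeezing repeated dashes and trimming the ends are extra choices made for this illustration; the workflow's actual expression is not shown in the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: the real workflow's implementation is not shown in this PR.
sanitize_platform() {
  local raw="$1" slug
  # Lowercase, then turn every run of non-alphanumeric characters into one dash.
  slug=$(printf '%s' "$raw" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-')
  # Trim a leading or trailing dash (an extra step assumed for tidy branch names).
  slug="${slug#-}"
  slug="${slug%-}"
  printf '%s\n' "$slug"
}

sanitize_platform "React Native"     # prints react-native
sanitize_platform "../etc; rm -rf"   # prints etc-rm-rf
```

Because the slug can only contain `a-z`, `0-9`, and `-`, it is safe to embed in a branch name regardless of what was typed into the workflow_dispatch input.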

Three interactive shell scripts for local development testing of the three Flue agents:
- scripts/test-flue-detector.sh: runs skill-drift-detector with a configurable 'since' window
- scripts/test-flue-updater.sh: runs skill-drift-updater from a fixture file or a real GH issue (--issue N)
- scripts/test-flue-creator.sh: runs skill-creator with a platform arg and optional prompt

Each script:
- Checks ANTHROPIC_API_KEY and GH_TOKEN/GITHUB_TOKEN before proceeding
- Prominently warns about API costs (Detector: $0.20-$1.00, Creator: $2-$10)
- Prompts for confirmation before invoking the live model
- Saves output to /tmp/flue-<agent>-result.json and pretty-prints via jq
- Validates the result against the expected output schema (PASS/FAIL)

Also adds scripts/fixtures/flue-updater-issue.json, a realistic but clearly fake drift issue (issue #9999, PR #99999) for offline testing of the Updater.

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
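The credential check and confirmation prompt listed for each script could be sketched as follows. The helper names `preflight` and `confirm`, the messages, and the return codes are assumptions made for this illustration, not the scripts' actual contents.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper: fails when a required credential is missing.
preflight() {
  local missing=""
  [ -n "${ANTHROPIC_API_KEY:-}" ] || missing="$missing ANTHROPIC_API_KEY"
  # Either token variable is accepted.
  [ -n "${GH_TOKEN:-}${GITHUB_TOKEN:-}" ] || missing="$missing GH_TOKEN|GITHUB_TOKEN"
  if [ -n "$missing" ]; then
    echo "Missing required environment:$missing" >&2
    return 1
  fi
}

# Hypothetical confirmation prompt: anything other than y/Y declines.
confirm() {
  local answer
  printf '%s [y/N] ' "$1"
  read -r answer
  case "$answer" in
    y|Y) return 0 ;;
    *)   return 1 ;;
  esac
}
```

A script would typically gate the live model call on both, for example `preflight && confirm "Run the agent against the live model?" || exit 1`.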

Verified that skill-drift-assign-reviewers.yml fires correctly on PRs opened by all Flue agents (Detector, Updater, Creator):
- Trigger: on.pull_request.types[opened] + paths[skills/sentry-*-sdk/**] matches all Flue-opened PRs since they all modify skill files.
- Label filter: all three Flue workflows apply --label "skill-drift" to their PRs (Creator uses it at line 263 of flue-skill-creator.yml).
- SKILL_TEAMS map: covers all 19 current skills in skills/sentry-*-sdk/; 100% match, no gaps for existing platforms.
- No-op path: script logs and exits cleanly when no matching skill dir found (safe for brand-new platforms created by Flue Creator).
- Permissions: pull-requests:write is sufficient for requestReviewers.

No code changes needed. Added a top-of-file comment documenting the source-agnostic behavior and the brand-new-platform no-op case.

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

Adds a short Flue Subproject section describing the file layout (.flue/agents, .flue/roles, package.json, etc.) and how to run the agents locally with npx flue run. Placed before the Skill Tree Navigation section. Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

…cfcf-g5fm) Advisory: GHSA-3q49-cfcf-g5fm (Critical, all versions, no patched version exists) The dep chain @flue/sdk@0.5.3 → @mariozechner/pi-ai@0.73.1 → @mistralai/mistralai@2.2.1 pulls in a malicious package. All versions of @mistralai/mistralai are flagged with no upstream fix available. The package has no install hooks so npm ci itself is safe — the risk is dormant and only triggered if pi-ai's lazy import('./mistral.js') fires (i.e., if a Mistral model is invoked). Our three agents all hardcode anthropic/claude-opus-4-6, so Mistral never loads under current code paths. This mitigation eliminates the latent risk entirely. A postinstall script now physically removes the @mistralai directory from node_modules after every npm install or npm ci. Both the specific package dir and the @mistralai scope dir are removed in case other packages from that scope are pulled in later. Note: npm audit will continue to report this advisory because it reads the lockfile, not disk. This is expected and documents the upstream issue. The fix is a workaround pending upstream Flue/pi-ai dropping the dependency — track at: - https://github.com/mariozechner/pi-ai/issues (drop @mistralai/mistralai dep) - https://github.com/badlogic/flue/issues (upgrade to patched pi-ai once available) Co-Authored-By: claude-sonnet-4-5 <claude-sonnet-4-5@anthropic.com>

All three role files (.flue/roles/detector.md, updater.md, creator.md) contained the literal substring 'MCP' in constraints that prohibit using MCP servers. While the intent was correct, the presence of the substring caused the plan's grep contract to fail. Rewrote each constraint positively, replacing 'Do NOT use any MCP server or external GitHub integration' with 'Use the gh CLI for all GitHub access. Do not connect to external services for GitHub operations.' The semantic meaning is preserved — agents are still instructed to use gh CLI only. The grep contract (grep -ri "mcp" .flue/roles/ returns zero matches) is now satisfied. Addresses P0 #3 from the review at .pi/plans/2026-05-12-flue-skill-drift/review.md. Co-Authored-By: claude-sonnet-4-5 <claude-sonnet-4-5@anthropic.com>

The original schemas marked `skipped` as an optional field while requiring all success fields (skill, summary, files_changed, etc.). This meant Valibot rejected legitimate skip-only responses because the required fields were absent.

Changes:
- Replaced UpdaterOutput flat object with v.union([UpdaterSuccess, UpdaterSkipped]) discriminated by status: 'success' | 'skipped' literals
- Replaced CreatorOutput flat object with v.union([CreatorSuccess, CreatorSkipped]) with the same discriminant pattern
- Updated flue-skill-drift-updater.yml: skip detection now checks .status == 'skipped' and reads .reason; actuate job if: clause branches on .status == 'success'
- Updated flue-skill-creator.yml: removed the skipped step output (multiline hazard); now emits status= only; actuate if: clause uses needs.create.outputs.status == 'success'
- Lightly reworded Output sections in updater.md and creator.md to describe the new discriminated-union shape instead of the optional skipped field

Fixes P0 #2 from the review at .pi/plans/2026-05-12-flue-skill-drift/review.md.

Co-Authored-By: Claude claude-sonnet-4-5 via pi worker agent

The happy-path `gh pr create` call had `--label` flags placed after the heredoc EOF terminator. In bash, anything after the heredoc terminator is parsed as a separate command, so the flags were silently dropped — the PR opened with no labels, which meant the reviewer-assign workflow never fired. Fixed by writing the PR body to a temp file (/tmp/pr-body.md) with a standalone `cat > ... <<EOF` block, then calling `gh pr create` as a normal argument-style command with all flags on the same logical line. Also removed three references to the non-existent `skill-creator` label: - one in the happy-path `gh pr create` call - one in the protected-files violation `gh issue create` call - one in the skill-tree validation failure `gh issue create` call These would have caused `gh` to exit non-zero when the workflows ran. Replaced the two issue-create labels with `skill-drift` (already in use by the other Flue workflows) to keep labelling consistent. Addresses the P1 finding in .pi/plans/2026-05-12-flue-skill-drift/review.md. Co-Authored-By: Claude (claude-sonnet-4-5 via Pi worker agent)

…c edge cases
- Move inputs.* interpolations from bash run: blocks into env: blocks across all three workflows: Creator uses PLATFORM/PROMPT env vars, Updater uses GH_EVENT_NAME/ISSUE_NUMBER_INPUT/ISSUE_NUMBER_EVENT/ISSUE_NUMBER. This is the canonical fix for GitHub Actions script-injection sinks (Warden FGH-435): template substitution now happens at the env layer, not inside bash, so shell metacharacters in user-supplied input are never executed.
- Bump Updater's update job from issues: read to issues: write so the 'Post skip comment if agent skipped' step can call gh issue comment without a 403. The actuate job's issues: write was already correct and is unchanged. Flagged by Seer Bug Prediction and Cursor code review on PR #127.
- Replace fixed EOF heredoc delimiters with random 32-hex delimiters via openssl rand -hex 16 when writing multi-line JSON to $GITHUB_OUTPUT across all three workflows. A bare EOF line in LLM-generated output (e.g. inside a summary field) would otherwise truncate the heredoc early and corrupt fromJSON() parsing in downstream jobs. Flagged by Seer on PR #127.

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
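As a rough illustration of the random-delimiter change, the sketch below writes a multi-line value to `$GITHUB_OUTPUT` using GitHub's `name<<delimiter` syntax. The function name `write_multiline_output` and the `od` fallback for machines without `openssl` are assumptions added here; the commit itself only mentions `openssl rand -hex 16`.

```shell
#!/usr/bin/env bash
# Sketch: emit a multi-line step output with a random 32-hex delimiter so a
# bare "EOF" line inside the value cannot terminate the block early.
write_multiline_output() {
  local name="$1" value="$2" delim
  # Fallback to /dev/urandom is an addition for this sketch only.
  delim=$(openssl rand -hex 16 2>/dev/null || od -An -N16 -tx1 /dev/urandom | tr -d ' \n')
  {
    printf '%s<<%s\n' "$name" "$delim"
    printf '%s\n' "$value"
    printf '%s\n' "$delim"
  } >> "$GITHUB_OUTPUT"
}

# Local demo: runners set GITHUB_OUTPUT; a temp file stands in for it here.
GITHUB_OUTPUT="${GITHUB_OUTPUT:-$(mktemp)}"
write_multiline_output summary "first line
EOF
last line"
cat "$GITHUB_OUTPUT"
```

With a fixed `EOF` delimiter the value above would be cut off after "first line"; with a random delimiter the literal `EOF` line is just data.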

…n BRT-8PC) Per Warden security review BRT-8PC: deny-lists are structurally weak for LLM-emitted patches — any path the agent emits that isn't explicitly listed slips through. The old regex also missed common sensitive paths: .husky/, .npmrc, Dockerfile, .env*, renovate.json, .changeset/, .devcontainer/, top-level *.sh, commitlint.config.*, vitest.config.*, eslint.config.*. New allow-list: only paths matching ^skills/ are accepted; everything else triggers the existing downgrade-to-issue path. This captures the invariant that agents are only supposed to edit skill files — protecting current paths, future paths, and paths nobody thought to enumerate. Also strips leading ./ prefix defensively before the pattern match, eliminating any doubt about ^ anchor bypass via ./prefixed paths. Updated docs/agent-port/04-flue-implementation.md §6 to describe the new allow-list approach and §12.5 noting the source of the change (PR #127 Warden review BRT-8PC). Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
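A minimal sketch of the allow-list gate described in this commit follows. The function name `first_violation` is hypothetical, and the sketch assumes the path list comes from git, which reports normalized repository-relative paths; the workflow's actual code is not reproduced in this PR description.

```shell
#!/usr/bin/env bash
# Sketch of an allow-list gate: every touched path must live under skills/.
# Reads one path per line on stdin; prints the first offending path and
# returns 1, or prints nothing and returns 0 when all paths are allowed.
first_violation() {
  local path
  while IFS= read -r path; do
    [ -z "$path" ] && continue
    path="${path#./}"          # strip a leading ./ before matching
    case "$path" in
      skills/*) ;;             # allowed
      *) printf '%s\n' "$path"; return 1 ;;
    esac
  done
  return 0
}

printf '%s\n' "skills/sentry-go-sdk/SKILL.md" "./package.json" | first_violation
# prints package.json
```

The caller would take the downgrade-to-issue path whenever the function returns non-zero. Unlike a deny-list, nothing has to be enumerated: any path that does not start with `skills/` is rejected by default.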

…g protected files
- Added `sentry-cloudflare-sdk` and `sentry-elixir-sdk` rows to mapping tables in all three role files (Cursor bot flagged 17/19; `SKILL_TEAMS` has 19 entries). Cloudflare lives in the JavaScript monorepo (packages/cloudflare/, packages/core/); Elixir is its own repo (getsentry/sentry-elixir), no path filter.
- Removed Creator role's instructions to run `build-skill-tree.sh` and modify `AGENTS.md`/`SKILL_TREE.md`. The workflow's actuator regenerates `SKILL_TREE.md` after the allowlist check, so the agent's regeneration was both redundant and harmful: any successful Creator run would be downgraded to an issue because its patch touched a protected file outside `skills/`. Updated the output schema example to omit `SKILL_TREE.md` from `files_modified`.
- Replaced the Phase 5 `build-skill-tree.sh --check` verification in creator.md with a safe `grep` sanity-check against the router table. Updater's `--check` reference (read-only mode) is left intact. Detector has no `build-skill-tree` reference.
- Net effect: Creator can now successfully open PRs end-to-end; Detector/Updater can identify drift on Cloudflare and Elixir SDKs.

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

sentry · 2026-05-12T16:13:04Z

+          if ! ./scripts/build-skill-tree.sh --check 2>/tmp/skill-tree-err; then
+            echo "::error::Skill tree validation failed after regeneration"
+            ERR=$(cat /tmp/skill-tree-err)
+            gh issue comment "$ISSUE_NUMBER" \


Bug: The workflow captures stderr from build-skill-tree.sh, but the script writes its errors to stdout, resulting in empty error reports on validation failure.
Severity: MEDIUM

Suggested Fix

Modify the command in the workflow to capture both stdout and stderr. Change ./scripts/build-skill-tree.sh --check 2>/tmp/skill-tree-err to ./scripts/build-skill-tree.sh --check &> /tmp/skill-tree-err. This will redirect both streams to the file, ensuring the $ERR variable contains the actual error messages.

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: .github/workflows/flue-skill-drift-updater.yml#L189-L192 Potential issue: The `build-skill-tree.sh` script writes its validation error messages to `stdout`. However, the calling GitHub workflow only captures `stderr` by using the redirection `2>/tmp/skill-tree-err`. When the script fails due to validation errors, the captured `$ERR` variable is empty. Consequently, the GitHub issue comment created to report the failure contains an empty code block, providing no actionable information for developers to debug why the skill tree validation failed.

Also affects:

.github/workflows/flue-skill-creator.yml:188~191
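
The stdout/stderr mismatch described in this comment can be reproduced with a stand-in validator. `fake_check` below is hypothetical and only mimics a script that reports errors on stdout and exits non-zero; it is not `build-skill-tree.sh`.

```shell
#!/usr/bin/env bash
# Stand-in for a validator that prints its errors to stdout and fails.
fake_check() {
  echo "ERROR: SKILL_TREE.md is out of date"
  return 1
}

err_file=$(mktemp)

# Original pattern: only stderr is captured, so the file stays empty.
# (stdout is discarded here just to keep the demo output clean.)
fake_check 2>"$err_file" >/dev/null
stderr_only=$(cat "$err_file")

# Suggested pattern: capture both streams. `&>file` is bash shorthand for this.
fake_check >"$err_file" 2>&1
both_streams=$(cat "$err_file")

printf 'stderr only:  [%s]\n' "$stderr_only"
printf 'both streams: [%s]\n' "$both_streams"
```

The first capture is empty while the second contains the error text, which is the difference between an empty code block and a useful issue comment.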

sentry · 2026-05-12T16:13:04Z

+
+            # Protected files check
+            local touched
+            touched=$(git diff --name-only)


Bug: The allowlist check uses git diff --name-only, which does not detect new untracked files. This could allow a patch to create and commit files outside the allowed directory.
Severity: MEDIUM

Suggested Fix

Replace git diff --name-only with a command that can detect all changes, including untracked files. Use git status --porcelain and parse its output to get a list of all modified, staged, and untracked files to ensure the allowlist check is comprehensive.

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: .github/workflows/flue-skill-drift-detector.yml#L123 Potential issue: The workflow's security check uses `git diff --name-only` to identify modified files and verify they are within an allowed directory. However, this command does not list new, untracked files. A patch applied via `git apply` can create new files. If a patch creates a new, untracked file outside the allowed `skills/` directory, it will bypass this security check. The subsequent `git add -A` command will then stage and commit this untracked file, potentially introducing unauthorized code.
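
The gap described above can be seen in a throwaway repository. The sketch combines `git diff --name-only HEAD` with `git ls-files --others --exclude-standard` instead of parsing `git status --porcelain`; the helper name `touched_paths` and the file names are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: list every path a patch touched, including brand-new untracked files.
touched_paths() {
  {
    git diff --name-only HEAD                  # tracked changes, staged or not
    git ls-files --others --exclude-standard   # new files git has never seen
  } | sort -u
}

# Demo in a temporary repository.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
mkdir skills
echo "v1" > skills/a.md
git add skills/a.md
git -c user.name=demo -c user.email=demo@example.com commit -q -m "init"

echo "v2" > skills/a.md        # modifies a tracked file
echo "surprise" > evil.sh      # creates an untracked file outside skills/

echo "git diff --name-only sees:"
git diff --name-only
echo "touched_paths sees:"
touched_paths
```

Plain `git diff --name-only` reports only `skills/a.md`, while `touched_paths` also reports `evil.sh`, so an allow-list check fed from the second list would catch the new file before `git add -A` stages it.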

…LI tools Removed the Updater and Skill Creator GitHub Action workflow files because these agents are invoked locally by humans via smoke scripts and manual PR flow. .flue TypeScript handlers and role markdowns remain unchanged and continue to be the authoritative implementations. Invocation now goes through ./scripts/test-flue-updater.sh and ./scripts/test-flue-creator.sh, while CI will only keep detector-driven automation handled separately. This commit only addresses Solo todo #360. Re-architecture of detector workflow behavior is deferred to follow-up todos #361 and #362. Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

- Updated Detector input payload to: { skill_name, sdk_repo, pr_number, pr_url, sdk_repo_path }.
- Removed per-action skill field from DetectorOutput and simplified output contracts to full-run single-skill scope.
- Rewrote detector role for one merged PR flow: removed 19-row mapping table, 7-day date-window logic, and monorepo path-filter framing.
- Kept duplicate-check guidance tied to open skill-drift PRs/issues in getsentry/sentry-for-ai and lowered action caps to 5 create_pr / 5 create_issue.
- Updated flue detector smoke test to accept <skill_name> <sdk_repo> <pr_number> [sdk_repo_path], added --fixture mode, and added new fixture at scripts/fixtures/flue-detector-pr.json.
- This is U02 of the Detector single-PR rearchitecture; U03 (reusable workflow and repo wrappers) lands next.

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

Removed standalone flue-skill-drift-detector.yml (cron-driven, centralized). Added reusable workflow flue-skill-drift-detector-reusable.yml with workflow_call inputs for skill+SDK PR context. Used GitHub App token for cross-repo write operations in getsentry/sentry-for-ai; no GITHUB_TOKEN writes. Added 19 example caller wrappers under docs/agent-port/sdk-repo-wrappers/ and onboarding README. Wrappers use PR closed/merged trigger on target SDK repos with noise filtering. Per-PR flow references PR metadata in generated PR/issue titles and bodies. Reference Solo todo #362. Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

…r inverted architecture Rewrite 04-flue-implementation.md architecture sections for the inverted Flue flow (per-SDK wrapper -> reusable workflow), including diagram, mapping table, file layout, detector schema, local run guidance, cutover plan, risks, open questions, and new SDK repo onboarding section. Update AGENTS.md Flue subproject section to describe the new architecture: reusable detector workflow in this repo, local-only Updater/Creator CLI invocation, and onboarding via 19 wrapper templates under docs/agent-port/sdk-repo-wrappers/. Update PR #127 body via gh pr edit 127 to mirror the inverted architecture and local-first Updater/Creator path. Closes #363. Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

sentry · 2026-05-20T09:12:14Z

+  PAYLOAD=$(jq -c \
+    --arg skill_name "$SKILL_NAME" \
+    --arg sdk_repo "$SDK_REPO" \
+    --argjson pr_number "$PR_NUMBER" \
+    --arg pr_url "https://github.com/${SDK_REPO}/pull/${PR_NUMBER}" \
+    --arg sdk_repo_path "$SDK_REPO_PATH" \
+    '{skill_name:$skill_name,sdk_repo:$sdk_repo,pr_number:$pr_number,pr_url:$pr_url,sdk_repo_path:$sdk_repo_path}')


Bug: The jq command in test-flue-detector.sh is missing the -n flag, causing the script to hang indefinitely when run without the --fixture option.
Severity: HIGH

Suggested Fix

Add the -n flag to the jq command on line 47 to prevent it from reading from standard input. Change jq -c \ to jq -c -n \.

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: scripts/test-flue-detector.sh#L47-L53 Potential issue: The `scripts/test-flue-detector.sh` script will hang indefinitely when invoked in its primary, documented, non-fixture mode. The `jq` command on line 47 is called without the `-n` flag and without any piped input or input file. This causes `jq` to wait for input from stdin, which is never provided within the `$(...)` command substitution. As a result, the script execution stalls and never completes.
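
A small sketch of the suggested fix: with `-n`, `jq` builds the object purely from its `--arg` values and never reads stdin. The wrapper name `build_payload` is hypothetical, the argument values are placeholders, and `jq` must be installed.

```shell
#!/usr/bin/env bash
# Sketch: construct the detector payload with jq -n (null input).
build_payload() {
  jq -c -n \
    --arg skill_name "$1" \
    --arg sdk_repo "$2" \
    --argjson pr_number "$3" \
    '{skill_name: $skill_name,
      sdk_repo: $sdk_repo,
      pr_number: $pr_number,
      pr_url: ("https://github.com/" + $sdk_repo + "/pull/" + ($pr_number | tostring))}'
}

build_payload sentry-go-sdk getsentry/sentry-go 1302
```

Without `-n`, `jq` waits for an input document; in an interactive terminal that looks like a hang, and with empty stdin it typically prints nothing at all, so either way the payload is never produced.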

cursor

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Reviewed by Cursor Bugbot for commit ec41145.

cursor · 2026-05-20T09:13:47Z

+        run: |
+          set -euo pipefail
+
+          RESULT="artifact/result.json"


Actuate job reads artifact from wrong path

High Severity

The download-artifact step (no working-directory) places result.json at $GITHUB_WORKSPACE/artifact/result.json. The "Apply actions" step runs with working-directory: skills-repo and sets RESULT="artifact/result.json", resolving to $GITHUB_WORKSPACE/skills-repo/artifact/result.json — a path that never exists. Under set -euo pipefail, jq will fail immediately and no actions are ever applied.

Additional Locations (1)

.github/workflows/flue-skill-drift-detector-reusable.yml#L143-L148
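
The path mismatch in this finding can be reproduced locally. In the sketch below a temporary directory stands in for `$GITHUB_WORKSPACE`, and the `cd` plays the role of `working-directory: skills-repo`; the JSON content is a placeholder.

```shell
#!/usr/bin/env bash
# Sketch: a relative path resolves against the step's working directory,
# not the workspace root where download-artifact placed the file.
workspace=$(mktemp -d)
mkdir -p "$workspace/artifact" "$workspace/skills-repo"
echo '{"actions":[],"summary":"demo"}' > "$workspace/artifact/result.json"

cd "$workspace/skills-repo" || exit 1

relative="artifact/result.json"              # what the step used
absolute="$workspace/artifact/result.json"   # anchored at the workspace root

if [ -f "$relative" ]; then rel_found=yes; else rel_found=no; fi
if [ -f "$absolute" ]; then abs_found=yes; else abs_found=no; fi
echo "relative: $rel_found, absolute: $abs_found"   # prints relative: no, absolute: yes
```

Anchoring the path at the workspace root makes the step independent of whatever `working-directory` it runs in.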


cursor · 2026-05-20T09:13:47Z

+              git switch main
+              git branch -D "$branch_full"
+              local issue_body
+              issue_body="The Detector proposed a change to a protected path (\`$violation\`) for skill \`$SKILL_NAME\`.\n\nOriginal title: $title\n\nOriginal body:\n\n$body\n\nTouched paths:\n\n\`\`\`\n$touched\n\`\`\`\n\nDetected during merge of [${SDK_REPO}#${PR_NUMBER}](${PR_URL})."


Downgrade issue body contains literal \n instead of newlines

Low Severity

The issue_body variable for the protected-path downgrade uses \n inside regular double quotes. Bash doesn't interpret \n as newlines in "..." strings — they stay literal. Use $'...\n...' quoting or printf (as add_reference_footer already does) to produce actual line breaks in the created GitHub issue.
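
The quoting behaviour in this finding can be checked in isolation. The sketch below compares a double-quoted string containing `\n` with the same text built through `printf`; the variable values are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: "\n" inside double quotes stays as two literal characters, while
# printf's format string turns it into a real newline.
violation=".github/workflows/ci.yml"
title="[skill-drift] demo title"

literal="Protected path: $violation\n\nOriginal title: $title"
real=$(printf 'Protected path: %s\n\nOriginal title: %s' "$violation" "$title")

printf '%s\n' "$literal" | wc -l   # 1 line: the backslash-n pairs are literal
printf '%s\n' "$real" | wc -l      # 3 lines: printf produced real newlines
```

Passing values as `%s` arguments also keeps any backslashes or percent signs inside `$title` or `$body` from being interpreted as format directives.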


cursor · 2026-05-20T09:13:47Z

+      - '**/__tests__/**'
+    paths:
+      - "packages/browser/**"
+      - "packages/core/**"


JavaScript wrappers use mutually exclusive path filters

High Severity

All 8 sentry-javascript-*.yml wrapper templates specify both paths-ignore and paths on the same pull_request event. GitHub Actions rejects this combination — these filters are mutually exclusive per event. Workflows copied from these templates will fail validation and never trigger. Use a single paths list with ! negation patterns instead (e.g., !**/*.md).

Additional Locations (2)

docs/agent-port/sdk-repo-wrappers/sentry-javascript-cloudflare.yml#L6-L17

docs/agent-port/sdk-repo-wrappers/sentry-javascript-nextjs.yml#L6-L19


…ppers, untracked allowlist, prompt-injection surface)
- Bug A: actuator now reads detector output from "${GITHUB_WORKSPACE}/artifact/result.json" to match download-artifact output location under workspace root.
- Bug B: local smoke script adds `jq -c -n` for non-fixture payload construction to avoid stdin blocking.
- Bug C: all sentry-javascript wrappers now use a single `paths:` list with GitHub Actions negation patterns (no `paths-ignore`) to satisfy workflow constraints.
- Bug D: allow-list check now evaluates `git diff --name-only HEAD` plus untracked files from `git ls-files --others --exclude-standard` to prevent missing staged new files.
- Bug E: dropped the `sdk_repo_path` input/payload path and removed local SDK checkout from the detector flow, reducing PR-controlled-path prompt-injection surface.

Co-Authored-By: Claude (claude-opus-4-6 via Pi)

Adds on: workflow_dispatch alongside workflow_call with the same detector inputs for pre-production manual runs.
Moves app-token creation out of detect so manual dispatch can run with only ANTHROPIC_API_KEY.
Skips actuate on workflow_dispatch (only detect runs + result artifact), and adds visible result summarization for manual inspection.
Includes no behavior change for production workflow_call path, which still performs actuator-based PR/issue creation.
Reference: get started with pilot in getsentry/sentry-go#1308; this test hook is for manual validation before App secrets are fully in place.

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)

The Flue CLI emits build progress messages ('[flue] Building:', '[flue] Source root:', etc.) to stdout BEFORE the agent's JSON result, not to stderr as the docs imply. The smoke script and the reusable workflow both captured raw stdout to result.json and then ran jq on it, which failed with 'parse error: Invalid literal at line 1, column 6'.

Fix: capture the raw output to flue-output.log, then extract the trailing JSON object (everything from the first line starting with '{') into result.json via 'sed -n /^{/,$p'. The raw log is now uploaded as a separate artifact alongside result.json so we can debug build issues post-run.

Verified locally: smoke run against getsentry/sentry-go#1302 now parses cleanly. The agent correctly emitted a single 'skip' action, recognising that the removed ContextifyFrames integration was never user-facing and so doesn't create drift in the sentry-go-sdk skill.

Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
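The extraction step can be reproduced in isolation with a minimal sketch. The log contents below are illustrative: the progress lines mimic the '[flue]' prefixes quoted above, and the JSON shape is an assumption, not the detector's actual schema.

```shell
#!/bin/sh
set -eu

workdir="$(mktemp -d)"

# Simulated CLI stdout: progress lines first, JSON result last (illustrative content).
cat > "$workdir/flue-output.log" <<'EOF'
[flue] Building: example
[flue] Source root: /tmp/example
{
  "actions": [
    { "type": "skip" }
  ]
}
EOF

# Print everything from the first line that starts with '{' through end of file.
sed -n '/^{/,$p' "$workdir/flue-output.log" > "$workdir/result.json"

cat "$workdir/result.json"
```

The address range `/^{/,$` starts at the first line beginning with `{` and never closes before end of input, so this approach assumes that no progress line starts with `{` and that nothing is printed after the JSON.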

sentry-warden · 2026-05-21T13:45:05Z

GHSA-3q49-cfcf-g5fm

+    "node": ">=22"
+  },
+  "scripts": {
+    "postinstall": "rm -rf node_modules/@mistralai/mistralai && rm -rf node_modules/@mistralai && echo 'Removed @mistralai/mistralai per GHSA-3q49-cfcf-g5fm (malware advisory)'"


Known-malware package executes before postinstall cleanup in CI, exposing GITHUB_TOKEN

The postinstall hook removing @mistralai/mistralai (GHSA-3q49-cfcf-g5fm) runs only after npm has already installed all packages and executed their own lifecycle scripts, so any malicious install hooks in that package fire first — during npm ci in the reusable workflow where GITHUB_TOKEN is available in the runner environment.

Evidence

package-lock.json confirms @mistralai/mistralai@2.2.1 as a resolved transitive dependency via node_modules/@mariozechner/pi-ai (line 1378 lists "@mistralai/mistralai": "^2.2.0" under its deps; package-lock line 1394 resolves it to 2.2.1).

npm's lifecycle execution order guarantees all dependency install scripts run before the root package's postinstall, so the malware can execute before rm -rf node_modules/@mistralai/mistralai removes it.

The reusable workflow flue-skill-drift-detector-reusable.yml runs npm ci (Install Flue step, detect job) without suppressing install scripts; GITHUB_TOKEN is available as an environment variable to all steps in that job, including during package installation.

The PR description explicitly acknowledges the mitigation is incomplete: "This PR still depends on the @mistralai/mistralai supply-chain mitigation work; keep the PR in Draft until the advisory requirement is fully satisfied."

The postinstall script does nothing to prevent execution of @mistralai/mistralai's own lifecycle hooks and provides a false sense of mitigation.
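One way to act on this finding is to stop relying on cleanup after the fact: check the lockfile before installing, and install with npm's `--ignore-scripts` option so that dependency lifecycle scripts do not run. The sketch below only demonstrates the lockfile check, against an illustrative lockfile fragment rather than the one from this PR; the variable names are hypothetical.

```shell
#!/bin/sh
set -eu

workdir="$(mktemp -d)"

# Illustrative lockfile fragment; a real check would read the project's package-lock.json.
cat > "$workdir/package-lock.json" <<'EOF'
{
  "packages": {
    "node_modules/@mariozechner/pi-ai": {
      "dependencies": { "@mistralai/mistralai": "^2.2.0" }
    },
    "node_modules/@mistralai/mistralai": { "version": "2.2.1" }
  }
}
EOF

# Match the package key, including nested node_modules paths that end the same way.
blocked='node_modules/@mistralai/mistralai"'
if grep -Fq "$blocked" "$workdir/package-lock.json"; then
  lock_status="blocked"
  echo "lockfile resolves @mistralai/mistralai; do not run install scripts" >&2
else
  lock_status="clean"
fi
echo "$lock_status"

# In CI, the install step would then be run without lifecycle scripts (not executed here):
#   npm ci --ignore-scripts
```

Note that `--ignore-scripts` also skips the root package's own `postinstall`, so any removal of the package would have to become an explicit workflow step instead of a lifecycle hook.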

Identified by Warden security-review · 238-9D5

HazAT added 9 commits May 12, 2026 11:19

sentry-warden Bot reviewed May 12, 2026

Comment thread .github/workflows/flue-skill-creator.yml Outdated

Comment thread .github/workflows/flue-skill-drift-detector.yml Outdated

HazAT and others added 5 commits May 12, 2026 17:36

HazAT force-pushed the flue/skill-drift-port branch from a009515 to 4ec438c Compare May 12, 2026 15:36

HazAT marked this pull request as ready for review May 12, 2026 15:45

sentry Bot reviewed May 12, 2026

Comment thread .github/workflows/flue-skill-drift-updater.yml Outdated

Comment thread .github/workflows/flue-skill-drift-detector.yml Outdated

cursor Bot reviewed May 12, 2026

Comment thread .github/workflows/flue-skill-drift-updater.yml Outdated

Comment thread .flue/roles/creator.md Outdated

Comment thread .flue/roles/detector.md Outdated

HazAT added 3 commits May 12, 2026 18:03

sentry Bot reviewed May 12, 2026

cursor Bot reviewed May 12, 2026

Comment thread .github/workflows/flue-skill-creator.yml Outdated

HazAT added 4 commits May 20, 2026 10:59

HazAT marked this pull request as draft May 20, 2026 09:09

sentry Bot reviewed May 20, 2026

cursor Bot reviewed May 20, 2026

sentry-warden Bot reviewed May 20, 2026

Comment thread .flue/agents/skill-drift-detector.ts Outdated

HazAT mentioned this pull request May 21, 2026

ci: pilot Sentry skill-drift detector getsentry/sentry-go#1308

Draft

HazAT added 3 commits May 21, 2026 15:24

sentry-warden Bot reviewed May 21, 2026

cursor Bot May 20, 2026

Actuate job reads artifact from wrong path

cursor Bot May 20, 2026

Downgrade issue body contains literal \n instead of newlines

cursor Bot May 20, 2026

JavaScript wrappers use mutually exclusive path filters