# Investigate Test Failure
#
# Triggers when a collaborator, member, or owner of the repository
# comments `/investigate` on a test failure issue (comments on pull
# requests are ignored). Invokes Claude to autonomously analyze the
# failure and post findings as a comment.
#
# Manual testing via workflow_dispatch:
#
# Changes to this workflow (especially to permissions, allowed tools,
# or the agent prompt) should be reviewed by SecEng before testing or
# merging, as public-facing AI workflows require sign-off.
#
# Use --ref to point at a branch containing the workflow file:
#
#     gh workflow run investigate.yml \
#       --repo cockroachdb/cockroach \
#       --ref your-branch-name \
#       -f issue_number=163542
#
# When triggered via dispatch, findings are uploaded as a workflow
# artifact (visible in the run's "Artifacts" section) but not posted
# as a comment. The artifact is uploaded regardless of trigger type.
#
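# To retrieve that artifact from a finished run, something like the
# following should work (a sketch: <run-id> is a placeholder for the
# run's numeric ID, and the artifact name is taken from the
# "Upload findings" step of this workflow):
#
#     gh run download <run-id> \
#       --repo cockroachdb/cockroach \
#       --name investigation-findings
#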
# To test on a personal fork (where Vertex AI OIDC is unavailable):
#
# 1. Add an ANTHROPIC_API_KEY repository secret to the fork. The
#    workflow detects this and uses the API key directly instead of
#    Vertex.
#
# 2. Copy the test failure issue to your fork (the agent reads the
#    issue by number from the workflow's own repo):
#
#     BODY=$(gh issue view 163542 --repo cockroachdb/cockroach --json body -q .body)
#     gh issue create --repo <you>/cockroach --title "..." --body "$BODY"
#
# 3. The checkout is a blobless clone with full history, so git log
#    and git blame work without deepening. The failure SHA must still
#    be reachable from the fork's remote. Push it to a throwaway
#    branch if needed:
#
#     git push <your-fork-remote> <failure-sha>:refs/heads/investigate-sha
#
# 4. Trigger the workflow. Dispatch defaults to a cheaper model
#    (Sonnet 4.5); add -f cheap=false for Opus 4.6:
#
#     gh workflow run investigate.yml \
#       --repo <you>/cockroach \
#       --ref agent-workflow-investigate \
#       -f issue_number=<fork-issue-number>
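#
# To check the agent's tool permissions without running a real
# investigation, the smoke_test input can be set on dispatch. This is a
# sketch that assumes the same fork setup as above; <your-branch-name>
# is a placeholder:
#
#     gh workflow run investigate.yml \
#       --repo <you>/cockroach \
#       --ref <your-branch-name> \
#       -f issue_number=<fork-issue-number> \
#       -f smoke_test=true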
name: Investigate Test Failure
on:
  issue_comment:
    types: [created]
  workflow_dispatch:
    inputs:
      issue_number:
        description: 'Issue number to investigate'
        required: true
      comment_body:
        description: 'Simulated trigger comment'
        default: '/investigate'
      cheap:
        description: 'Use a cheaper model (claude-sonnet-4-5)'
        type: boolean
        default: true
      smoke_test:
        description: 'Run a tool smoke test instead of a real investigation'
        type: boolean
        default: false
jobs:
  investigate:
    if: >-
      github.event_name == 'workflow_dispatch' ||
      (github.event.issue.pull_request == null &&
       (github.event.comment.body == '/investigate' ||
        startsWith(github.event.comment.body, '/investigate ')) &&
       (github.event.comment.author_association == 'COLLABORATOR' ||
        github.event.comment.author_association == 'MEMBER' ||
        github.event.comment.author_association == 'OWNER'))
    runs-on: ubuntu-latest
    timeout-minutes: 60
    permissions:
      contents: read
      issues: write
      id-token: write
    env:
      ISSUE_NUMBER: ${{ inputs.issue_number || github.event.issue.number }}
      COMMENT_BODY: ${{ inputs.comment_body || github.event.comment.body }}
      HAS_API_KEY: ${{ secrets.ANTHROPIC_API_KEY != '' }}
      # Repository to check out the code from. Issues and comments use github.repository.
      CODE_REPO: ${{ secrets.CODE_REPO }}
    steps:
      - name: Acknowledge trigger
        if: github.event_name == 'issue_comment'
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          gh api repos/${{ github.repository }}/issues/comments/${{ github.event.comment.id }}/reactions \
            -f content=eyes
      # Blobless clone: fetches the full commit graph (so git log,
      # git blame, etc. work immediately) but defers downloading file
      # contents until they're actually accessed. Much faster than a
      # full clone of the cockroach repo while still giving the agent
      # full history without manual deepening.
      - name: Checkout repository
        uses: actions/checkout@v5
        with:
          repository: ${{ env.HAS_API_KEY == 'true' && github.repository || env.CODE_REPO }}
          token: ${{ env.HAS_API_KEY == 'true' && secrets.GITHUB_TOKEN || secrets.INVESTIGATE_PAT }}
          filter: blob:none
          fetch-depth: 0
      - name: Create helper scripts
        run: |
          cat > /usr/local/bin/fetch-url <<'WRAPPER'
          #!/bin/bash
          set -euo pipefail
          url="${1:?Usage: fetch-url URL [OUTPUT_FILE]}"
          if [ -n "${2:-}" ]; then
            exec curl -fsSL -o "$2" "$url"
          else
            exec curl -fsSL "$url"
          fi
          WRAPPER
          chmod +x /usr/local/bin/fetch-url
          # checkout-sha SHA — check out a commit in the blobless clone.
          #
          # A plain `git checkout <sha>` triggers a partial-clone lazy fetch
          # of the commit's blobs (`want <blob-oid>`). GitHub's on-demand
          # object serving rejects that with "not our ref" for commits not
          # reachable from the checked-out repo's default branch (e.g.
          # release-branch SHAs), which is most failure SHAs we investigate.
          # To avoid that path, fetch the commit's complete snapshot in an
          # isolated repo — there the request is `want <commit-sha>` with no
          # haves, which GitHub serves reliably for any reachable commit —
          # import those objects into this repo's object store, then check
          # out locally with no further network access.
          cat > /usr/local/bin/checkout-sha <<'WRAPPER'
          #!/bin/bash
          set -euo pipefail
          sha="${1:?Usage: checkout-sha SHA}"
          git_dir=$(git rev-parse --absolute-git-dir)
          url=$(git config --get remote.origin.url)
          auth=$(git config --get http.https://github.com/.extraheader || true)
          tmp=$(mktemp -d)
          trap 'rm -rf "$tmp"' EXIT
          git -C "$tmp" init -q
          if [ -n "$auth" ]; then
            git -C "$tmp" config http.https://github.com/.extraheader "$auth"
          fi
          # fetch.unpackLimit=1 keeps the result as a packfile (instead of
          # loose objects) regardless of object count, so the copy below has
          # a stable source.
          git -C "$tmp" -c fetch.unpackLimit=1 fetch -q --depth=1 "$url" "$sha"
          cp "$tmp"/.git/objects/pack/pack-*.pack \
             "$tmp"/.git/objects/pack/pack-*.idx \
             "$git_dir/objects/pack/"
          git -c advice.detachedHead=false checkout --force "$sha"
          WRAPPER
          chmod +x /usr/local/bin/checkout-sha
      # Vertex AI auth for cockroachdb/cockroach. Skipped when an
      # ANTHROPIC_API_KEY secret is set (e.g. on a personal fork).
      - name: Authenticate to Google Cloud
        if: env.HAS_API_KEY != 'true'
        uses: 'google-github-actions/auth@7c6bc770dae815cd3e89ee6cdf493a5fab2cc093' # v3
        with:
          project_id: 'vertex-model-runners'
          service_account: 'ai-review@dev-inf-prod.iam.gserviceaccount.com'
          workload_identity_provider: 'projects/72497726731/locations/global/workloadIdentityPools/ai-review/providers/ai-review'
      - name: Retrieve EngFlow certificates
        if: env.HAS_API_KEY != 'true'
        id: engflow-certs
        run: |
          CERT_DIR=$(mktemp -d)
          if gcloud secrets versions access 2 --secret=engflow-mesolite-key --project=crl-github-actions > "$CERT_DIR/engflow.key" 2>/dev/null &&
             gcloud secrets versions access 2 --secret=engflow-mesolite-crt --project=crl-github-actions > "$CERT_DIR/engflow.crt" 2>/dev/null; then
            chmod 600 "$CERT_DIR/engflow.key" "$CERT_DIR/engflow.crt"
            echo "ENGFLOW_CERT_FILE=$CERT_DIR/engflow.crt" >> "$GITHUB_ENV"
            echo "ENGFLOW_KEY_FILE=$CERT_DIR/engflow.key" >> "$GITHUB_ENV"
            echo "has_engflow=true" >> "$GITHUB_OUTPUT"
          else
            echo "::warning::Could not retrieve EngFlow certificates — EngFlow artifact access will be unavailable"
            echo "has_engflow=false" >> "$GITHUB_OUTPUT"
            rm -rf "$CERT_DIR"
          fi
      - name: Investigate
        uses: cockroachdb/claude-code-action@v1
        env:
          ANTHROPIC_VERTEX_PROJECT_ID: ${{ env.HAS_API_KEY != 'true' && 'vertex-model-runners' || '' }}
          CLOUD_ML_REGION: ${{ env.HAS_API_KEY != 'true' && 'global' || '' }}
          # The checkout is a different repo than this one; point the
          # agent's gh commands at this repo's issue tracker.
          GH_REPO: ${{ github.repository }}
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          github_token: ${{ secrets.GITHUB_TOKEN }}
          use_vertex: ${{ env.HAS_API_KEY != 'true' && 'true' || 'false' }}
          # Permissions are passed via --allowedTools using the colon
          # format (Bash(cmd:args)) because cockroachdb/claude-code-action@v1
          # (Claude Code 2.0.1) ignores permissions set via the `settings`
          # input — tools end up denied even though settings.json is written
          # correctly. The newer space format (Bash(cmd args)) and settings-
          # based permissions may work after upgrading the action.
          claude_args: |
            --model ${{ inputs.cheap == true && 'claude-sonnet-4-5' || 'claude-opus-4-6' }}
            --allowedTools "Write,Read,Grep,Glob,WebFetch,Bash(cat:*),Bash(head:*),Bash(tail:*),Bash(grep:*),Bash(rg:*),Bash(awk:*),Bash(cut:*),Bash(tr:*),Bash(sort:*),Bash(uniq:*),Bash(wc:*),Bash(tee:*),Bash(diff:*),Bash(file:*),Bash(strings:*),Bash(jq:*),Bash(ls:*),Bash(find:*),Bash(tree:*),Bash(stat:*),Bash(du:*),Bash(mkdir:*),Bash(git:*),Bash(checkout-sha:*),Bash(gh issue view:*),Bash(gh issue list:*),Bash(gh pr view:*),Bash(gh pr list:*),Bash(gh pr diff:*),Bash(gh search:*),Bash(fetch-url:*),Bash(unzip:*),Bash(tar x*),Bash(tar -x*),Bash(tar --extract:*),Bash(go mod download:*),Bash(go env:*),Bash(.claude/skills/engflow-artifacts/run.sh:*),Bash(go tool pprof:*),Bash(go run ./pkg/cmd/tsdump2duck:*),Bash(duckdb:*)"
          prompt: |
            Read and follow the instructions in the prompt file
            `.github/prompts/${{ inputs.smoke_test == true && 'investigate-smoke' || 'investigate' }}.md`.
            ISSUE_REPO: ${{ github.repository }}
            CODE_REPO: ${{ env.HAS_API_KEY == 'true' && github.repository || env.CODE_REPO }}
            ISSUE NUMBER: ${{ env.ISSUE_NUMBER }}
            TRIGGER COMMENT: ${{ env.COMMENT_BODY }}
            WORKFLOW RUN: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
            Use ISSUE_REPO for all gh issue/pr/search commands. Use
            CODE_REPO when building source links (blob/permalink URLs).
      - name: Upload findings
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: investigation-findings
          path: artifacts/findings.md
          if-no-files-found: ignore
      - name: Post findings
        if: always() && github.event_name == 'issue_comment'
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          if [ -s artifacts/findings.md ]; then
            gh issue comment "$ISSUE_NUMBER" \
              --repo ${{ github.repository }} \
              --body-file artifacts/findings.md
          else
            gh issue comment "$ISSUE_NUMBER" \
              --repo ${{ github.repository }} \
              --body "Investigation did not produce findings. Check the [workflow run](${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}) for details."
          fi
      - name: Clean up EngFlow certificates
        if: always() && steps.engflow-certs.outputs.has_engflow == 'true'
        run: |
          rm -f "$ENGFLOW_CERT_FILE" "$ENGFLOW_KEY_FILE"
          rmdir "$(dirname "$ENGFLOW_CERT_FILE")" 2>/dev/null || true