Add autosolve actions for automated issue resolution#14
Add autosolve actions for automated issue resolution#14
Conversation
324a6db to
0655ce2
Compare
9134765 to
817680c
Compare
0655ce2 to
6f1121d
Compare
817680c to
0a678c6
Compare
6f1121d to
5c7a16f
Compare
There was a problem hiding this comment.
Pull request overview
This PR introduces a new autosolve Go-based automation tool and two composite GitHub Actions (autosolve/assess and autosolve/implement) to assess issue suitability and implement fixes with Claude, including PR creation, security checks, and usage tracking.
Changes:
- Add Go implementation for assessment/implementation orchestration, prompt assembly, git/gh wrappers, and security checks.
- Add composite actions (
autosolve/assess,autosolve/implement) plus CI updates to run Go tests and validate the precompiled binary. - Add prompt templates and unit tests for core functionality.
Reviewed changes
Copilot reviewed 28 out of 30 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| autosolve/internal/security/security_test.go | Adds unit tests for blocked-path and sensitive-file enforcement and .gitignore warnings. |
| autosolve/internal/security/security.go | Implements blocked path checks, sensitive filename/extension detection, and symlink-to-blocked-path detection. |
| autosolve/internal/prompt/templates/security-preamble.md | Adds system preamble intended to constrain the model’s behavior for safety. |
| autosolve/internal/prompt/templates/implementation-footer.md | Adds implementation-phase instruction footer and required success/fail marker. |
| autosolve/internal/prompt/templates/assessment-footer.md | Adds assessment-phase instruction footer and required proceed/skip marker. |
| autosolve/internal/prompt/prompt_test.go | Adds tests for prompt construction, skill file inclusion, and custom criteria. |
| autosolve/internal/prompt/prompt.go | Implements prompt assembly from templates + task inputs. |
| autosolve/internal/implement/implement_test.go | Adds tests for retry logic, output writing, and summary extraction. |
| autosolve/internal/implement/implement.go | Implements the implementation phase: retries, security checks, staging/commit/push, PR creation, and AI security review. |
| autosolve/internal/github/github.go | Adds a gh-CLI-backed GitHub client for comments/labels/PR creation. |
| autosolve/internal/git/git.go | Adds a git CLI abstraction and helper to list changed files. |
| autosolve/internal/config/config_test.go | Adds tests for config parsing/validation and blocked path parsing. |
| autosolve/internal/config/config.go | Adds config loading/validation from action inputs and auth validation. |
| autosolve/internal/claude/claude_test.go | Adds tests for extracting markers/session IDs and usage tracking. |
| autosolve/internal/claude/claude.go | Adds Claude CLI runner + result parsing + usage tracking persistence. |
| autosolve/internal/assess/assess_test.go | Adds tests for assessment flow and summary extraction. |
| autosolve/internal/assess/assess.go | Implements assessment phase invocation and outputs/summary writing. |
| autosolve/internal/action/action_test.go | Adds tests for GitHub Actions output and step summary helpers. |
| autosolve/internal/action/action.go | Adds helpers for outputs, summaries, and workflow annotations. |
| autosolve/implement/action.yml | Defines the composite action to run autosolve implement and expose outputs. |
| autosolve/go.mod | Introduces the autosolve Go module definition. |
| autosolve/cmd/autosolve/main.go | Adds CLI entrypoint for assess and implement commands. |
| autosolve/build.sh | Adds cross-compile script producing the committed Linux binary. |
| autosolve/assess/action.yml | Defines the composite action to run autosolve assess and expose outputs. |
| autosolve/Makefile | Adds build/test/clean targets for local development and CI. |
| autosolve/.gitignore | Ignores the local dev binary output. |
| CHANGELOG.md | Documents the addition of the autosolve actions. |
| .github/workflows/test.yml | Updates CI to run Go tests and ensure the precompiled binary is up to date. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
5c7a16f to
a9a9010
Compare
1abbbb0 to
6fd24ba
Compare
f818651 to
6bc6bc5
Compare
|
One thing I'm running into here is that build the action and committing it each time easily gets out of date and is annoying. I'm going to look into alternatives. |
d06e466 to
f2ef7a1
Compare
There was a problem hiding this comment.
Left a couple of comments. Most of them are smaller/questions.
Also, here's some feedback I didn't know where to put:
- In the PR description I see
Precompiled Go binary (no Go toolchain needed at runtime)and one of the bottom checkboxes also mentions precompiled go binary. I'm assuming this just hasn't been updated right? I see that we actually recompile the binary every time this action is run - We should add some documentation in the README
981e842 to
a89fb9b
Compare
- Instruct Claude to never include secret values in responses — describe findings by file and line number instead. - Skip logging security review output since it may reference secrets found in the diff. - Log Claude output in collapsible ::group:: blocks in the step log, gated by a verbose_logging input (default false). Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
JSON output in ::group:: log sections is now indented for readability in the GitHub Actions log viewer. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
The assess step only permits Read,Grep,Glob — no Bash — so Claude could not read context_vars from the environment. Add a scoped Bash(printenv VAR) permission for each declared context var and update the prompt to tell Claude to use `printenv`. fixup bdb5ba4 Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
20e41e0 to
9b1b16c
Compare
git symbolic-ref refs/remotes/origin/HEAD prints a fatal error when origin/HEAD is not configured (common with actions/checkout persist-credentials: false). Rather than suppress the error, default pr_base_branch to "main" in the action input and remove the auto-detection code entirely. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
The workspace root is the actions repo checkout, not the target repo, so setup-go can't find go.mod for cache hashing. Caching isn't needed since the autosolve binary is a one-off build. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Previously the implement binary exited 0 even on failure (all retries exhausted, security check failed, PR creation failed), making the step appear green. Now it writes outputs first so subsequent workflow steps can still read them, then returns an error so the step is correctly marked as failed. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
9b1b16c to
555a3c5
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 27 out of 28 changed files in this pull request and generated 14 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func ParseBlockedPaths(raw string) []string { | ||
| paths := make(map[string]bool) | ||
| for _, p := range requiredBlockedPaths { | ||
| paths[p] = true | ||
| } | ||
| for _, p := range strings.Split(raw, ",") { | ||
| p = strings.TrimSpace(p) | ||
| if p != "" { | ||
| paths[p] = true | ||
| } | ||
| } | ||
| var result []string | ||
| // Required paths first for consistent ordering | ||
| for _, p := range requiredBlockedPaths { | ||
| result = append(result, p) | ||
| } | ||
| for p := range paths { | ||
| if !contains(requiredBlockedPaths, p) { | ||
| result = append(result, p) | ||
| } | ||
| } | ||
| return result |
There was a problem hiding this comment.
ParseBlockedPaths builds result by iterating over a map (for p := range paths), which makes the ordering of non-required blocked paths nondeterministic. This can lead to unstable prompts/logs and can make tests flaky (current tests assert a specific order). Consider collecting the non-required paths into a slice and sorting it before appending.
| // LogResult records usage for a Claude invocation and logs token counts. | ||
| // When verbose is true, the full output is written to a collapsible group | ||
| // in the step log. Call immediately after runner.Run and before checking | ||
| // the error so that usage is captured even on failure. | ||
| func LogResult( | ||
| tracker *claude.UsageTracker, result *claude.Result, section, outputFile string, verbose bool, | ||
| ) { | ||
| tracker.Record(section, result.Usage) | ||
| LogInfo(fmt.Sprintf("%s usage: input=%d output=%d cost=$%.4f", | ||
| section, result.Usage.InputTokens, result.Usage.OutputTokens, result.Usage.CostUSD)) | ||
| if verbose { | ||
| logOutputGroup(section, outputFile) | ||
| } | ||
| } |
There was a problem hiding this comment.
LogResult assumes result is non-nil and dereferences result.Usage. Callers intentionally invoke this before checking the error from runner.Run, so a Runner implementation that returns (nil, err) would panic here. Consider handling result == nil defensively (skip usage logging or log a warning).
| // Build allowed tools: read-only plus printenv for each context var | ||
| // so Claude can read them without full Bash access. | ||
| tools := []string{"Read", "Grep", "Glob"} | ||
| for _, v := range cfg.ContextVars { | ||
| tools = append(tools, fmt.Sprintf("Bash(printenv %s)", v)) | ||
| } | ||
| allowedTools := strings.Join(tools, ",") |
There was a problem hiding this comment.
Context var names from cfg.ContextVars are interpolated directly into the --allowedTools string as Bash(printenv %s) without validation. A malicious value like FOO),Write,Edit could expand Claude tool permissions beyond the intended read-only set. Consider validating context var names against a strict env-var regex (e.g., ^[A-Z_][A-Z0-9_]*$) and rejecting/ignoring invalid entries.
| // Append printenv permissions for each context var so Claude can | ||
| // read them without unrestricted Bash access. | ||
| var extraTools []string | ||
| for _, v := range cfg.ContextVars { | ||
| extraTools = append(extraTools, fmt.Sprintf("Bash(printenv %s)", v)) | ||
| } | ||
| allowedTools := cfg.AllowedTools | ||
| if len(extraTools) > 0 { | ||
| allowedTools += "," + strings.Join(extraTools, ",") | ||
| } |
There was a problem hiding this comment.
Context var names from cfg.ContextVars are interpolated directly into the --allowedTools string as Bash(printenv %s) without validation. A malicious value like FOO),Write,Edit could expand Claude tool permissions beyond what the action intends. Consider validating context var names against a strict env-var regex and rejecting/ignoring invalid entries.
| // CheckGitignore logs a warning if the repo's .gitignore does not contain | ||
| // credential exclusion patterns. It does not modify the file — repo owners | ||
| // should add the patterns themselves for defense-in-depth. | ||
| func CheckGitignore(logWarning func(string)) { | ||
| data, err := os.ReadFile(".gitignore") | ||
| if err != nil { | ||
| logWarning("No .gitignore found. For defense-in-depth, add one with credential exclusion patterns: " + | ||
| strings.Join(gitignorePatterns, ", ")) | ||
| return | ||
| } |
There was a problem hiding this comment.
CheckGitignore reads .gitignore from the current working directory. Since the action supports a configurable working_directory, this can warn incorrectly (or miss the repo’s actual .gitignore) when invoked from a subdirectory. Consider resolving the repo root (e.g., git rev-parse --show-toplevel) and reading <repoRoot>/.gitignore.
| @@ -0,0 +1,3 @@ | |||
| module github.com/cockroachdb/actions/autosolve | |||
|
|
|||
| go 1.23.8 | |||
There was a problem hiding this comment.
The go directive typically uses major.minor (e.g., go 1.23) to indicate the language version. Using a patch version here (1.23.8) may be rejected by some Go toolchains and isn’t the usual way to pin a specific toolchain; if you need to pin, consider using a toolchain go1.23.8 directive instead.
| go 1.23.8 | |
| go 1.23 |
| - name: Set up Go | ||
| uses: actions/setup-go@v6 | ||
| with: | ||
| go-version-file: ${{ github.action_path }}/../go.mod | ||
| cache: false | ||
|
|
||
| - name: Build autosolve | ||
| shell: bash | ||
| run: go build -trimpath -o "$RUNNER_TEMP/autosolve" ./cmd/autosolve | ||
| working-directory: ${{ github.action_path }}/.. |
There was a problem hiding this comment.
PR description states a precompiled Go binary means no Go toolchain is needed at runtime, but this composite action always sets up Go and builds from source. Either add the same precompiled-binary fast-path used in autosolve/implement (skip Go setup/build when $RUNNER_TEMP/autosolve already exists) or adjust the documentation/description to match actual behavior.
| ForkRepo: os.Getenv("INPUT_FORK_REPO"), | ||
| ForkPushToken: os.Getenv("INPUT_FORK_PUSH_TOKEN"), | ||
| PRCreateToken: os.Getenv("INPUT_PR_CREATE_TOKEN"), | ||
| PRBaseBranch: os.Getenv("INPUT_PR_BASE_BRANCH"), | ||
| PRLabels: envOrDefault("INPUT_PR_LABELS", "autosolve"), | ||
| PRDraft: prDraft, | ||
| PullRequestTitle: os.Getenv("INPUT_PR_TITLE"), |
There was a problem hiding this comment.
PRBaseBranch is read directly from INPUT_PR_BASE_BRANCH without a default, unlike most other PR-related inputs. If the binary is run outside the composite action (or the env var is missing), PR creation will attempt to use an empty base branch and fail in a less obvious way. Consider defaulting this to main (or the repo’s default branch if discoverable) and/or validating it is non-empty when CreatePR is true.
| if cfg.Skill != "" { | ||
| content, err := os.ReadFile(cfg.Skill) | ||
| if err != nil { | ||
| return "", fmt.Errorf("reading skill file %s: %w", cfg.Skill, err) | ||
| } | ||
| b.Write(content) | ||
| b.WriteString("\n") | ||
| } |
There was a problem hiding this comment.
inputs.skill is documented as a path relative to the repo root, but Build reads it via os.ReadFile(cfg.Skill) relative to the current working directory. If the action runs with working_directory set to a subdir, this will fail to find the skill file (or read the wrong file). Consider resolving the git repo root and joining it with cfg.Skill (or documenting that it’s relative to working_directory).
linhcrl
left a comment
There was a problem hiding this comment.
Mostly small comments this time.
Also
- In the PR description I still see Precompiled Go binary (no Go toolchain needed at runtime) and one of the bottom checkboxes also mentions precompiled go binary.
- We should add some documentation in the README
| if command -v roachdev >/dev/null; then | ||
| printf '#!/bin/sh\nexec roachdev claude -- "$@"\n' > /usr/local/bin/claude | ||
| chmod +x /usr/local/bin/claude | ||
| echo "Claude CLI: using roachdev wrapper" |
There was a problem hiding this comment.
nit: would be nice to log the version used similar to the basic claude equivalent below. (Same for implement/action.yml)
| return result, nil | ||
| } | ||
|
|
||
| func TestRun_Proceed(t *testing.T) { |
There was a problem hiding this comment.
TestRun_Proceed and TestRun_Skip don't verify the actual assessment value. Proceed checks that output was written but not that it contains "assessment=PROCEED". Skip it has no assertions at all,
only checking that Run() doesn't error. The tests would pass even if the logic was broken.
| fmt.Sscanf(strings.TrimSpace(parts[2]), "%d", &s.Usage.InputTokens) | ||
| fmt.Sscanf(strings.TrimSpace(parts[3]), "%d", &s.Usage.OutputTokens) | ||
| fmt.Sscanf(strings.TrimSpace(parts[4]), "%d", &s.Usage.CacheCreationInputTokens) | ||
| fmt.Sscanf(strings.TrimSpace(parts[5]), "%d", &s.Usage.CacheReadInputTokens) | ||
| fmt.Sscanf(strings.TrimSpace(parts[6]), "$%f", &s.Usage.CostUSD) |
There was a problem hiding this comment.
I think we should at least log if any of these errors, otherwise the usage data will be corrupted without any indication of an error
| usage := Usage{CostUSD: out.TotalCostUSD} | ||
| if out.Usage != nil { | ||
| var u claudeUsage | ||
| if err := json.Unmarshal(out.Usage, &u); err == nil { |
There was a problem hiding this comment.
should we log here if we couldn't unmarshal usage data?
| } | ||
| } | ||
|
|
||
| func TestUsageTracker_RoundTrip(t *testing.T) { |
There was a problem hiding this comment.
maybe we can add another .Record() call with a duplicate name, e.g. security review, to see how it behaves in those scenarios
There was a problem hiding this comment.
Claude suggested adding a test for case insensitivity test
1. Case-sensitivity issue - Not tested
// On Linux: does .GITHUB/ bypass .github/ blocking?
Not sure if this is truly an issue
There was a problem hiding this comment.
Seems like it's missing coverage for the following crucial functions. Even if we mock some parts, do you think it would be possible to add somewhat meaningful tests for these?
- pushAndPR()
- aiSecurityReview()
The implement action previously always created PRs against GITHUB_REPOSITORY (the repo running the workflow). This adds a pr_target_repo input so workflows can target a different repository, enabling cross-repo scenarios like triggering from repo A but opening a PR in repo B. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Both assess and implement now check whether Go is already on the path before running setup-go. This avoids redundant Go installation when a prior workflow step has already set it up. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
7a64648 to
28fa74b
Compare
LoadSecurityConfig was intended for a security subcommand that was never implemented. The Log method on the git Client interface was never called. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
8c632cc to
508b7ea
Compare
Replace the boolean verbose_logging flag with a three-level log_level input (error/info/debug) that controls how much Claude CLI output streams to the GitHub Actions step log in real time. - error (default): silent, only errors and final status - info: pretty-printed result summary, permission denial warnings - debug: full stream including all tool calls, text, and results Permission denials are parsed from the result object and logged via ::warning:: annotations at info and debug levels. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Summary
Go implementation of composite actions for Claude-powered automated issue resolution:
autosolve/assess— Runs Claude in read-only mode to evaluate whether a task is suitable for automated resolutionautosolve/implement— Runs Claude to implement a solution, validates changes, runs AI security review, pushes to a fork, and creates a draft PRKey features
Testing
Tested end-to-end against cockroachlabs/ccloud-private-automation-testing.
Test plan
go test ./...passes