Add autosolve actions for automated issue resolution by fantapop · Pull Request #14 · cockroachdb/actions

fantapop · 2026-03-25T00:25:39Z

Summary

Go implementation of composite actions for Claude-powered automated issue resolution:

autosolve/assess — Runs Claude in read-only mode to evaluate whether a task is suitable for automated resolution
autosolve/implement — Runs Claude to implement a solution, validates changes, runs AI security review, pushes to a fork, and creates a draft PR

Key features

Per-file batched AI security review with generated-file detection
Token usage tracking across phases with combined markdown summary
Retry logic with Claude session resumption
Skill file support for custom prompts

Testing

Tested end-to-end against cockroachlabs/ccloud-private-automation-testing.

Test plan

go test ./... passes
Precompiled binary check passes in CI

Copilot

Pull request overview

This PR introduces a new autosolve Go-based automation tool and two composite GitHub Actions (autosolve/assess and autosolve/implement) to assess issue suitability and implement fixes with Claude, including PR creation, security checks, and usage tracking.

Changes:

Add Go implementation for assessment/implementation orchestration, prompt assembly, git/gh wrappers, and security checks.
Add composite actions (autosolve/assess, autosolve/implement) plus CI updates to run Go tests and validate the precompiled binary.
Add prompt templates and unit tests for core functionality.

Reviewed changes

Copilot reviewed 28 out of 30 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| autosolve/internal/security/security_test.go | Adds unit tests for blocked-path and sensitive-file enforcement and `.gitignore` warnings. |
| autosolve/internal/security/security.go | Implements blocked path checks, sensitive filename/extension detection, and symlink-to-blocked-path detection. |
| autosolve/internal/prompt/templates/security-preamble.md | Adds system preamble intended to constrain the model’s behavior for safety. |
| autosolve/internal/prompt/templates/implementation-footer.md | Adds implementation-phase instruction footer and required success/fail marker. |
| autosolve/internal/prompt/templates/assessment-footer.md | Adds assessment-phase instruction footer and required proceed/skip marker. |
| autosolve/internal/prompt/prompt_test.go | Adds tests for prompt construction, skill file inclusion, and custom criteria. |
| autosolve/internal/prompt/prompt.go | Implements prompt assembly from templates + task inputs. |
| autosolve/internal/implement/implement_test.go | Adds tests for retry logic, output writing, and summary extraction. |
| autosolve/internal/implement/implement.go | Implements the implementation phase: retries, security checks, staging/commit/push, PR creation, and AI security review. |
| autosolve/internal/github/github.go | Adds a `gh`-CLI-backed GitHub client for comments/labels/PR creation. |
| autosolve/internal/git/git.go | Adds a git CLI abstraction and helper to list changed files. |
| autosolve/internal/config/config_test.go | Adds tests for config parsing/validation and blocked path parsing. |
| autosolve/internal/config/config.go | Adds config loading/validation from action inputs and auth validation. |
| autosolve/internal/claude/claude_test.go | Adds tests for extracting markers/session IDs and usage tracking. |
| autosolve/internal/claude/claude.go | Adds Claude CLI runner + result parsing + usage tracking persistence. |
| autosolve/internal/assess/assess_test.go | Adds tests for assessment flow and summary extraction. |
| autosolve/internal/assess/assess.go | Implements assessment phase invocation and outputs/summary writing. |
| autosolve/internal/action/action_test.go | Adds tests for GitHub Actions output and step summary helpers. |
| autosolve/internal/action/action.go | Adds helpers for outputs, summaries, and workflow annotations. |
| autosolve/implement/action.yml | Defines the composite action to run `autosolve implement` and expose outputs. |
| autosolve/go.mod | Introduces the autosolve Go module definition. |
| autosolve/cmd/autosolve/main.go | Adds CLI entrypoint for `assess` and `implement` commands. |
| autosolve/build.sh | Adds cross-compile script producing the committed Linux binary. |
| autosolve/assess/action.yml | Defines the composite action to run `autosolve assess` and expose outputs. |
| autosolve/Makefile | Adds build/test/clean targets for local development and CI. |
| autosolve/.gitignore | Ignores the local dev binary output. |
| CHANGELOG.md | Documents the addition of the autosolve actions. |
| .github/workflows/test.yml | Updates CI to run Go tests and ensure the precompiled binary is up to date. |

fantapop · 2026-03-27T22:30:09Z

One thing I'm running into here is that building the action and committing it each time is annoying, and the committed build easily gets out of date. I'm going to look into alternatives.

linhcrl

Left a couple of comments. Most of them are smaller/questions.

Also, here's some feedback I didn't know where to put:

In the PR description I see "Precompiled Go binary (no Go toolchain needed at runtime)", and one of the bottom checkboxes also mentions a precompiled Go binary. I'm assuming this just hasn't been updated, right? I see that we actually recompile the binary every time this action is run.
We should add some documentation in the README.

- Instruct Claude to never include secret values in responses; describe findings by file and line number instead.
- Skip logging security review output since it may reference secrets found in the diff.
- Log Claude output in collapsible ::group:: blocks in the step log, gated by a verbose_logging input (default false).

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

JSON output in ::group:: log sections is now indented for readability in the GitHub Actions log viewer. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

The assess step only permits Read,Grep,Glob — no Bash — so Claude could not read context_vars from the environment. Add a scoped Bash(printenv VAR) permission for each declared context var and update the prompt to tell Claude to use `printenv`. fixup bdb5ba4 Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

git symbolic-ref refs/remotes/origin/HEAD prints a fatal error when origin/HEAD is not configured (common with actions/checkout persist-credentials: false). Rather than suppress the error, default pr_base_branch to "main" in the action input and remove the auto-detection code entirely. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

The workspace root is the actions repo checkout, not the target repo, so setup-go can't find go.mod for cache hashing. Caching isn't needed since the autosolve binary is a one-off build. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Previously the implement binary exited 0 even on failure (all retries exhausted, security check failed, PR creation failed), making the step appear green. Now it writes outputs first so subsequent workflow steps can still read them, then returns an error so the step is correctly marked as failed. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Copilot

Pull request overview

Copilot reviewed 27 out of 28 changed files in this pull request and generated 14 comments.

Copilot · 2026-04-09T16:01:52Z

+func ParseBlockedPaths(raw string) []string {
+	paths := make(map[string]bool)
+	for _, p := range requiredBlockedPaths {
+		paths[p] = true
+	}
+	for _, p := range strings.Split(raw, ",") {
+		p = strings.TrimSpace(p)
+		if p != "" {
+			paths[p] = true
+		}
+	}
+	var result []string
+	// Required paths first for consistent ordering
+	for _, p := range requiredBlockedPaths {
+		result = append(result, p)
+	}
+	for p := range paths {
+		if !contains(requiredBlockedPaths, p) {
+			result = append(result, p)
+		}
+	}
+	return result


ParseBlockedPaths builds result by iterating over a map (for p := range paths), which makes the ordering of non-required blocked paths nondeterministic. This can lead to unstable prompts/logs and can make tests flaky (current tests assert a specific order). Consider collecting the non-required paths into a slice and sorting it before appending.
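A minimal sketch of the sorting approach suggested above. The contents of `requiredBlockedPaths` are not visible in the diff, so the value here is a placeholder, and `parseBlockedPathsSorted` is a hypothetical name rather than the PR's function.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// requiredBlockedPaths is a placeholder; the PR's actual list is not shown.
var requiredBlockedPaths = []string{".github/workflows/"}

// parseBlockedPathsSorted mirrors ParseBlockedPaths but collects the
// user-supplied entries in a slice and sorts them, so the result is
// identical on every run.
func parseBlockedPathsSorted(raw string) []string {
	required := make(map[string]bool, len(requiredBlockedPaths))
	for _, p := range requiredBlockedPaths {
		required[p] = true
	}
	seen := make(map[string]bool)
	var extra []string
	for _, p := range strings.Split(raw, ",") {
		p = strings.TrimSpace(p)
		if p == "" || required[p] || seen[p] {
			continue // skip blanks, required paths, and duplicates
		}
		seen[p] = true
		extra = append(extra, p)
	}
	sort.Strings(extra)
	// Required paths first, then the sorted extras.
	result := append([]string{}, requiredBlockedPaths...)
	return append(result, extra...)
}

func main() {
	fmt.Println(parseBlockedPathsSorted("secrets/, deploy/, secrets/"))
}
```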

Copilot · 2026-04-09T16:01:53Z

+// LogResult records usage for a Claude invocation and logs token counts.
+// When verbose is true, the full output is written to a collapsible group
+// in the step log. Call immediately after runner.Run and before checking
+// the error so that usage is captured even on failure.
+func LogResult(
+	tracker *claude.UsageTracker, result *claude.Result, section, outputFile string, verbose bool,
+) {
+	tracker.Record(section, result.Usage)
+	LogInfo(fmt.Sprintf("%s usage: input=%d output=%d cost=$%.4f",
+		section, result.Usage.InputTokens, result.Usage.OutputTokens, result.Usage.CostUSD))
+	if verbose {
+		logOutputGroup(section, outputFile)
+	}
+}


LogResult assumes result is non-nil and dereferences result.Usage. Callers intentionally invoke this before checking the error from runner.Run, so a Runner implementation that returns (nil, err) would panic here. Consider handling result == nil defensively (skip usage logging or log a warning).
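One way the nil guard could look. `Usage`, `Result`, and `UsageTracker` below are simplified stand-ins for the `claude` package types, whose real definitions are not shown in the diff, and `logResultSafe` returns the log line instead of calling `LogInfo` so the sketch stays self-contained.

```go
package main

import "fmt"

// Stand-in types; the real claude.Usage, claude.Result, and
// claude.UsageTracker are assumed to look roughly like this.
type Usage struct {
	InputTokens  int
	OutputTokens int
	CostUSD      float64
}

type Result struct{ Usage Usage }

type UsageTracker struct{ sections []string }

func (t *UsageTracker) Record(section string, u Usage) {
	t.sections = append(t.sections, section)
}

// logResultSafe returns the line it would log. A nil result produces a
// warning and is never dereferenced or recorded.
func logResultSafe(tracker *UsageTracker, result *Result, section string) string {
	if result == nil {
		return fmt.Sprintf("warning: %s returned no result; usage not recorded", section)
	}
	tracker.Record(section, result.Usage)
	return fmt.Sprintf("%s usage: input=%d output=%d cost=$%.4f",
		section, result.Usage.InputTokens, result.Usage.OutputTokens, result.Usage.CostUSD)
}

func main() {
	tracker := &UsageTracker{}
	fmt.Println(logResultSafe(tracker, nil, "assess"))
	fmt.Println(logResultSafe(tracker, &Result{Usage: Usage{InputTokens: 10, OutputTokens: 5, CostUSD: 0.0123}}, "assess"))
}
```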

Copilot · 2026-04-09T16:01:53Z

+	// Build allowed tools: read-only plus printenv for each context var
+	// so Claude can read them without full Bash access.
+	tools := []string{"Read", "Grep", "Glob"}
+	for _, v := range cfg.ContextVars {
+		tools = append(tools, fmt.Sprintf("Bash(printenv %s)", v))
+	}
+	allowedTools := strings.Join(tools, ",")


Context var names from cfg.ContextVars are interpolated directly into the --allowedTools string as Bash(printenv %s) without validation. A malicious value like FOO),Write,Edit could expand Claude tool permissions beyond the intended read-only set. Consider validating context var names against a strict env-var regex (e.g., ^[A-Z_][A-Z0-9_]*$) and rejecting/ignoring invalid entries.
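A sketch of the validation suggested above, using the regex from the comment. `buildAllowedTools` is a hypothetical helper; rejecting the whole input on the first invalid name (rather than skipping it) is one possible policy, not something the PR prescribes.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// envVarName accepts only upper-case identifiers, per the regex
// suggested in the review comment.
var envVarName = regexp.MustCompile(`^[A-Z_][A-Z0-9_]*$`)

// buildAllowedTools returns the read-only tool list plus one scoped
// printenv permission per context var. Any name that fails validation
// causes an error, so nothing can be smuggled into the tool string.
func buildAllowedTools(contextVars []string) (string, error) {
	tools := []string{"Read", "Grep", "Glob"}
	for _, v := range contextVars {
		if !envVarName.MatchString(v) {
			return "", fmt.Errorf("invalid context var name %q", v)
		}
		tools = append(tools, fmt.Sprintf("Bash(printenv %s)", v))
	}
	return strings.Join(tools, ","), nil
}

func main() {
	ok, err := buildAllowedTools([]string{"ISSUE_TITLE"})
	fmt.Println(ok, err)
	_, err = buildAllowedTools([]string{"FOO),Write,Edit"})
	fmt.Println(err)
}
```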

Copilot · 2026-04-09T16:01:53Z

+		// Append printenv permissions for each context var so Claude can
+		// read them without unrestricted Bash access.
+		var extraTools []string
+		for _, v := range cfg.ContextVars {
+			extraTools = append(extraTools, fmt.Sprintf("Bash(printenv %s)", v))
+		}
+		allowedTools := cfg.AllowedTools
+		if len(extraTools) > 0 {
+			allowedTools += "," + strings.Join(extraTools, ",")
+		}


Context var names from cfg.ContextVars are interpolated directly into the --allowedTools string as Bash(printenv %s) without validation. A malicious value like FOO),Write,Edit could expand Claude tool permissions beyond what the action intends. Consider validating context var names against a strict env-var regex and rejecting/ignoring invalid entries.

Copilot · 2026-04-09T16:01:55Z

+// CheckGitignore logs a warning if the repo's .gitignore does not contain
+// credential exclusion patterns. It does not modify the file — repo owners
+// should add the patterns themselves for defense-in-depth.
+func CheckGitignore(logWarning func(string)) {
+	data, err := os.ReadFile(".gitignore")
+	if err != nil {
+		logWarning("No .gitignore found. For defense-in-depth, add one with credential exclusion patterns: " +
+			strings.Join(gitignorePatterns, ", "))
+		return
+	}


CheckGitignore reads .gitignore from the current working directory. Since the action supports a configurable working_directory, this can warn incorrectly (or miss the repo’s actual .gitignore) when invoked from a subdirectory. Consider resolving the repo root (e.g., git rev-parse --show-toplevel) and reading <repoRoot>/.gitignore.
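The repo-root lookup suggested above might look like the following. `repoRoot` and `gitignorePath` are hypothetical helpers, and the sketch assumes the `git` CLI is on the path and the process runs inside a work tree; otherwise it prints a warning.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
	"strings"
)

// repoRoot asks git for the top-level directory of the current work tree.
func repoRoot() (string, error) {
	out, err := exec.Command("git", "rev-parse", "--show-toplevel").Output()
	if err != nil {
		return "", fmt.Errorf("resolving repo root: %w", err)
	}
	return strings.TrimSpace(string(out)), nil
}

// gitignorePath returns the path of the root-level .gitignore for a repo,
// independent of the current working directory.
func gitignorePath(root string) string {
	return filepath.Join(root, ".gitignore")
}

func main() {
	root, err := repoRoot()
	if err != nil {
		fmt.Println("warning:", err)
		return
	}
	fmt.Println("would read", gitignorePath(root))
}
```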

Copilot · 2026-04-09T16:01:55Z

@@ -0,0 +1,3 @@
+module github.com/cockroachdb/actions/autosolve
+
+go 1.23.8


The go directive typically uses major.minor (e.g., go 1.23) to indicate the language version. Using a patch version here (1.23.8) may be rejected by some Go toolchains and isn’t the usual way to pin a specific toolchain; if you need to pin, consider using a toolchain go1.23.8 directive instead.

Suggested change

-go 1.23.8
+go 1.23

Copilot · 2026-04-09T16:01:55Z

+    - name: Set up Go
+      uses: actions/setup-go@v6
+      with:
+        go-version-file: ${{ github.action_path }}/../go.mod
+        cache: false
+
+    - name: Build autosolve
+      shell: bash
+      run: go build -trimpath -o "$RUNNER_TEMP/autosolve" ./cmd/autosolve
+      working-directory: ${{ github.action_path }}/..


PR description states a precompiled Go binary means no Go toolchain is needed at runtime, but this composite action always sets up Go and builds from source. Either add the same precompiled-binary fast-path used in autosolve/implement (skip Go setup/build when $RUNNER_TEMP/autosolve already exists) or adjust the documentation/description to match actual behavior.

Copilot · 2026-04-09T16:01:56Z

+		ForkRepo:         os.Getenv("INPUT_FORK_REPO"),
+		ForkPushToken:    os.Getenv("INPUT_FORK_PUSH_TOKEN"),
+		PRCreateToken:    os.Getenv("INPUT_PR_CREATE_TOKEN"),
+		PRBaseBranch:     os.Getenv("INPUT_PR_BASE_BRANCH"),
+		PRLabels:         envOrDefault("INPUT_PR_LABELS", "autosolve"),
+		PRDraft:          prDraft,
+		PullRequestTitle: os.Getenv("INPUT_PR_TITLE"),


PRBaseBranch is read directly from INPUT_PR_BASE_BRANCH without a default, unlike most other PR-related inputs. If the binary is run outside the composite action (or the env var is missing), PR creation will attempt to use an empty base branch and fail in a less obvious way. Consider defaulting this to main (or the repo’s default branch if discoverable) and/or validating it is non-empty when CreatePR is true.

Copilot · 2026-04-09T16:01:56Z

+	if cfg.Skill != "" {
+		content, err := os.ReadFile(cfg.Skill)
+		if err != nil {
+			return "", fmt.Errorf("reading skill file %s: %w", cfg.Skill, err)
+		}
+		b.Write(content)
+		b.WriteString("\n")
+	}


inputs.skill is documented as a path relative to the repo root, but Build reads it via os.ReadFile(cfg.Skill) relative to the current working directory. If the action runs with working_directory set to a subdir, this will fail to find the skill file (or read the wrong file). Consider resolving the git repo root and joining it with cfg.Skill (or documenting that it’s relative to working_directory).
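A sketch of resolving the skill path against the repo root rather than the working directory. `resolveSkillPath` is a hypothetical helper that takes the root as a parameter (for example from `git rev-parse --show-toplevel`). The rejection of absolute and escaping paths is an extra precaution added for illustration, not something the review asked for.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// resolveSkillPath joins a repo-relative skill path onto repoRoot and
// rejects absolute paths and paths that climb out of the repo.
func resolveSkillPath(repoRoot, skill string) (string, error) {
	if filepath.IsAbs(skill) {
		return "", fmt.Errorf("skill path %q must be relative to the repo root", skill)
	}
	full := filepath.Join(repoRoot, skill)
	rel, err := filepath.Rel(repoRoot, full)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("skill path %q escapes the repo root", skill)
	}
	return full, nil
}

func main() {
	p, err := resolveSkillPath("/repo", "skills/fix.md")
	fmt.Println(p, err)
	_, err = resolveSkillPath("/repo", "../outside.md")
	fmt.Println(err)
}
```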

linhcrl

Mostly small comments this time. Also:

In the PR description I still see "Precompiled Go binary (no Go toolchain needed at runtime)", and one of the bottom checkboxes also mentions a precompiled Go binary.
We should add some documentation in the README.

linhcrl · 2026-04-09T20:43:28Z

+        if command -v roachdev >/dev/null; then
+          printf '#!/bin/sh\nexec roachdev claude -- "$@"\n' > /usr/local/bin/claude
+          chmod +x /usr/local/bin/claude
+          echo "Claude CLI: using roachdev wrapper"


nit: would be nice to log the version used similar to the basic claude equivalent below. (Same for implement/action.yml)

linhcrl · 2026-04-09T21:12:16Z

+	return result, nil
+}
+
+func TestRun_Proceed(t *testing.T) {


TestRun_Proceed and TestRun_Skip don't verify the actual assessment value. Proceed checks that output was written but not that it contains "assessment=PROCEED". Skip has no assertions at all; it only checks that Run() doesn't error. The tests would pass even if the logic was broken.
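The missing assertion could be built on a small helper like this one. `hasOutput` is hypothetical, and the sketch assumes the assess step writes outputs as `key=value` lines, which is what the "assessment=PROCEED" string in the comment implies.

```go
package main

import (
	"fmt"
	"strings"
)

// hasOutput reports whether content contains the exact line key=value.
func hasOutput(content, key, value string) bool {
	want := key + "=" + value
	for _, line := range strings.Split(content, "\n") {
		if strings.TrimSpace(line) == want {
			return true
		}
	}
	return false
}

func main() {
	// Sample content standing in for what a test would read back
	// from the output file after calling Run().
	written := "assessment=PROCEED\nsummary=looks feasible\n"
	fmt.Println(hasOutput(written, "assessment", "PROCEED"))
	fmt.Println(hasOutput(written, "assessment", "SKIP"))
}
```

In a test, each case would read the output file and fail with `t.Errorf` when `hasOutput` returns false for the expected value.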

linhcrl · 2026-04-09T21:30:29Z

+		fmt.Sscanf(strings.TrimSpace(parts[2]), "%d", &s.Usage.InputTokens)
+		fmt.Sscanf(strings.TrimSpace(parts[3]), "%d", &s.Usage.OutputTokens)
+		fmt.Sscanf(strings.TrimSpace(parts[4]), "%d", &s.Usage.CacheCreationInputTokens)
+		fmt.Sscanf(strings.TrimSpace(parts[5]), "%d", &s.Usage.CacheReadInputTokens)
+		fmt.Sscanf(strings.TrimSpace(parts[6]), "$%f", &s.Usage.CostUSD)


I think we should at least log if any of these calls returns an error; otherwise the usage data will be corrupted without any indication of an error.
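A sketch of checking the `fmt.Sscanf` return values. `parseIntField` and `parseCostField` are hypothetical helpers; the caller decides whether to log the returned error or abort.

```go
package main

import (
	"fmt"
	"strings"
)

// parseIntField parses one table cell and reports which field failed.
func parseIntField(name, cell string) (int, error) {
	var n int
	if _, err := fmt.Sscanf(strings.TrimSpace(cell), "%d", &n); err != nil {
		return 0, fmt.Errorf("parsing %s from %q: %w", name, cell, err)
	}
	return n, nil
}

// parseCostField parses a cell formatted like "$0.0123".
func parseCostField(cell string) (float64, error) {
	var c float64
	if _, err := fmt.Sscanf(strings.TrimSpace(cell), "$%f", &c); err != nil {
		return 0, fmt.Errorf("parsing cost from %q: %w", cell, err)
	}
	return c, nil
}

func main() {
	if _, err := parseIntField("input tokens", "n/a"); err != nil {
		fmt.Println("warning:", err) // surface the problem instead of storing zero silently
	}
	cost, err := parseCostField(" $0.0123 ")
	fmt.Println(cost, err)
}
```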

linhcrl · 2026-04-09T21:32:31Z

+	usage := Usage{CostUSD: out.TotalCostUSD}
+	if out.Usage != nil {
+		var u claudeUsage
+		if err := json.Unmarshal(out.Usage, &u); err == nil {


should we log here if we couldn't unmarshal usage data?
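If logging is wanted here, the decode step could return a warning for the caller to log. The field names in `claudeUsage` are assumptions (the struct is not shown in the diff), and `decodeUsage` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// claudeUsage is a guess at the usage object's shape; the JSON field
// names are assumptions.
type claudeUsage struct {
	InputTokens  int `json:"input_tokens"`
	OutputTokens int `json:"output_tokens"`
}

// decodeUsage returns the parsed usage plus a warning string when the
// payload is present but malformed. An absent payload is not a warning.
func decodeUsage(raw json.RawMessage) (claudeUsage, string) {
	var u claudeUsage
	if raw == nil {
		return u, ""
	}
	if err := json.Unmarshal(raw, &u); err != nil {
		// Return a zero value so partially decoded data is not used.
		return claudeUsage{}, fmt.Sprintf("could not parse usage data: %v", err)
	}
	return u, ""
}

func main() {
	_, warn := decodeUsage(json.RawMessage(`"not an object"`))
	fmt.Println(warn)
}
```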

linhcrl · 2026-04-09T21:39:33Z

+	}
+}
+
+func TestUsageTracker_RoundTrip(t *testing.T) {


maybe we can add another .Record() call with a duplicate name, e.g. security review, to see how it behaves in those scenarios

linhcrl · 2026-04-09T22:26:21Z

Claude suggested adding a test for case insensitivity:

1. Case-sensitivity issue - Not tested // On Linux: does .GITHUB/ bypass .github/ blocking?

Not sure if this is truly an issue
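If case-insensitive matching turns out to be wanted, a check along these lines could back such a test. The PR's actual matching logic is not shown, so the prefix comparison and the `isBlocked` name are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// isBlocked reports whether file falls under any blocked prefix,
// comparing case-insensitively so ".GITHUB/x" matches ".github/".
func isBlocked(file string, blocked []string) bool {
	lower := strings.ToLower(file)
	for _, b := range blocked {
		if strings.HasPrefix(lower, strings.ToLower(b)) {
			return true
		}
	}
	return false
}

func main() {
	blocked := []string{".github/"}
	fmt.Println(isBlocked(".GITHUB/workflows/ci.yml", blocked))
	fmt.Println(isBlocked("src/main.go", blocked))
}
```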

linhcrl · 2026-04-09T22:50:38Z

Seems like it's missing coverage for the following crucial functions. Even if we mock some parts, do you think it would be possible to add somewhat meaningful tests for these?

pushAndPR()

aiSecurityReview()

The implement action previously always created PRs against GITHUB_REPOSITORY (the repo running the workflow). This adds a pr_target_repo input so workflows can target a different repository, enabling cross-repo scenarios like triggering from repo A but opening a PR in repo B. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Both assess and implement now check whether Go is already on the path before running setup-go. This avoids redundant Go installation when a prior workflow step has already set it up. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

LoadSecurityConfig was intended for a security subcommand that was never implemented. The Log method on the git Client interface was never called. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Replace the boolean verbose_logging flag with a three-level log_level input (error/info/debug) that controls how much Claude CLI output streams to the GitHub Actions step log in real time.

- error (default): silent, only errors and final status
- info: pretty-printed result summary, permission denial warnings
- debug: full stream including all tool calls, text, and results

Permission denials are parsed from the result object and logged via ::warning:: annotations at info and debug levels.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
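The three levels in this commit message could be modeled as an ordered enum, sketched below. `LogLevel`, `ParseLogLevel`, and `Enabled` are hypothetical names; the PR's implementation is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// LogLevel is ordered so that a higher value shows more output.
type LogLevel int

const (
	LevelError LogLevel = iota // only errors and final status
	LevelInfo                  // adds result summary and permission-denial warnings
	LevelDebug                 // adds the full stream
)

// ParseLogLevel maps the log_level input to a level. An empty value
// falls back to error, matching the documented default.
func ParseLogLevel(s string) (LogLevel, error) {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "", "error":
		return LevelError, nil
	case "info":
		return LevelInfo, nil
	case "debug":
		return LevelDebug, nil
	}
	return LevelError, fmt.Errorf("unknown log_level %q (want error, info, or debug)", s)
}

// Enabled reports whether output tagged at msgLevel should be shown.
func (l LogLevel) Enabled(msgLevel LogLevel) bool { return l >= msgLevel }

func main() {
	lvl, err := ParseLogLevel("info")
	fmt.Println(lvl.Enabled(LevelInfo), lvl.Enabled(LevelDebug), err)
}
```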

fantapop force-pushed the pr/autosolve-go branch from 324a6db to 0655ce2 Compare March 25, 2026 00:29

fantapop force-pushed the pr/shell-framework branch from 9134765 to 817680c Compare March 25, 2026 00:38

fantapop force-pushed the pr/autosolve-go branch from 0655ce2 to 6f1121d Compare March 25, 2026 00:38

fantapop force-pushed the pr/shell-framework branch from 817680c to 0a678c6 Compare March 25, 2026 01:00

fantapop force-pushed the pr/autosolve-go branch from 6f1121d to 5c7a16f Compare March 25, 2026 01:03

fantapop changed the base branch from pr/shell-framework to main March 25, 2026 01:03

This was referenced Mar 25, 2026

Generic autosolve github workflow for automated issue resolution #5

Closed

Add Go rewrite of autosolve actions #8

Closed

fantapop requested a review from Copilot March 25, 2026 01:24

Copilot started reviewing on behalf of fantapop March 25, 2026 01:25

Copilot AI reviewed Mar 25, 2026

fantapop force-pushed the pr/autosolve-go branch from 5c7a16f to a9a9010 Compare March 25, 2026 01:31

fantapop requested a review from linhcrl March 25, 2026 01:35

fantapop force-pushed the pr/autosolve-go branch 2 times, most recently from 1abbbb0 to 6fd24ba Compare March 25, 2026 06:52

fantapop commented Mar 25, 2026

Comment thread .github/workflows/test.yml Outdated

fantapop force-pushed the pr/autosolve-go branch 6 times, most recently from f818651 to 6bc6bc5 Compare March 27, 2026 22:21

fantapop force-pushed the pr/autosolve-go branch 2 times, most recently from d06e466 to f2ef7a1 Compare March 27, 2026 22:50

linhcrl reviewed Apr 2, 2026

fantapop force-pushed the pr/autosolve-go branch 4 times, most recently from 981e842 to a89fb9b Compare April 8, 2026 06:45

fantapop and others added 3 commits April 9, 2026 08:47

autosolve: Pretty-print JSON in collapsible log output

7d542de

fantapop force-pushed the pr/autosolve-go branch from 20e41e0 to 9b1b16c Compare April 9, 2026 15:48

fantapop and others added 3 commits April 9, 2026 08:50

fantapop force-pushed the pr/autosolve-go branch from 9b1b16c to 555a3c5 Compare April 9, 2026 15:51

fantapop requested a review from Copilot April 9, 2026 15:54

Copilot started reviewing on behalf of fantapop April 9, 2026 15:55

fantapop requested a review from linhcrl April 9, 2026 15:57

Copilot AI reviewed Apr 9, 2026

linhcrl reviewed Apr 9, 2026

fantapop and others added 3 commits May 4, 2026 17:47

autosolve: Document assess and implement actions in README

de8f400

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

fantapop force-pushed the pr/autosolve-go branch 2 times, most recently from 7a64648 to 28fa74b Compare May 6, 2026 01:15

autosolve: Remove dead code (LoadSecurityConfig, git.Client.Log)

b8b523b

fantapop force-pushed the pr/autosolve-go branch 9 times, most recently from 8c632cc to 508b7ea Compare May 7, 2026 01:09

fantapop force-pushed the pr/autosolve-go branch from 508b7ea to 9adf4f9 Compare May 7, 2026 18:49
