Add autosolve actions for automated issue resolution #14
Open. fantapop wants to merge 47 commits into main from pr/autosolve-go.
47 commits
e323423 Add autosolve actions for automated issue resolution (fantapop)
78d1dcc Address PR review feedback (batch 1) (fantapop)
b02a463 Replace credential helper with GIT_ASKPASS for fork authentication (fantapop)
11b3ba2 Fix marker extraction to use last occurrence in output (fantapop)
3fe1297 Fail closed on symlink resolution errors in security check (fantapop)
a5502d2 Reset staged changes on all security review failures (fantapop)
31b7e13 Add unit tests for pure helpers and require PR body file (fantapop)
599c7a2 Validate boolean inputs with case-insensitive parsing (fantapop)
c91aab2 Simplify Claude Runner interface and remove dead code (fantapop)
45e2c78 Propagate errors instead of swallowing them (fantapop)
226b573 Mitigate prompt injection in AI security review (fantapop)
6406103 Remove additional_instructions input (fantapop)
96bc22e Harden prompt injection defenses with context_vars and env filtering (fantapop)
01b51e1 Always block .github/ in blocked paths (fantapop)
0fa1d12 Fix action.yml issues found during integration testing (fantapop)
09ccb21 Prevent sensitive data in Claude output logs (fantapop)
7d542de autosolve: Pretty-print JSON in collapsible log output (fantapop)
34ec73c autosolve: Allow assess to read context_vars via printenv (fantapop)
4bc5058 autosolve: Default pr_base_branch to main and remove SymbolicRef (fantapop)
de8989b autosolve: Disable Go module caching in setup-go (fantapop)
555a3c5 autosolve: Exit non-zero when implementation fails (fantapop)
750fc10 autosolve: Add pr_target_repo input for cross-repo PR creation (fantapop)
de8f400 autosolve: Document assess and implement actions in README (fantapop)
0b3c1f5 autosolve: Skip setup-go when Go is already available (fantapop)
b8b523b autosolve: Remove dead code (LoadSecurityConfig, git.Client.Log) (fantapop)
9adf4f9 autosolve: Add log_level input with streaming output (fantapop)
51c0306 autosolve: Fail fast when target branch already exists on fork (fantapop)
8d5c71c autosolve: Restrict security review to read-only Bash tools (fantapop)
b793a01 autosolve: Update Go version to 1.26 (fantapop)
69a15e3 autosolve: Handle nil results and surface parse errors (fantapop)
cf3a541 autosolve: Detect symlinks that escape the repo root (fantapop)
2f46718 autosolve: Remove create_pr option, always create PRs (fantapop)
86abffc autosolve: Improve test coverage and documentation (fantapop)
4084a28 autosolve: Log roachdev version when using wrapper (fantapop)
5a2aa7a autosolve: Document required inputs and fix skill path description (fantapop)
c863d36 autosolve: Make label creation best-effort and document token permiss… (fantapop)
d695b1b autosolve: Resolve skill path relative to GITHUB_WORKSPACE (fantapop)
7f6b59e autosolve: Use descriptive names for Claude output files (fantapop)
e1d804a autosolve: Add ErrEmptyResult sentinel and improve test assertions (fantapop)
d087c35 autosolve: Add security review tests and fix test artifact cleanup (fantapop)
a42f94b autosolve: Refactor pushAndPR into focused functions (fantapop)
ca4d75e autosolve: Update tests to use .github/ blocked path (fantapop)
8174636 autosolve: Document full allowed_tools default in README (fantapop)
00899a8 autosolve: Extract default assessment criteria to a linkable file (fantapop)
9d8779a autosolve: Remove pr_title and pr_body_template inputs (fantapop)
fc2ca19 autosolve: Document classic PAT scopes and simplify label guidance (fantapop)
efcefd5 autosolve: Fix unchecked errors and use slices.Contains (fantapop)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -108,6 +108,156 @@ determine whether a major, minor, or patch version bump is needed. | |
| - Returns empty `bump_type` when there are no unreleased changes | ||
| - Follows semantic versioning principles | ||
|
|
||
| ### autosolve/assess | ||
|
|
||
| Runs Claude in read-only mode to assess whether a task is suitable for automated | ||
| resolution. Claude evaluates the task against configurable criteria and returns a | ||
| PROCEED or SKIP decision with reasoning. | ||
|
|
||
| **Usage:** | ||
|
|
||
| ```yaml | ||
| - uses: cockroachdb/actions/autosolve/assess@v0 | ||
| with: | ||
| system_prompt: "Assess whether this issue can be resolved automatically." | ||
| context_vars: "ISSUE_TITLE,ISSUE_BODY" | ||
| env: | ||
| ISSUE_TITLE: ${{ github.event.issue.title }} | ||
| ISSUE_BODY: ${{ github.event.issue.body }} | ||
| ``` | ||
|
|
||
| **Inputs:** | ||
|
|
||
| | Name | Default | Description | | ||
| | --------------------- | -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- | | ||
| | `claude_cli_version` | `2.1.79` | Claude CLI version to install (e.g. `2.1.79` or `latest`) | | ||
| | `system_prompt` | **one required** | Trusted instructions for Claude describing the task to assess. Do not embed untrusted user input here — use `context_vars` instead. At least one of `system_prompt` or `skill` is required. | | ||
| | `skill` | **one required** | Path to a skill/prompt file relative to `GITHUB_WORKSPACE`. At least one of `system_prompt` or `skill` is required. | | ||
| | `context_vars` | `""` | Comma-separated list of environment variable names to pass through to Claude for untrusted user input (e.g., issue titles/bodies) | | ||
| | `assessment_criteria` | [see default](autosolve/internal/prompt/templates/default-assessment-criteria.md) | Trusted criteria for the assessment. Do not embed untrusted user input. | | ||
| | `model` | `claude-opus-4-6` | Claude model ID | | ||
| | `blocked_paths` | `""` | Comma-separated path prefixes that cannot be modified (case-sensitive). `.github/` is always blocked. | | ||
| | `log_level` | `error` | Controls Claude output in the step log: `error` (status only), `info` (result summary, permission denial warnings), `debug` (stream everything). | | ||
| | `working_directory` | `.` | Directory to run in (relative to workspace root) | | ||
|
|
||
| **Outputs:** | ||
|
|
||
| | Name | Description | | ||
| | ------------ | ---------------------------------- | | ||
| | `assessment` | `PROCEED` or `SKIP` | | ||
| | `summary` | Human-readable assessment reasoning | | ||
| | `result` | Full Claude result text | | ||
|
|
||
| **Features:** | ||
|
|
||
| - Runs Claude in read-only mode (Read, Grep, Glob only) — no file modifications | ||
| - Safely passes untrusted user input via environment variables instead of prompt injection | ||
| - Supports custom assessment criteria or skill files | ||
| - Designed to gate the more expensive `autosolve/implement` step | ||
|
|
||
| ### autosolve/implement | ||
|
|
||
| Runs Claude to implement a solution, validates changes with a security review, | ||
| pushes to a fork, and creates a pull request. Includes retry logic, blocked-path | ||
| enforcement, sensitive file detection, and token usage tracking. | ||
|
|
||
| **Usage:** | ||
|
|
||
| ```yaml | ||
| - uses: cockroachdb/actions/autosolve/implement@v0 | ||
| with: | ||
| system_prompt: "Fix the issue described in the environment variables." | ||
| context_vars: "ISSUE_TITLE,ISSUE_BODY" | ||
| fork_owner: my-bot | ||
| fork_repo: my-repo-fork | ||
| fork_push_token: ${{ secrets.FORK_PAT }} | ||
| pr_create_token: ${{ secrets.PR_PAT }} | ||
| env: | ||
| ISSUE_TITLE: ${{ github.event.issue.title }} | ||
| ISSUE_BODY: ${{ github.event.issue.body }} | ||
| ``` | ||
|
|
||
| **Inputs:** | ||
|
|
||
| | Name | Default | Description | | ||
| | -------------------- | -------------------------------- | --------------------------------------------------------------------------------------------------------------- | | ||
| | `claude_cli_version` | `2.1.79` | Claude CLI version to install (e.g. `2.1.79` or `latest`) | | ||
| | `system_prompt` | **one required** | Trusted instructions for Claude describing the task. Do not embed untrusted user input — use `context_vars`. At least one of `system_prompt` or `skill` is required. | | ||
| | `skill` | **one required** | Path to a skill/prompt file relative to `GITHUB_WORKSPACE`. At least one of `system_prompt` or `skill` is required. | | ||
| | `context_vars` | `""` | Comma-separated list of environment variable names to pass through to Claude for untrusted user input | | ||
| | `allowed_tools` | [see below](#allowed_tools-default) | Claude `--allowedTools` string | | ||
| | `model` | `claude-opus-4-6` | Claude model ID | | ||
| | `max_retries` | `3` | Maximum implementation attempts | | ||
| | `pr_target_repo` | `${{ github.repository }}` | Repository where the PR is created (`owner/repo`). Set this when the PR should target a different repo than the one running the workflow. | | ||
| | `pr_base_branch` | `main` | Base branch for the PR | | ||
| | `pr_labels` | `autosolve` | Comma-separated labels to apply to the PR | | ||
| | `pr_draft` | `true` | Whether to create the PR as a draft | | ||
| | `fork_owner` | **required** | GitHub username or org that owns the fork | | ||
| | `fork_repo` | **required** | Repository name of the fork | | ||
| | `fork_push_token` | **required** | PAT with `contents: write` on the fork repository | | ||
| | `pr_create_token` | **required** | PAT with `pull_requests: write` on the target repo (see [Token permissions](#token-permissions)) | | ||
| | `blocked_paths` | `""` | Comma-separated path prefixes that cannot be modified (case-sensitive). `.github/` is always blocked. | | ||
| | `git_user_name` | `autosolve[bot]` | Git author/committer name | | ||
| | `git_user_email` | `autosolve[bot]@users.noreply.github.com` | Git author/committer email | | ||
| | `branch_prefix` | `autosolve/` | Prefix for the branch name | | ||
| | `branch_suffix` | `""` | Suffix for branch name. Defaults to timestamp. | | ||
| | `commit_signature` | `Co-Authored-By: Claude <[email protected]>` | Signature line appended to commit messages | | ||
| | `pr_footer` | [see below](#pr_footer-default) | Footer appended to the PR body | | ||
| | `log_level` | `error` | Controls Claude output in the step log: `error` (status only), `info` (result summary, permission denial warnings), `debug` (stream everything). | | ||
| | `working_directory` | `.` | Directory to run in (relative to workspace root) | | ||
|
|
||
| <a id="allowed_tools-default"></a> | ||
| > Default `allowed_tools`: | ||
| > ``` | ||
| > Read,Write,Edit,Grep,Glob, | ||
| > Bash(git add:*),Bash(git status:*),Bash(git diff:*),Bash(git log:*),Bash(git show:*), | ||
| > Bash(go build:*),Bash(go test:*),Bash(go vet:*),Bash(make:*) | ||
| > ``` | ||
|
|
||
| <a id="pr_footer-default"></a> | ||
| > Default `pr_footer`: | ||
| > ``` | ||
| > --- | ||
| > | ||
| > *This PR was auto-generated by [claude-autosolve-action](https://github.com/cockroachdb/actions) using Claude Code.* | ||
| > *Please review carefully before approving.* | ||
| > ``` | ||
|
|
||
| **Outputs:** | ||
|
|
||
| | Name | Description | | ||
| | ------------- | -------------------------------------------- | | ||
| | `status` | `SUCCESS` or `FAILED` | | ||
| | `pr_url` | URL of the created PR | | ||
| | `summary` | Human-readable summary | | ||
| | `result` | Full Claude result text | | ||
| | `branch_name` | Name of the branch pushed to the fork | | ||
|
|
||
| **Features:** | ||
|
|
||
| - Retries implementation up to `max_retries` times on failure | ||
| - Enforces blocked-path restrictions (`.github/` is always blocked) | ||
| - Detects and rejects sensitive files (credentials, keys, `.env`) | ||
| - Runs an AI-powered security review on all changes before committing | ||
| - Pushes changes to a fork and creates a PR on the upstream repository | ||
| - Tracks Claude token usage | ||
|
|
||
| <a id="token-permissions"></a> | ||
| **Token permissions:** | ||
|
|
||
| | Token | Fine-grained | Classic | | ||
| | ------------------ | ------------------------------------------- | ------- | | ||
| | `fork_push_token` | `contents: write` on the fork repository | `repo` | | ||
| | `pr_create_token` | `pull_requests: write` on the target repository | `repo` | | ||
|
|
||
| Applying labels (`pr_labels`) requires `issues: write` on the target repo | ||
| (already covered by `repo` for classic tokens). If the token lacks this | ||
| permission, the action logs a warning and creates the PR without labels. | ||
|
|
||
| For organizations using SAML/SSO, the PAT must be authorized for the | ||
| organization that owns the target repository. See | ||
| [GitHub docs on SSO authorization](https://docs.github.com/en/enterprise-cloud@latest/authentication/authenticating-with-saml-single-sign-on/authorizing-a-personal-access-token-for-use-with-saml-single-sign-on). | ||
|
|
||
| ### get-workflow-ref | ||
|
|
||
| Resolves the git ref that a caller used to invoke a reusable workflow by parsing | ||
|
|
||
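For reviewers of the README hunk above: the `blocked_paths` input is documented as a comma-separated list of case-sensitive path prefixes, with `.github/` always blocked. The following is a minimal sketch of what such a check could look like, not the code in this PR. The names `parseBlockedPaths` and `isBlocked` are hypothetical, and the sketch ignores symlink resolution, which the commit list suggests the real implementation handles separately.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// alwaysBlocked mirrors the README statement that .github/ is always blocked.
const alwaysBlocked = ".github/"

// parseBlockedPaths (hypothetical helper) splits the comma-separated input,
// trims whitespace, drops empty entries, and always includes .github/.
func parseBlockedPaths(input string) []string {
	prefixes := []string{alwaysBlocked}
	for _, p := range strings.Split(input, ",") {
		p = strings.TrimSpace(p)
		if p != "" && p != alwaysBlocked {
			prefixes = append(prefixes, p)
		}
	}
	return prefixes
}

// isBlocked (hypothetical helper) reports whether a repo-relative file path
// starts with any blocked prefix. The comparison is case-sensitive, and the
// path is cleaned first so "./.github/x" and "docs/../.github/x" still match.
func isBlocked(file string, prefixes []string) bool {
	clean := path.Clean(file)
	for _, p := range prefixes {
		if strings.HasPrefix(clean, p) {
			return true
		}
	}
	return false
}

func main() {
	prefixes := parseBlockedPaths("secrets/, deploy/")
	files := []string{
		".github/workflows/ci.yml",
		"secrets/key.pem",
		"Secrets/key.pem", // different case, so not matched
		"cmd/autosolve/main.go",
	}
	for _, f := range files {
		fmt.Printf("%-28s blocked=%v\n", f, isBlocked(f, prefixes))
	}
}
```

Because this is a plain prefix match, a prefix written without a trailing slash (for example `secrets`) would also match `secrets-old/`. Whether the action behaves that way is not stated in the README.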
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,11 @@ | ||
| .PHONY: build test clean | ||
|
|
||
| # Local dev binary | ||
| build: | ||
| go build -o autosolve ./cmd/autosolve | ||
|
|
||
| test: | ||
| go test ./... -count=1 | ||
|
|
||
| clean: | ||
| rm -f autosolve |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,127 @@ | ||
| name: Autosolve Assess | ||
| description: Run Claude in read-only mode to assess whether a task is suitable for automated resolution. | ||
|
|
||
| inputs: | ||
| claude_cli_version: | ||
| description: "Claude CLI version to install (e.g. '2.1.79' or 'latest')." | ||
| required: false | ||
| default: "2.1.79" | ||
| system_prompt: | ||
| description: > | ||
| Trusted instructions for Claude describing the task to assess. | ||
| Do not embed untrusted user input (e.g., issue titles/bodies) here. | ||
| Pass user-supplied data via environment variables and list them in context_vars. | ||
| required: false | ||
| default: "" | ||
| skill: | ||
| description: Path to a skill/prompt file relative to GITHUB_WORKSPACE. | ||
| required: false | ||
| default: "" | ||
| context_vars: | ||
| description: > | ||
| Comma-separated list of environment variable names to pass through to Claude. | ||
| Use this to provide untrusted user input (e.g., issue titles/bodies) safely. | ||
| Claude is automatically told which variables are available and instructed to | ||
| read them — you do not need to reference them in system_prompt. | ||
| Claude will only have access to these variables plus a baseline set of | ||
| system and authentication variables (PATH, HOME, etc.). | ||
| required: false | ||
| default: "" | ||
| assessment_criteria: | ||
| description: Custom criteria for the assessment. If not provided, uses default criteria. | ||
| required: false | ||
| default: "" | ||
| model: | ||
| description: Claude model ID. | ||
| required: false | ||
| default: "claude-opus-4-6" | ||
| blocked_paths: | ||
| description: > | ||
| Comma-separated path prefixes that cannot be modified. | ||
| .github/ is always blocked and cannot be removed. | ||
| required: false | ||
| default: "" | ||
| log_level: | ||
| description: > | ||
| Controls how much Claude output streams to the step log. | ||
| "error" (default) logs only errors and final status (token counts, result). | ||
| "info" adds the result summary (turns, duration, cost) and warns on | ||
| permission denials. | ||
| "debug" streams everything including all tool calls, assistant text, | ||
| and tool I/O. | ||
| Info and debug may contain source code snippets or environment | ||
| variable values. Security review output is never logged regardless | ||
| of this setting. | ||
| required: false | ||
| default: "error" | ||
| working_directory: | ||
| description: Directory to run in (relative to workspace root). Defaults to workspace root. | ||
| required: false | ||
| default: "." | ||
|
|
||
| outputs: | ||
| assessment: | ||
| description: PROCEED or SKIP | ||
| value: ${{ steps.assess.outputs.assessment }} | ||
| summary: | ||
| description: Human-readable assessment reasoning. | ||
| value: ${{ steps.assess.outputs.summary }} | ||
| result: | ||
| description: Full Claude result text. | ||
| value: ${{ steps.assess.outputs.result }} | ||
|
|
||
| runs: | ||
| using: "composite" | ||
| steps: | ||
| - name: Set up Claude CLI | ||
| shell: bash | ||
| run: | | ||
| if command -v roachdev >/dev/null; then | ||
| printf '#!/bin/sh\nexec roachdev claude -- "$@"\n' > /usr/local/bin/claude | ||
| chmod +x /usr/local/bin/claude | ||
| echo "Claude CLI: using roachdev wrapper ($(roachdev version))" | ||
| else | ||
| curl --fail --silent --show-error --location https://claude.ai/install.sh | bash -s -- "$CLAUDE_CLI_VERSION" | ||
| echo "Claude CLI installed: $(claude --version)" | ||
| fi | ||
| env: | ||
| CLAUDE_CLI_VERSION: ${{ inputs.claude_cli_version }} | ||
|
|
||
| - name: Check for existing build | ||
| id: check-build | ||
| shell: bash | ||
| run: | | ||
| if [ -x "$RUNNER_TEMP/autosolve" ]; then | ||
| echo "skip_build=true" >> "$GITHUB_OUTPUT" | ||
| echo "autosolve binary already available, skipping Go setup and build" | ||
| elif command -v go >/dev/null; then | ||
| echo "skip_go=true" >> "$GITHUB_OUTPUT" | ||
| echo "Go already available ($(go version)), skipping setup-go" | ||
| fi | ||
|
|
||
| - name: Set up Go | ||
| if: steps.check-build.outputs.skip_build != 'true' && steps.check-build.outputs.skip_go != 'true' | ||
| uses: actions/setup-go@v6 | ||
| with: | ||
| go-version-file: ${{ github.action_path }}/../go.mod | ||
| cache: false | ||
|
|
||
| - name: Build autosolve | ||
| if: steps.check-build.outputs.skip_build != 'true' | ||
| shell: bash | ||
| run: go build -trimpath -o "$RUNNER_TEMP/autosolve" ./cmd/autosolve | ||
| working-directory: ${{ github.action_path }}/.. | ||
|
fantapop marked this conversation as resolved.
|
||
|
|
||
| - name: Run assessment | ||
| id: assess | ||
| shell: bash | ||
| working-directory: ${{ inputs.working_directory }} | ||
| run: $RUNNER_TEMP/autosolve assess | ||
| env: | ||
| INPUT_SYSTEM_PROMPT: ${{ inputs.system_prompt }} | ||
| INPUT_SKILL: ${{ inputs.skill }} | ||
| INPUT_CONTEXT_VARS: ${{ inputs.context_vars }} | ||
| INPUT_ASSESSMENT_CRITERIA: ${{ inputs.assessment_criteria }} | ||
| INPUT_MODEL: ${{ inputs.model }} | ||
| INPUT_BLOCKED_PATHS: ${{ inputs.blocked_paths }} | ||
| INPUT_LOG_LEVEL: ${{ inputs.log_level }} | ||
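The `context_vars` description in the action.yml above says Claude only sees the listed variables plus a baseline set (PATH, HOME, etc.). A rough sketch of that kind of allowlist filtering follows. It is an illustration under assumptions, not the PR's implementation: `filterEnv` and `baselineVars` are hypothetical names, and the baseline here contains only the two variables the description names explicitly.

```go
package main

import (
	"fmt"
	"strings"
)

// baselineVars is a hypothetical baseline. The action.yml names only PATH and
// HOME explicitly; the real set is not shown in this diff.
var baselineVars = []string{"PATH", "HOME"}

// filterEnv (hypothetical helper) keeps only the KEY=VALUE entries whose key
// is in the baseline or in the comma-separated contextVars list.
func filterEnv(environ []string, contextVars string) []string {
	allowed := make(map[string]bool)
	for _, k := range baselineVars {
		allowed[k] = true
	}
	for _, k := range strings.Split(contextVars, ",") {
		if k = strings.TrimSpace(k); k != "" {
			allowed[k] = true
		}
	}

	var out []string
	for _, kv := range environ {
		key, _, ok := strings.Cut(kv, "=")
		if ok && allowed[key] {
			out = append(out, kv)
		}
	}
	return out
}

func main() {
	// A fixed environment keeps the example deterministic; a real caller
	// would likely start from os.Environ().
	environ := []string{
		"PATH=/usr/bin",
		"HOME=/home/runner",
		"SECRET_TOKEN=do-not-leak",
		"ISSUE_TITLE=Fix flaky test",
		"ISSUE_BODY=The test fails intermittently.",
	}
	for _, kv := range filterEnv(environ, "ISSUE_TITLE,ISSUE_BODY") {
		fmt.Println(kv)
	}
}
```

The filtered slice could be assigned to the `Env` field of an `exec.Cmd` so the child process never sees anything outside the allowlist.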
I think it would be helpful to somehow indicate which fields are required.
Will mark required fields in the input tables. For both actions, `system_prompt` or `skill` (at least one) and `model` are required. 🤖
Okay, now instead of a default, each required field says either **required** or **one required**, with an additional explanation.