47 commits
e323423
Add autosolve actions for automated issue resolution
fantapop Mar 27, 2026
78d1dcc
Address PR review feedback (batch 1)
fantapop Apr 3, 2026
b02a463
Replace credential helper with GIT_ASKPASS for fork authentication
fantapop Apr 3, 2026
11b3ba2
Fix marker extraction to use last occurrence in output
fantapop Apr 3, 2026
3fe1297
Fail closed on symlink resolution errors in security check
fantapop Apr 3, 2026
a5502d2
Reset staged changes on all security review failures
fantapop Apr 3, 2026
31b7e13
Add unit tests for pure helpers and require PR body file
fantapop Apr 3, 2026
599c7a2
Validate boolean inputs with case-insensitive parsing
fantapop Apr 3, 2026
c91aab2
Simplify Claude Runner interface and remove dead code
fantapop Apr 3, 2026
45e2c78
Propagate errors instead of swallowing them
fantapop Apr 3, 2026
226b573
Mitigate prompt injection in AI security review
fantapop Apr 6, 2026
6406103
Remove additional_instructions input
fantapop Apr 6, 2026
96bc22e
Harden prompt injection defenses with context_vars and env filtering
fantapop Apr 7, 2026
01b51e1
Always block .github/ in blocked paths
fantapop Apr 7, 2026
0fa1d12
Fix action.yml issues found during integration testing
fantapop Apr 7, 2026
09ccb21
Prevent sensitive data in Claude output logs
fantapop Apr 8, 2026
7d542de
autosolve: Pretty-print JSON in collapsible log output
fantapop Apr 8, 2026
34ec73c
autosolve: Allow assess to read context_vars via printenv
fantapop Apr 8, 2026
4bc5058
autosolve: Default pr_base_branch to main and remove SymbolicRef
fantapop Apr 9, 2026
de8989b
autosolve: Disable Go module caching in setup-go
fantapop Apr 9, 2026
555a3c5
autosolve: Exit non-zero when implementation fails
fantapop Apr 9, 2026
750fc10
autosolve: Add pr_target_repo input for cross-repo PR creation
fantapop May 5, 2026
de8f400
autosolve: Document assess and implement actions in README
fantapop May 5, 2026
0b3c1f5
autosolve: Skip setup-go when Go is already available
fantapop May 5, 2026
b8b523b
autosolve: Remove dead code (LoadSecurityConfig, git.Client.Log)
fantapop May 6, 2026
9adf4f9
autosolve: Add log_level input with streaming output
fantapop May 6, 2026
51c0306
autosolve: Fail fast when target branch already exists on fork
fantapop May 7, 2026
8d5c71c
autosolve: Restrict security review to read-only Bash tools
fantapop May 7, 2026
b793a01
autosolve: Update Go version to 1.26
fantapop May 8, 2026
69a15e3
autosolve: Handle nil results and surface parse errors
fantapop May 8, 2026
cf3a541
autosolve: Detect symlinks that escape the repo root
fantapop May 8, 2026
2f46718
autosolve: Remove create_pr option, always create PRs
fantapop May 8, 2026
86abffc
autosolve: Improve test coverage and documentation
fantapop May 8, 2026
4084a28
autosolve: Log roachdev version when using wrapper
fantapop May 8, 2026
5a2aa7a
autosolve: Document required inputs and fix skill path description
fantapop May 8, 2026
c863d36
autosolve: Make label creation best-effort and document token permiss…
fantapop May 8, 2026
d695b1b
autosolve: Resolve skill path relative to GITHUB_WORKSPACE
fantapop May 8, 2026
7f6b59e
autosolve: Use descriptive names for Claude output files
fantapop May 8, 2026
e1d804a
autosolve: Add ErrEmptyResult sentinel and improve test assertions
fantapop May 8, 2026
d087c35
autosolve: Add security review tests and fix test artifact cleanup
fantapop May 8, 2026
a42f94b
autosolve: Refactor pushAndPR into focused functions
fantapop May 8, 2026
ca4d75e
autosolve: Update tests to use .github/ blocked path
fantapop May 8, 2026
8174636
autosolve: Document full allowed_tools default in README
fantapop May 8, 2026
00899a8
autosolve: Extract default assessment criteria to a linkable file
fantapop May 8, 2026
9d8779a
autosolve: Remove pr_title and pr_body_template inputs
fantapop May 9, 2026
fc2ca19
autosolve: Document classic PAT scopes and simplify label guidance
fantapop May 9, 2026
efcefd5
autosolve: Fix unchecked errors and use slices.Contains
fantapop May 9, 2026
8 changes: 7 additions & 1 deletion .github/workflows/test.yml
@@ -8,4 +8,10 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v6
-      - run: ./test.sh
+      - uses: actions/setup-go@v6
+        with:
+          go-version-file: autosolve/go.mod
+      - name: Run shell tests
+        run: ./test.sh
+      - name: Run Go tests
+        run: cd autosolve && go test ./... -count=1
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,10 @@ Breaking changes are prefixed with "Breaking Change: ".

### Added

- `autosolve/assess` and `autosolve/implement` actions: evaluate tasks for
automated resolution and autonomously implement solutions using Claude.
Includes AI security review, token usage tracking, structured log levels
(error/info/debug), and fast-fail when the target branch already exists.
- `create-release-pr` reusable workflow: automates version bump PRs by checking for
unreleased changes in CHANGELOG, extracting the next version, updating
the CHANGELOG with new version and release date, optionally running custom update
150 changes: 150 additions & 0 deletions README.md
Contributor

I think it would be helpful to somehow indicate which fields are required

Contributor Author

Will mark required fields in the input tables. For both actions, `system_prompt` or `skill` (at least one) and `model` are required. 🤖

Contributor Author

Okay, the Default column for required fields now says either **required** or **one required**, with an additional explanation, instead of showing a default value.

@@ -108,6 +108,156 @@ determine whether a major, minor, or patch version bump is needed.
- Returns empty `bump_type` when there are no unreleased changes
- Follows semantic versioning principles

### autosolve/assess

Runs Claude in read-only mode to assess whether a task is suitable for automated
resolution. Claude evaluates the task against configurable criteria and returns a
PROCEED or SKIP decision with reasoning.

**Usage:**

```yaml
- uses: cockroachdb/actions/autosolve/assess@v0
  with:
    system_prompt: "Assess whether this issue can be resolved automatically."
    context_vars: "ISSUE_TITLE,ISSUE_BODY"
  env:
    ISSUE_TITLE: ${{ github.event.issue.title }}
    ISSUE_BODY: ${{ github.event.issue.body }}
```

**Inputs:**

| Name | Default | Description |
| --------------------- | -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
| `claude_cli_version` | `2.1.79` | Claude CLI version to install (e.g. `2.1.79` or `latest`) |
| `system_prompt` | **one required** | Trusted instructions for Claude describing the task to assess. Do not embed untrusted user input here — use `context_vars` instead. At least one of `system_prompt` or `skill` is required. |
| `skill` | **one required** | Path to a skill/prompt file relative to `GITHUB_WORKSPACE`. At least one of `system_prompt` or `skill` is required. |
| `context_vars` | `""` | Comma-separated list of environment variable names to pass through to Claude for untrusted user input (e.g., issue titles/bodies) |
| `assessment_criteria` | [see default](autosolve/internal/prompt/templates/default-assessment-criteria.md) | Trusted criteria for the assessment. Do not embed untrusted user input. |
| `model` | `claude-opus-4-6` | Claude model ID |
| `blocked_paths` | `""` | Comma-separated path prefixes that cannot be modified (case-sensitive). `.github/` is always blocked. |
| `log_level` | `error` | Controls Claude output in the step log: `error` (status only), `info` (result summary, permission denial warnings), `debug` (stream everything). |
| `working_directory` | `.` | Directory to run in (relative to workspace root) |

**Outputs:**

| Name | Description |
| ------------ | ---------------------------------- |
| `assessment` | `PROCEED` or `SKIP` |
| `summary` | Human-readable assessment reasoning |
| `result` | Full Claude result text |
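
As a rough illustration of how an `assessment` value could be derived from the `result` text, the Go sketch below takes the last occurrence of a marker in Claude's output, so that earlier mentions of the marker (for example, text quoted from the task) do not decide the outcome. The marker string `ASSESSMENT_RESULT:` and the function name `parseAssessment` are assumptions made for this example; the README does not document the actual output format.

```go
package main

import (
	"fmt"
	"strings"
)

// marker is a HYPOTHETICAL prefix chosen for this sketch; the real
// output format used by the action is not documented in the README.
const marker = "ASSESSMENT_RESULT:"

// parseAssessment returns "PROCEED" or "SKIP" based on the text that
// follows the LAST occurrence of marker in output.
func parseAssessment(output string) (string, error) {
	idx := strings.LastIndex(output, marker)
	if idx < 0 {
		return "", fmt.Errorf("no %q marker in output", marker)
	}
	// Take the remainder of the line that follows the marker.
	line, _, _ := strings.Cut(output[idx+len(marker):], "\n")
	decision := strings.ToUpper(strings.TrimSpace(line))
	switch decision {
	case "PROCEED", "SKIP":
		return decision, nil
	default:
		return "", fmt.Errorf("unrecognized assessment %q", decision)
	}
}

func main() {
	out := "The issue text mentions ASSESSMENT_RESULT: SKIP\n" +
		"ASSESSMENT_RESULT: PROCEED\n"
	decision, err := parseAssessment(out)
	fmt.Println(decision, err)
}
```

Anything other than the two documented values is treated as an error here rather than silently mapped to `SKIP`; whether the action does the same is not stated in this README.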

**Features:**

- Runs Claude in read-only mode (Read, Grep, Glob only) — no file modifications
- Passes untrusted user input to Claude through environment variables rather than interpolating it into the prompt, which reduces prompt-injection risk
- Supports custom assessment criteria or skill files
- Designed to gate the more expensive `autosolve/implement` step
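
The environment-variable passthrough described above can be sketched in Go roughly as follows. This is a minimal illustration, not the action's code: it assumes the filtering is a plain allowlist built from `context_vars` plus a baseline set, and the baseline shown here (`PATH`, `HOME`) is only the part the input description names. The helper names `parseContextVars` and `filterEnv` are invented for the example.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// baselineVars is an ILLUSTRATIVE baseline. The action documents
// "PATH, HOME, etc." but the full list is not shown in the README.
var baselineVars = []string{"PATH", "HOME"}

// parseContextVars splits a comma-separated list of variable names,
// trimming whitespace and dropping empty entries.
func parseContextVars(raw string) []string {
	var names []string
	for _, part := range strings.Split(raw, ",") {
		if name := strings.TrimSpace(part); name != "" {
			names = append(names, name)
		}
	}
	return names
}

// filterEnv keeps only the KEY=VALUE entries whose key is in the
// baseline set or in allowed, preserving the original order.
func filterEnv(environ, allowed []string) []string {
	keep := make(map[string]bool)
	for _, name := range baselineVars {
		keep[name] = true
	}
	for _, name := range allowed {
		keep[name] = true
	}
	var out []string
	for _, kv := range environ {
		if key, _, ok := strings.Cut(kv, "="); ok && keep[key] {
			out = append(out, kv)
		}
	}
	return out
}

func main() {
	allowed := parseContextVars(os.Getenv("INPUT_CONTEXT_VARS"))
	for _, kv := range filterEnv(os.Environ(), allowed) {
		key, _, _ := strings.Cut(kv, "=")
		fmt.Println(key) // print names only, never values
	}
}
```

The result of `filterEnv` could then be assigned to `exec.Cmd.Env` when launching a subprocess, so that variables not on the list are never visible to it.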

### autosolve/implement

Runs Claude to implement a solution, validates changes with a security review,
pushes to a fork, and creates a pull request. Includes retry logic, blocked-path
enforcement, sensitive file detection, and token usage tracking.

**Usage:**

```yaml
- uses: cockroachdb/actions/autosolve/implement@v0
  with:
    system_prompt: "Fix the issue described in the environment variables."
    context_vars: "ISSUE_TITLE,ISSUE_BODY"
    fork_owner: my-bot
    fork_repo: my-repo-fork
    fork_push_token: ${{ secrets.FORK_PAT }}
    pr_create_token: ${{ secrets.PR_PAT }}
  env:
    ISSUE_TITLE: ${{ github.event.issue.title }}
    ISSUE_BODY: ${{ github.event.issue.body }}
```

**Inputs:**

| Name | Default | Description |
| -------------------- | -------------------------------- | --------------------------------------------------------------------------------------------------------------- |
| `claude_cli_version` | `2.1.79` | Claude CLI version to install (e.g. `2.1.79` or `latest`) |
| `system_prompt` | **one required** | Trusted instructions for Claude describing the task. Do not embed untrusted user input — use `context_vars`. At least one of `system_prompt` or `skill` is required. |
| `skill` | **one required** | Path to a skill/prompt file relative to `GITHUB_WORKSPACE`. At least one of `system_prompt` or `skill` is required. |
| `context_vars` | `""` | Comma-separated list of environment variable names to pass through to Claude for untrusted user input |
| `allowed_tools` | [see below](#allowed_tools-default) | Claude `--allowedTools` string |
| `model` | `claude-opus-4-6` | Claude model ID |
| `max_retries` | `3` | Maximum implementation attempts |
| `pr_target_repo` | `${{ github.repository }}` | Repository where the PR is created (`owner/repo`). Set this when the PR should target a different repo than the one running the workflow. |
| `pr_base_branch` | `main` | Base branch for the PR |
| `pr_labels` | `autosolve` | Comma-separated labels to apply to the PR |
| `pr_draft` | `true` | Whether to create the PR as a draft |
| `fork_owner` | **required** | GitHub username or org that owns the fork |
| `fork_repo` | **required** | Repository name of the fork |
| `fork_push_token` | **required** | PAT with `contents: write` on the fork repository |
| `pr_create_token` | **required** | PAT with `pull_requests: write` on the target repo (see [Token permissions](#token-permissions)) |
| `blocked_paths` | `""` | Comma-separated path prefixes that cannot be modified (case-sensitive). `.github/` is always blocked. |
| `git_user_name` | `autosolve[bot]` | Git author/committer name |
| `git_user_email` | `autosolve[bot]@users.noreply.github.com` | Git author/committer email |
| `branch_prefix` | `autosolve/` | Prefix for the branch name |
| `branch_suffix` | `""` | Suffix for the branch name. When empty, a timestamp is used. |
| `commit_signature` | `Co-Authored-By: Claude <[email protected]>` | Signature line appended to commit messages |
| `pr_footer` | [see below](#pr_footer-default) | Footer appended to the PR body |
| `log_level` | `error` | Controls Claude output in the step log: `error` (status only), `info` (result summary, permission denial warnings), `debug` (stream everything). |
| `working_directory` | `.` | Directory to run in (relative to workspace root) |

<a id="allowed_tools-default"></a>
> Default `allowed_tools`:
> ```
> Read,Write,Edit,Grep,Glob,
> Bash(git add:*),Bash(git status:*),Bash(git diff:*),Bash(git log:*),Bash(git show:*),
> Bash(go build:*),Bash(go test:*),Bash(go vet:*),Bash(make:*)
> ```

<a id="pr_footer-default"></a>
> Default `pr_footer`:
> ```
> ---
>
> *This PR was auto-generated by [claude-autosolve-action](https://github.com/cockroachdb/actions) using Claude Code.*
> *Please review carefully before approving.*
> ```

**Outputs:**

| Name | Description |
| ------------- | -------------------------------------------- |
| `status` | `SUCCESS` or `FAILED` |
| `pr_url` | URL of the created PR |
| `summary` | Human-readable summary |
| `result` | Full Claude result text |
| `branch_name` | Name of the branch pushed to the fork |

**Features:**

- Retries implementation up to `max_retries` times on failure
- Enforces blocked-path restrictions (`.github/` is always blocked)
- Detects and rejects sensitive files (credentials, keys, `.env`)
- Runs an AI-powered security review on all changes before committing
- Pushes changes to a fork and creates a PR on the upstream repository
- Tracks Claude token usage
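
The blocked-path enforcement listed above amounts to a case-sensitive prefix check with `.github/` always included. The Go sketch below shows one way such a check could look; it is an assumption-laden illustration rather than the action's implementation, the names `blockedPrefixes` and `firstViolation` are invented here, and it does not cover the symlink handling mentioned in the commit list.

```go
package main

import (
	"fmt"
	"strings"
)

// alwaysBlocked mirrors the README: `.github/` is blocked regardless
// of what the `blocked_paths` input contains.
const alwaysBlocked = ".github/"

// blockedPrefixes parses the comma-separated input and prepends the
// always-blocked prefix.
func blockedPrefixes(raw string) []string {
	prefixes := []string{alwaysBlocked}
	for _, part := range strings.Split(raw, ",") {
		if p := strings.TrimSpace(part); p != "" {
			prefixes = append(prefixes, p)
		}
	}
	return prefixes
}

// firstViolation returns the first changed path that starts with a
// blocked prefix. The comparison is case-sensitive.
func firstViolation(changed, prefixes []string) (string, bool) {
	for _, path := range changed {
		for _, prefix := range prefixes {
			if strings.HasPrefix(path, prefix) {
				return path, true
			}
		}
	}
	return "", false
}

func main() {
	prefixes := blockedPrefixes("deploy/, secrets/")
	changed := []string{"pkg/server/main.go", ".github/workflows/ci.yml"}
	if path, blocked := firstViolation(changed, prefixes); blocked {
		fmt.Println("blocked path modified:", path)
	}
}
```

Paths here are assumed to be repository-relative with forward slashes, as `git diff --name-only` reports them.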

<a id="token-permissions"></a>
**Token permissions:**

| Token | Fine-grained | Classic |
| ------------------ | ------------------------------------------- | ------- |
| `fork_push_token` | `contents: write` on the fork repository | `repo` |
| `pr_create_token` | `pull_requests: write` on the target repository | `repo` |

Applying labels (`pr_labels`) requires `issues: write` on the target repo
(already covered by `repo` for classic tokens). If the token lacks this
permission, the action logs a warning and creates the PR without labels.
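
The best-effort label behavior described above could be structured along these lines. This Go sketch is illustrative only: `createPRWithLabels`, the two function parameters standing in for GitHub API calls, and `errForbidden` are all hypothetical names, and no real GitHub client is used.

```go
package main

import (
	"errors"
	"fmt"
)

// errForbidden is a HYPOTHETICAL stand-in for the error an API client
// might return when the token lacks `issues: write`.
var errForbidden = errors.New("403: resource not accessible by token")

// createPRWithLabels creates the PR first, then tries to apply labels.
// A failure to create the PR is returned to the caller; a failure to
// label is only logged as a warning.
func createPRWithLabels(
	createPR func() (string, error),
	addLabels func(prURL string, labels []string) error,
	labels []string,
) (string, error) {
	url, err := createPR()
	if err != nil {
		return "", fmt.Errorf("creating PR: %w", err)
	}
	if len(labels) == 0 {
		return url, nil
	}
	if err := addLabels(url, labels); err != nil {
		fmt.Printf("warning: could not apply labels %v: %v\n", labels, err)
	}
	return url, nil
}

func main() {
	createPR := func() (string, error) { return "https://example.invalid/pull/1", nil }
	addLabels := func(string, []string) error { return errForbidden }
	url, err := createPRWithLabels(createPR, addLabels, []string{"autosolve"})
	fmt.Println(url, err)
}
```

Passing the API calls in as functions keeps the ordering logic testable without network access.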

For organizations using SAML/SSO, the PAT must be authorized for the
organization that owns the target repository. See
[GitHub docs on SSO authorization](https://docs.github.com/en/enterprise-cloud@latest/authentication/authenticating-with-saml-single-sign-on/authorizing-a-personal-access-token-for-use-with-saml-single-sign-on).

### get-workflow-ref

Resolves the git ref that a caller used to invoke a reusable workflow by parsing
11 changes: 11 additions & 0 deletions autosolve/Makefile
@@ -0,0 +1,11 @@
.PHONY: build test clean

# Local dev binary
build:
	go build -o autosolve ./cmd/autosolve

test:
	go test ./... -count=1

clean:
	rm -f autosolve
127 changes: 127 additions & 0 deletions autosolve/assess/action.yml
@@ -0,0 +1,127 @@
name: Autosolve Assess
description: Run Claude in read-only mode to assess whether a task is suitable for automated resolution.

inputs:
  claude_cli_version:
    description: "Claude CLI version to install (e.g. '2.1.79' or 'latest')."
    required: false
    default: "2.1.79"
  system_prompt:
    description: >
      Trusted instructions for Claude describing the task to assess.
      Do not embed untrusted user input (e.g., issue titles/bodies) here.
      Pass user-supplied data via environment variables and list them in context_vars.
    required: false
    default: ""
  skill:
    description: Path to a skill/prompt file relative to GITHUB_WORKSPACE.
    required: false
    default: ""
  context_vars:
    description: >
      Comma-separated list of environment variable names to pass through to Claude.
      Use this to provide untrusted user input (e.g., issue titles/bodies) safely.
      Claude is automatically told which variables are available and instructed to
      read them — you do not need to reference them in system_prompt.
      Claude will only have access to these variables plus a baseline set of
      system and authentication variables (PATH, HOME, etc.).
    required: false
    default: ""
  assessment_criteria:
    description: Custom criteria for the assessment. If not provided, uses default criteria.
    required: false
    default: ""
  model:
    description: Claude model ID.
    required: false
    default: "claude-opus-4-6"
  blocked_paths:
    description: >
      Comma-separated path prefixes that cannot be modified.
      .github/ is always blocked and cannot be removed.
    required: false
    default: ""
  log_level:
    description: >
      Controls how much Claude output streams to the step log.
      "error" (default) logs only errors and final status (token counts, result).
      "info" adds the result summary (turns, duration, cost) and warns on
      permission denials.
      "debug" streams everything including all tool calls, assistant text,
      and tool I/O.
      Info and debug may contain source code snippets or environment
      variable values. Security review output is never logged regardless
      of this setting.
    required: false
    default: "error"
  working_directory:
    description: Directory to run in (relative to workspace root). Defaults to workspace root.
    required: false
    default: "."

outputs:
  assessment:
    description: PROCEED or SKIP
    value: ${{ steps.assess.outputs.assessment }}
  summary:
    description: Human-readable assessment reasoning.
    value: ${{ steps.assess.outputs.summary }}
  result:
    description: Full Claude result text.
    value: ${{ steps.assess.outputs.result }}

runs:
  using: "composite"
  steps:
    - name: Set up Claude CLI
      shell: bash
      run: |
        if command -v roachdev >/dev/null; then
          printf '#!/bin/sh\nexec roachdev claude -- "$@"\n' > /usr/local/bin/claude
          chmod +x /usr/local/bin/claude
          echo "Claude CLI: using roachdev wrapper ($(roachdev version))"
        else
          curl --fail --silent --show-error --location https://claude.ai/install.sh | bash -s -- "$CLAUDE_CLI_VERSION"
          echo "Claude CLI installed: $(claude --version)"
        fi
      env:
        CLAUDE_CLI_VERSION: ${{ inputs.claude_cli_version }}

    - name: Check for existing build
      id: check-build
      shell: bash
      run: |
        if [ -x "$RUNNER_TEMP/autosolve" ]; then
          echo "skip_build=true" >> "$GITHUB_OUTPUT"
          echo "autosolve binary already available, skipping Go setup and build"
        elif command -v go >/dev/null; then
          echo "skip_go=true" >> "$GITHUB_OUTPUT"
          echo "Go already available ($(go version)), skipping setup-go"
        fi

    - name: Set up Go
      if: steps.check-build.outputs.skip_build != 'true' && steps.check-build.outputs.skip_go != 'true'
      uses: actions/setup-go@v6
      with:
        go-version-file: ${{ github.action_path }}/../go.mod
        cache: false

    - name: Build autosolve
      if: steps.check-build.outputs.skip_build != 'true'
      shell: bash
      run: go build -trimpath -o "$RUNNER_TEMP/autosolve" ./cmd/autosolve
      working-directory: ${{ github.action_path }}/..

    - name: Run assessment
      id: assess
      shell: bash
      working-directory: ${{ inputs.working_directory }}
      run: $RUNNER_TEMP/autosolve assess
      env:
        INPUT_SYSTEM_PROMPT: ${{ inputs.system_prompt }}
        INPUT_SKILL: ${{ inputs.skill }}
        INPUT_CONTEXT_VARS: ${{ inputs.context_vars }}
        INPUT_ASSESSMENT_CRITERIA: ${{ inputs.assessment_criteria }}
        INPUT_MODEL: ${{ inputs.model }}
        INPUT_BLOCKED_PATHS: ${{ inputs.blocked_paths }}
        INPUT_LOG_LEVEL: ${{ inputs.log_level }}