[CNSL-1930] Add automated OpenAPI spec sync workflow#108

Open
linhcrl wants to merge 1 commit into cockroachdb:main from linhcrl:ccapi-update

Conversation

@linhcrl linhcrl commented Apr 25, 2026

Example PR opened by workflow

Note: in the case of the "merged" event, there's an additional trailer in the commit message


This workflow automates synchronization of OpenAPI specs from the managed-service repository. When managed-service dispatches openapi-spec-changed or openapi-spec-merged events:

  • Fetches the latest OpenAPI spec from the managed-service PR branch
  • Regenerates SDK client code using make generate-openapi-client
  • Invokes Claude AI (via Vertex AI) to analyze changes and update CHANGELOG.md with appropriate entries
  • Creates or updates PR from bot fork to pending-deploy branch

The workflow includes scripts for finding corresponding SDK PRs and invoking Claude for changelog generation. The Claude prompt instructs it to analyze git diffs, update CHANGELOG.md following Keep a Changelog conventions, identify breaking changes, and output structured metadata for commit messages and PR descriptions.

Supports manual testing via workflow_dispatch trigger.

@linhcrl linhcrl force-pushed the ccapi-update branch 3 times, most recently from b332d74 to a8dce22 Compare April 25, 2026 01:39
@linhcrl linhcrl changed the title Add automated OpenAPI spec sync workflow [CNSL-1930] Add automated OpenAPI spec sync workflow Apr 25, 2026
@linhcrl linhcrl requested a review from fantapop April 28, 2026 17:12
@fantapop fantapop left a comment

I left some feedback items here, but as I was getting to the bottom it felt like a lot of logic to put in a local workflow file. Since this is all happening in this repo and we need to check out the repo anyway, could this all live in a single build script? For example, take a look at the release script in the ccloud CLI: https://github.com/cockroachlabs/ccloud-private/blob/main/build/release.sh

Comment thread .github/workflows/openapi-sync.yml Outdated
@linhcrl linhcrl force-pushed the ccapi-update branch 2 times, most recently from 7c4c776 to e89296d Compare May 1, 2026 02:24
@linhcrl linhcrl requested a review from fantapop May 1, 2026 02:26
@linhcrl linhcrl force-pushed the ccapi-update branch 4 times, most recently from 6fc7166 to ee7abbc Compare May 5, 2026 00:28
Comment thread .github/workflows/openapi-sync.yml Outdated

- name: Run OpenAPI sync workflow
  run: |
    chmod +x scripts/openapi-sync.sh

Contributor

I think we should just add +x to the actual file in the repo so it's checked out with the execute bit set.
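
One way to record the execute bit in the repository itself, sketched on the assumption that the script lives at scripts/openapi-sync.sh as in the snippet above. `mark_executable` is a hypothetical wrapper name used only for illustration; `git update-index --chmod=+x` is the underlying Git command, and it sets the mode in the index even on filesystems that do not track permissions:

```shell
# mark_executable is a hypothetical helper name used for illustration.
# It stages the file and flips its mode to 100755 in the index, so every
# checkout gets the execute bit without a chmod step in the workflow.
mark_executable() {
  local file="$1"
  git add -- "$file"
  git update-index --chmod=+x -- "$file"
}

# Usage (from the repository root):
#   mark_executable scripts/openapi-sync.sh
#   git commit --message "Make openapi-sync.sh executable"
```

With the mode committed, the `chmod +x` line in the workflow step would no longer be needed.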

Comment thread scripts/openapi-sync.sh Outdated
# MANAGED_SERVICE_TOKEN: GitHub token with managed-service read access
# FORK_PUSH_TOKEN: GitHub token with fork push access
# FORK_OWNER: GitHub username that owns the fork
# GITHUB_TOKEN: GitHub token for creating PRs

Contributor

Naming seems inconsistent here. How about CREATE_PR_TOKEN instead of GITHUB_TOKEN?

Contributor Author

renamed

Comment thread scripts/openapi-sync.sh
fi

export GH_TOKEN="$MANAGED_SERVICE_TOKEN"
gh api "repos/$MS_OWNER/$MS_REPO/contents/$SPEC_PATH_IN_MS?ref=$REF" \

Contributor

We should probably add a check at the top of the file to ensure gh is in the PATH. I know the actions will have this by default but in case someone tries to run it locally I think it would be good to have it.

Contributor Author

updated to check using a helper I added called check_required_commands
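
For illustration, a minimal sketch of what such a helper might look like. Only the name `check_required_commands` comes from the reply above; the body and the error format are assumptions, not the code in this PR:

```shell
# Sketch of a check_required_commands-style helper (body is an assumption).
# Reports every missing command on stderr and returns non-zero if any is absent.
check_required_commands() {
  local missing=0
  local cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "::error::required command not found in PATH: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage near the top of the script:
#   check_required_commands gh git make || exit 1
```

Checking all commands before returning, rather than stopping at the first miss, lets someone running the script locally see everything they need to install in one pass.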

Comment thread scripts/openapi-sync.sh
Comment on lines +150 to +151
sudo make generate-openapi-client
sudo chown --recursive "$(id --user):$(id --group)" internal/openapi-generator/ pkg/client/ docs/ README.md

Contributor

I'm surprised we need sudo here. Is there a way to get around that?

Contributor Author

The sudo is needed because when GitHub Actions runs this script, it operates as a non-root user. The docker compose commands in the Makefile run containers as root by default, which creates the generated files (in pkg/client/, docs/, etc.) owned by root. The GitHub Actions user then cannot access or commit these files. We use sudo to run the make command, and then sudo chown to change ownership back to the GitHub Actions user so it can stage and commit the changes.

An alternative approach would be to configure the Docker Compose services to run as the current user by setting the user field in docker-compose.yml for both the jq and openapi-generator services. The script would then need to export UID and GID environment variables before calling make generate-openapi-client, and we could remove both sudo calls.
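
The script side of that alternative might look roughly like this. `HOST_UID` and `HOST_GID` are hypothetical variable names chosen here, because bash already defines a read-only `UID` and does not set `GID` at all. The docker-compose.yml change is assumed rather than taken from this PR, and whether the generator images run correctly as a non-root user is untested:

```shell
# Export the invoking user's numeric IDs so Compose can interpolate them.
# Assumes docker-compose.yml sets, for the jq and openapi-generator services:
#   user: "${HOST_UID}:${HOST_GID}"
export_host_ids() {
  HOST_UID="$(id -u)"
  HOST_GID="$(id -g)"
  export HOST_UID HOST_GID
}

# Usage, replacing the two sudo calls:
#   export_host_ids
#   make generate-openapi-client
```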

Contributor

okay got it. Let's not worry about it. Unless you've already worked on it and have it figured out..

Contributor Author

I kept sudo for now

Comment thread scripts/openapi-sync.sh
Comment on lines +192 to +202
if [[ "$EVENT_TYPE" == "openapi-spec-merged" ]]; then
  log_info "No changes detected, but adding SHA trailer to existing commit"

  CURRENT_COMMIT_MSG=$(git log -1 --format=%B)

  if echo "$CURRENT_COMMIT_MSG" | grep --quiet "Managed-service-commit-SHA: $MANAGED_SERVICE_SHA"; then
    log_info "SHA trailer already exists in commit"
  else
    NEW_COMMIT_MSG=$(printf "%s\nManaged-service-commit-SHA: %s" "$CURRENT_COMMIT_MSG" "$MANAGED_SERVICE_SHA")
    git commit --amend --message "$NEW_COMMIT_MSG"
  fi

Contributor

can you tell me more about how this happens in the workflow? I can't quite tell what's going on here. If a commit has already landed in main we shouldn't try to amend it.

Contributor Author

I added a comment above the code block to explain what's happening here. Let me know if that helps
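
The amend logic in the snippet can be read as an idempotent "append the trailer if it is missing" step. A minimal sketch of that step as a pure function on the message text follows. `append_sha_trailer` is a hypothetical name, and unlike the snippet it matches the trailer as a literal whole line (`grep -F -x`) rather than as a regular-expression substring:

```shell
# append_sha_trailer is a hypothetical helper, not a function from this PR.
# It prints the message unchanged if the trailer line is already present,
# otherwise it prints the message with the trailer appended.
append_sha_trailer() {
  local message="$1"
  local sha="$2"
  local trailer="Managed-service-commit-SHA: $sha"
  # -F: literal match, -x: whole line, -q: no output
  if printf '%s\n' "$message" | grep -qFx -- "$trailer"; then
    printf '%s\n' "$message"
  else
    printf '%s\n%s\n' "$message" "$trailer"
  fi
}

# Usage, assuming HEAD is the not-yet-merged commit on the fork branch:
#   git commit --amend --message "$(append_sha_trailer "$(git log -1 --format=%B)" "$MANAGED_SERVICE_SHA")"
```

Running the function twice with the same SHA yields the same message, which matches the "already exists" branch in the snippet.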

Comment thread scripts/lib/invoke-claude-changelog.sh Outdated
MANAGED_SERVICE_SHA="${2:-}"

if [[ -z "${MANAGED_SERVICE_PR_URL:-}" ]]; then
  echo "::error::managed_service_pr_url is required" >&2

Contributor

can these use the logging helpers?

Contributor Author

updated

Comment thread scripts/lib/logging.sh Outdated
# Output an informational message to stderr
log_info() {
  local message="$1"
  echo "$message" >&2

Contributor

I think I may have asked this in another PR but it seems like info and notice level logs should go to stdout if possible.

Contributor Author

now we are directing info and notice to stdout
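
A sketch of how the helpers might be split after this change. The function bodies are assumptions: the `::notice::` and `::error::` prefixes are GitHub Actions workflow commands, but whether scripts/lib/logging.sh uses exactly these forms is not shown in this thread:

```shell
# Sketch of logging helpers (bodies are assumptions, not the PR's code).
# Informational and notice messages go to stdout; errors go to stderr.
log_info() {
  echo "$1"
}

log_notice() {
  echo "::notice::$1"
}

log_error() {
  echo "::error::$1" >&2
}
```

Keeping errors on stderr means a caller can capture a function's stdout with `$(...)` without error text leaking into the captured value.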

local managed_service_pr_url="$1"

if [[ -z "${managed_service_pr_url:-}" ]]; then
  log_error "managed_service_pr_url is required"

Contributor

Do we need to include logging.sh? Or is it assumed to be included by the invoking script already? If it's assumed, we should probably check that it's available when this script is invoked and fail with a message in case someone tries to use it differently.

Contributor Author

I decided to source it in each file
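
Sourcing the library in each file can be combined with the "fail with a message" check from the question. A minimal sketch, where `source_lib` is a hypothetical helper name and the path layout under scripts/lib/ is assumed from the file names in this PR:

```shell
# source_lib is a hypothetical helper: source a library file, or fail with
# a clear message if it is missing or unreadable.
source_lib() {
  local lib_path="$1"
  if [ ! -r "$lib_path" ]; then
    echo "::error::required library not found: $lib_path" >&2
    return 1
  fi
  # shellcheck source=/dev/null
  . "$lib_path"
}

# Usage in a bash script under scripts/lib/:
#   SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
#   source_lib "$SCRIPT_DIR/logging.sh" || exit 1
```

Resolving the path from the script's own location, rather than the caller's working directory, keeps the script usable when it is invoked from somewhere other than the repository root.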

Comment thread scripts/lib/git-askpass.sh Outdated
set -e

if [ -z "$GIT_FORK_USER" ] || [ -z "$GIT_FORK_PASSWORD" ]; then
  echo "::error::GIT_FORK_USER and GIT_FORK_PASSWORD must be set" >&2

Contributor

should this use the logging helpers?

Contributor Author

updated

Comment thread scripts/lib/invoke-claude-changelog.sh Outdated
Comment on lines +41 to +43
# Create prompt from template with substitutions
PROMPT_FILE=$(mktemp)
sed "s|{{MANAGED_SERVICE_PR_URL}}|$MANAGED_SERVICE_PR_URL|g" "$PROMPT_TEMPLATE" > "$PROMPT_FILE"

Contributor

Nit: templatizing with this URL doesn't seem too dangerous, but in general it's a pattern that can lead to prompt injection. Another way you can do this is to make sure the URL is in the env that Claude is called with and tell it what the env var is in the prompt. For example:

  1. Look up the managed service url from the environment using printenv MANAGED_SERVICE_PR_URL
  2. ensure.. the trailer exists and contains the text in this exact format but with the value of the env var: managed-service-url: <MANAGED_SERVICE_PR_URL>
  3. etc...
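
The environment-variable approach in the steps above can be sketched as follows. `run_with_pr_url` is a hypothetical wrapper name, and `printenv` stands in for the real Claude invocation, whose command line is not shown in this thread. One side benefit is that the value never passes through `sed`, where a `|` or `&` in the URL would change the substitution:

```shell
# run_with_pr_url is a hypothetical wrapper: it runs a command with
# MANAGED_SERVICE_PR_URL set only in that command's environment, so the
# prompt file can stay a static template with no text substitution.
run_with_pr_url() {
  local pr_url="$1"
  shift
  env "MANAGED_SERVICE_PR_URL=$pr_url" "$@"
}

# Usage; printenv is a stand-in for the command that invokes Claude:
#   run_with_pr_url "https://example.com/pull/1" printenv MANAGED_SERVICE_PR_URL
```

The prompt then refers to the variable by name, and the model reads the value with `printenv MANAGED_SERVICE_PR_URL` instead of receiving it as part of the instructions.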

Contributor Author

updated

@linhcrl linhcrl force-pushed the ccapi-update branch 2 times, most recently from 209781d to d4b826d Compare May 7, 2026 00:43
This workflow automates synchronization of OpenAPI specs from the
managed-service repository. When triggered via workflow_dispatch with
`openapi-spec-changed` or `openapi-spec-merged` event types:

- Fetches the latest OpenAPI spec from the managed-service PR branch
  (for changed events) or from the specific commit SHA (for merged events)
- Regenerates SDK client code using make generate-openapi-client
- Invokes Claude AI (via Vertex AI) to analyze changes and update
  CHANGELOG.md with appropriate entries
- Creates or updates PR from bot fork to pending-deploy branch

The workflow includes scripts for finding corresponding SDK PRs and
invoking Claude for changelog generation. The Claude prompt instructs
it to analyze git diffs, update CHANGELOG.md following Keep a
Changelog conventions, identify breaking changes, and output
structured metadata for commit messages and PR descriptions.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
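
The first bullet of the commit message describes a choice of Git ref by event type. A minimal sketch of that selection, where `resolve_spec_ref` and its argument order are hypothetical and only the two event names come from the text above:

```shell
# resolve_spec_ref is a hypothetical helper: pick the ref to fetch the spec
# from, based on the event type named in the commit message.
#   openapi-spec-changed -> the managed-service PR branch
#   openapi-spec-merged  -> the specific commit SHA
resolve_spec_ref() {
  local event_type="$1"
  local pr_branch="$2"
  local commit_sha="$3"
  case "$event_type" in
    openapi-spec-changed) printf '%s\n' "$pr_branch" ;;
    openapi-spec-merged)  printf '%s\n' "$commit_sha" ;;
    *)
      echo "::error::unknown event type: $event_type" >&2
      return 1
      ;;
  esac
}

# Usage:
#   REF="$(resolve_spec_ref "$EVENT_TYPE" "$PR_BRANCH" "$MANAGED_SERVICE_SHA")" || exit 1
```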
@linhcrl linhcrl requested a review from fantapop May 7, 2026 03:57