CI: Prevent PR's that result in uncommitted changes#2884
Conversation
Problem statement: Parts of this repo that are tracked in git, but generated from build steps (e.g. Cargo.lock, .pb.go files, some of the files under docs/) may be changed but not committed as part of a PR. This results in situations where you build the main branch and uncommitted changes show up even though you didn't touch anything (leaving the next person to have to include those unrelated changes in their PR.) Fix this on a few fronts, by both checking for this scenario, and by eagerly running some code generation to surface these changes earlier: - Add a check-repo-clean.sh script that checks for uncommitted changes - Add a build step that runs code-gen in rest-api if the protobufs under crates/rpc/proto have changed (to make sure any deltas are included in the PR that caused them), then runs check-repo-clean to fail early if there are any deltas - Add a build step that runs check-repo-clean as a final check to catch anything else - Make `cargo make pre-commit-verify-workspace` run the codegen too, so that local development stands a decent chance of noticing the changes before the PR runs - Add tasks to Makefile.toml to trigger the rest-api codegen manually Also, run the checks now and check in the results so the repo is clean.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
Summary by CodeRabbit
WalkthroughCI now detects core RPC proto changes, regenerates REST protos when needed, and blocks release jobs until sync checks pass. Proto contracts expand across BlueField discovery, networking, validation, secrets, and inventory. Observability docs add new metric entries. ChangesREST proto sync and API surface
Observability metrics docs
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
🌿 Preview your docs: https://nvidia-preview-pull-request-2884.docs.buildwithfern.com/infra-controller |
🔐 TruffleHog Secret Scan✅ No secrets or credentials found! Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉 🕐 Last updated: 2026-06-25 15:14:54 UTC | Commit: a29b7c7 |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
rest-api/flow/internal/nicoapi/nicoproto/nico.proto (1)
1501-1567: 🗄️ Data Integrity & Integration | 🟠 MajorDual-write VPC config/status and the legacy fields until the workflow layer is migrated.
rest-api/workflow/pkg/activity/vpc/vpc.gostill readsRoutingProfileType,NetworkSecurityGroupId, andStatus.Vnifrom the top-levelVpc; if this message only populatesconfig/status, those attributes will arrive empty and VPC sync will lose data.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@rest-api/flow/internal/nicoapi/nicoproto/nico.proto` around lines 1501 - 1567, The Vpc proto now moves several values into config/status, but the workflow layer still reads the legacy top-level Vpc fields, so those values will be missing during sync. Update the Vpc write path to dual-write the affected values into both the new nested messages and the existing deprecated fields, especially routing_profile_type, network_security_group_id, and status.vni. Keep the legacy fields populated in Vpc alongside VpcConfig and VpcStatus until rest-api/workflow/pkg/activity/vpc/vpc.go is migrated.Source: Path instructions
Makefile.toml (1)
579-617: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winUse the sync-check task in
pre-commit-verify-workspace.Right now this aggregate only runs
generate-rest-core-proto, so a directcargo make pre-commit-verify-workspacecan leave tracked proto outputs dirty and still exit 0. Swapping this dependency tocheck-rest-core-proto-synckeeps generation and cleanliness verification coupled.Suggested fix
dependencies = [ - "generate-rest-core-proto", + "check-rest-core-proto-sync", "clippy-flow", "carbide-lints", "check-format-nightly",As per path instructions,
Makefile*changes should have clear failure behavior and remain consistent with documented verification commands.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Makefile.toml` around lines 579 - 617, `pre-commit-verify-workspace` currently depends on `generate-rest-core-proto`, which can succeed even when generated REST proto outputs remain dirty. Update the aggregate task to depend on `check-rest-core-proto-sync` instead, so `generate-rest-core-proto` and `check-repo-clean` stay coupled and the verification fails when tracked proto artifacts are not committed.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/ci.yaml:
- Around line 1057-1065: The check-rest-core-proto-sync job is inheriting
broader default GITHUB_TOKEN permissions and leaving checkout credentials
available to later steps. Harden the workflow by explicitly setting minimal
read-only permissions for this job and by adjusting the actions/checkout@v4 step
so it does not persist credentials for the subsequent Makefile/script execution.
Use the check-rest-core-proto-sync job and its checkout step as the place to
apply the fix.
In `@docs/observability/core_metrics.md`:
- Line 123: The observability docs entry uses a stale metric name suffix and
should match the emitted series from site-explorer. Update the table row in the
metrics documentation to use the exact name exported by
crates/site-explorer/src/metrics.rs, namely the
carbide_site_explorer_phase_latency symbol, so dashboards and alerts reference
the correct metric.
In `@rest-api/flow/Makefile`:
- Around line 20-23: The Makefile still allows any existing buf on PATH to be
used, so generation can run with the wrong version. Update the buf setup in the
Makefile to enforce the pinned v1.70.0 binary unconditionally by invoking the
installed binary directly or by checking buf --version and failing if it is not
the expected version. Keep the change focused around the existing buf
install/generate flow so the version used by buf generate is always the pinned
one.
In `@scripts/check-repo-clean.sh`:
- Around line 8-21: The clean-check in check-repo-clean.sh is only comparing the
worktree against the index, so staged regenerated files can still be reported as
clean. Update the repository-clean test in this script to also detect staged
changes by checking the index against HEAD (for example alongside the existing
git diff check), while keeping the untracked-files check intact. Make the
condition in the main guard around the current git diff / untracked logic fail
whenever either staged or unstaged changes exist.
---
Outside diff comments:
In `@Makefile.toml`:
- Around line 579-617: `pre-commit-verify-workspace` currently depends on
`generate-rest-core-proto`, which can succeed even when generated REST proto
outputs remain dirty. Update the aggregate task to depend on
`check-rest-core-proto-sync` instead, so `generate-rest-core-proto` and
`check-repo-clean` stay coupled and the verification fails when tracked proto
artifacts are not committed.
In `@rest-api/flow/internal/nicoapi/nicoproto/nico.proto`:
- Around line 1501-1567: The Vpc proto now moves several values into
config/status, but the workflow layer still reads the legacy top-level Vpc
fields, so those values will be missing during sync. Update the Vpc write path
to dual-write the affected values into both the new nested messages and the
existing deprecated fields, especially routing_profile_type,
network_security_group_id, and status.vni. Keep the legacy fields populated in
Vpc alongside VpcConfig and VpcStatus until
rest-api/workflow/pkg/activity/vpc/vpc.go is migrated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 0f9db42d-1738-48ad-983c-cb7d2c81e9dd
⛔ Files ignored due to path filters (12)
rest-api/flow/internal/nicoapi/gen/fmds.pb.gois excluded by!**/*.pb.go,!**/gen/**,!rest-api/**/*.pb.gorest-api/flow/internal/nicoapi/gen/fmds_grpc.pb.gois excluded by!**/*.pb.go,!**/gen/**,!rest-api/**/*.pb.go,!rest-api/**/*_grpc.pb.gorest-api/flow/internal/nicoapi/gen/nico.pb.gois excluded by!**/*.pb.go,!**/gen/**,!rest-api/**/*.pb.gorest-api/flow/internal/nicoapi/gen/nico_grpc.pb.gois excluded by!**/*.pb.go,!**/gen/**,!rest-api/**/*.pb.go,!rest-api/**/*_grpc.pb.gorest-api/flow/internal/nicoapi/gen/site_explorer.pb.gois excluded by!**/*.pb.go,!**/gen/**,!rest-api/**/*.pb.gorest-api/workflow-schema/schema/site-agent/workflows/v1/dpa_rpc_nico.pb.gois excluded by!**/*.pb.go,!rest-api/**/*.pb.gorest-api/workflow-schema/schema/site-agent/workflows/v1/nico_nico.pb.gois excluded by!**/*.pb.go,!rest-api/**/*.pb.gorest-api/workflow-schema/schema/site-agent/workflows/v1/nico_nico_grpc.pb.gois excluded by!**/*.pb.go,!rest-api/**/*.pb.go,!rest-api/**/*_grpc.pb.gorest-api/workflow-schema/schema/site-agent/workflows/v1/site_explorer_nico.pb.gois excluded by!**/*.pb.go,!rest-api/**/*.pb.gorest-api/workflow-schema/site-agent/workflows/v1/dpa_rpc_nico.protois excluded by!rest-api/workflow-schema/site-agent/workflows/v1/*_nico.protorest-api/workflow-schema/site-agent/workflows/v1/nico_nico.protois excluded by!rest-api/workflow-schema/site-agent/workflows/v1/*_nico.protorest-api/workflow-schema/site-agent/workflows/v1/site_explorer_nico.protois excluded by!rest-api/workflow-schema/site-agent/workflows/v1/*_nico.proto
📒 Files selected for processing (12)
.github/workflows/ci.yaml.github/workflows/rest-ci.yml.github/workflows/rest-lint-and-test.ymlMakefile.tomldocs/observability/core_metrics.mdrest-api/Makefilerest-api/flow/Makefilerest-api/flow/internal/nicoapi/buf.gen.yamlrest-api/flow/internal/nicoapi/nicoproto/fmds.protorest-api/flow/internal/nicoapi/nicoproto/nico.protorest-api/flow/internal/nicoapi/nicoproto/site_explorer.protoscripts/check-repo-clean.sh
🔍 Container Scan Summary
Per-CVE detail lives in the per-service |
Also add some stuff to .gitignore that the github runner outputs
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/ci.yaml:
- Line 1069: The workflow step using actions/checkout@v4 should be updated to an
immutable commit SHA to satisfy the repository’s pinned-action policy. Replace
the mutable tag in the checkout step with the exact SHA for the intended
version, and keep the change within the existing checkout usage in the CI
workflow so the action reference stays fixed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 7ac3c2b7-75ce-4476-9de2-8482d815569d
📒 Files selected for processing (2)
.github/workflows/ci.yaml.gitignore
✅ Files skipped from review due to trivial changes (1)
- .gitignore
I'm intentionally running the codegen as part of verify-workspace, instead of just checking it, because it's a likely step that rust developers will actually run as part of their workflow. The goal here is to make it so when you change forge.proto, you actually run the build step that syncs the changes to rest-api. I didn't want to put it in e.g. crates/rpc/build.rs, because writing files outside of the crate is surprising behavior... and there's no other build step that people typically run locally that's a better candidate.
This is not part of this PR, I don't know why this was mentioned. I'm syncing the protobuf over, any changes are already part of the main branch by definition, this just ensures the golang "view" of this protobuf is properly reflected. Any dual writing behavior (or lack thereof) is already merged into main. |
thossain-nv
left a comment
There was a problem hiding this comment.
Thanks for the wiring @kensimon, this should be very helpful in ensuring the protobufs stay aligned.
|
|
||
| // SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| // SPDX-License-Identifier: Apache-2.0 | ||
| /* |
There was a problem hiding this comment.
This seems like a duplicate?
Problem statement: Parts of this repo that are tracked in git, but generated from build steps (e.g. Cargo.lock, .pb.go files, some of the files under docs/) may be changed but not committed as part of a PR. This results in situations where you build the main branch and uncommitted changes show up even though you didn't touch anything (leaving the next person to have to include those unrelated changes in their PR.)
Fix this on a few fronts, by both checking for this scenario, and by eagerly running some code generation to surface these changes earlier:
cargo make pre-commit-verify-workspacerun the protobuf codegen too, so that local development stands a decent chance of noticing the changes before the PR runsAlso, run the checks now and check in the results so the repo is clean.
Related issues
#2883
Type of Change
Breaking Changes
Testing
Additional Notes
The deltas in the synced protos in rest-api are quite large... it seems like people only partially commit the changes they need when doing a sync, to avoid a PR getting large. I didn't run a full test suite on rest-api but it seems like the changes should all work (including deleting dpa_rpc_nico.proto, which AFAICT is unused), and this will be the last time we have a bunch of de-synced protobuf code between the two parts of the codebase.