Skip to content

feat(admin-cli): inspect a machine's MachineBootInterface across its lifecycle#2865

Merged
chet merged 1 commit into
NVIDIA:mainfrom
chet:machine-boot-interfaces-cli
Jun 25, 2026
Merged

feat(admin-cli): inspect a machine's MachineBootInterface across its lifecycle#2865
chet merged 1 commit into
NVIDIA:mainfrom
chet:machine-boot-interfaces-cli

Conversation

@chet

@chet chet commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Summary

We formalized boot-interface management around a single MachineBootInterface and now track it across the machine lifecycle, but that view lives in four stores — the owned machine_interfaces rows, predicted_machine_interfaces, the explored_endpoints default, and the retained_boot_interfaces pairs we keep across a delete and re-ingest. This adds one command to see what each store believes a machine's boot interface is.

admin-cli machine boot-interfaces <machine-id> gathers all four stores for one machine (read-only) and prints them together as an ASCII table (default), JSON, or YAML — alongside the effective boot interface the system would select (pick_boot_interface) and a flag for when the stores disagree. Useful both for troubleshooting a wrong-NIC boot and for verifying a re-ingest settled on the right one.

What's here

  • A read-only GetMachineBootInterfaces Forge RPC + handler that gathers the four stores in one read transaction. The report messages include boot_interface_id explicitly (the existing MachineInterface message omits it).
  • A non-consuming, non-window-filtered retained_boot_interfaces read (find_records_by_macs) so the troubleshooting view surfaces stale records that the window-aware find_by_mac / take_by_mac hide or remove.
  • The machine boot-interfaces subcommand: the three renderers, the effective-pick summary line, and the divergence flag. Honors the global --output {ascii-table,json,yaml}.

Try it

nico-admin-cli machine boot-interfaces <machine-id>
nico-admin-cli --output json machine boot-interfaces <machine-id>

Tests

  • Unit coverage on the three renderers (table / JSON / YAML round-trip, every field present).
  • api-core integration tests that seed all four stores for a machine and assert the per-store entries, the effective pick, and the divergence flag.

Scope

v1 takes a MachineId (predicted hosts have one too). Expanding a host to also show its attached DPUs' entries is a follow-up.

Refs #2863. Part of #2660.

@chet chet requested a review from a team as a code owner June 24, 2026 22:19
@chet

chet commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai full_review

@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

@chet Re-running a full review of this PR now.

✅ Action performed

Full review finished.

@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds a new machine boot-interfaces admin CLI command and GetMachineBootInterfaces RPC with server-side aggregation, retained-record lookup, output rendering, and integration coverage for the per-machine boot-interface view.

Changes

Boot-interface troubleshooting flow

Layer / File(s) Summary
RPC contract and permissions
crates/rpc/proto/forge.proto, crates/rpc/build.rs, crates/api-core/src/api.rs, crates/api-core/src/auth/internal_rbac_rules.rs
Adds the GetMachineBootInterfaces proto messages, makes the generated response types serializable, exposes the Forge API method, and grants ForgeAdminCLI access.
Retained-record lookup
crates/api-db/src/retained_boot_interface.rs
Adds a typed retained-boot-interface row and a read helper that returns recorded_at-bearing rows for a list of MAC addresses without retention-window filtering.
Boot-interface aggregation handler
crates/api-core/src/handlers/machine_boot_interfaces.rs, crates/api-core/src/handlers/mod.rs, crates/api-core/src/tests/machine_boot_interfaces.rs, crates/api-core/src/tests/mod.rs
Reads owned, predicted, explored, and retained data for one machine, computes the effective boot interface and divergence flag, and adds integration coverage for the aggregated response.
Admin CLI command wiring
crates/admin-cli/src/machine/mod.rs, crates/admin-cli/src/machine/boot_interfaces/*, crates/admin-cli/src/rpc.rs
Adds the machine boot-interfaces subcommand, its machine argument, runtime dispatch, and the client wrapper used to call the new RPC.
CLI report mapping and rendering
crates/admin-cli/src/machine/boot_interfaces/cmd.rs
Maps the RPC response into report rows, renders JSON, YAML, and table output, and checks the rendered and serialized report shape.

Sequence Diagram(s)

sequenceDiagram
  participant Args as boot_interfaces::Args
  participant Handle as cmd::handle_boot_interfaces
  participant Client as ApiClient::get_machine_boot_interfaces
  participant Api as Api::get_machine_boot_interfaces
  participant Handler as machine_boot_interfaces::get_machine_boot_interfaces
  participant Render as cmd::render_tables
  Args->>Handle: run(RuntimeContext)
  Handle->>Client: request boot-interface view
  Client->>Api: send GetMachineBootInterfacesRequest
  Api->>Handler: delegate request
  Handler-->>Client: return GetMachineBootInterfacesResponse
  Client-->>Handle: receive response
  Handle->>Render: print JSON, YAML, or ASCII table
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly matches the new admin-cli command for inspecting a machine’s boot interface lifecycle.
Description check ✅ Passed The description is directly aligned with the changeset and accurately summarizes the new RPC, CLI, and tests.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
crates/api-db/src/retained_boot_interface.rs (1)

105-116: 🚀 Performance & Scalability | 🔵 Trivial

Accept a read-only DbReader here

find_records_by_macs is a pure SELECT, so it should take impl DbReader<'_> (and import crate::db_read::DbReader) instead of &mut PgConnection. That keeps the helper usable with a PgPool directly and avoids unnecessary transaction boilerplate.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-db/src/retained_boot_interface.rs` around lines 105 - 116, Update
find_records_by_macs in retained_boot_interface to accept a read-only
DbReader<'_> instead of &mut PgConnection, and import crate::db_read::DbReader.
Keep the SELECT query and fetch_all call the same, but switch the function
parameter and any related type usage so the helper works with a PgPool directly
without requiring a mutable transaction.

Source: Coding guidelines

crates/api-core/src/tests/machine_boot_interfaces.rs (1)

198-204: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Non-divergence assertion rests on an implicit fixture invariant.

This assertion holds only while the freshly ingested DPU host's explored-endpoint default resolves to the same boot NIC as the owned pick_boot_interface selection. That coupling is reasonable today, but it is an undocumented invariant of new_host; should the fixture ever record a differing boot_interface_mac, this test would flag a divergence and fail opaquely. Consider asserting the explored default's MAC equals effective_boot_interface_mac first, so a future fixture drift fails with an explicit, diagnosable message rather than at the final divergent check.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-core/src/tests/machine_boot_interfaces.rs` around lines 198 - 204,
The non-divergence check in machine_boot_interfaces test relies on an implicit
fixture guarantee that the explored default boot NIC matches the owned
pick_boot_interface result for new_host. Before asserting report.divergent is
false, explicitly verify that the explored default MAC matches
effective_boot_interface_mac so any future fixture drift fails with a clear
diagnostic tied to new_host and pick_boot_interface rather than only at the
final divergence assertion.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/admin-cli/src/machine/boot_interfaces/args.rs`:
- Around line 28-30: The help examples in the boot-interfaces args text use an
inconsistent angle-bracket placeholder; update the examples in args.rs so the
machine identifier matches the canonical UUID style already used elsewhere,
keeping the output/json and output/yaml examples copy-pasteable and uniform.
Locate the help text in the boot-interfaces argument definitions and replace the
<MACHINE_ID> placeholder with the prescribed realistic UUID placeholder format.

In `@crates/admin-cli/src/machine/boot_interfaces/cmd.rs`:
- Around line 90-137: Add a table-driven test that exercises the `impl
From<forgerpc::GetMachineBootInterfacesResponse> for BootInterfacesReport`
mapping instead of constructing `BootInterfacesReport` directly. Build a
`GetMachineBootInterfacesResponse`, convert it with
`BootInterfacesReport::from`, and assert each mapped collection
(`machine_interfaces`, `predicted_interfaces`, `explored_endpoints`,
`retained_interfaces`) plus the top-level fields
(`effective_boot_interface_mac`, `effective_boot_interface_id`, `divergent`) are
populated as expected. Include cases that verify `non_empty` turns empty strings
into `None` and that `retained_interfaces.recorded_at` is rendered as RFC 3339
(`2026-06-01T00:00:00Z`).

---

Nitpick comments:
In `@crates/api-core/src/tests/machine_boot_interfaces.rs`:
- Around line 198-204: The non-divergence check in machine_boot_interfaces test
relies on an implicit fixture guarantee that the explored default boot NIC
matches the owned pick_boot_interface result for new_host. Before asserting
report.divergent is false, explicitly verify that the explored default MAC
matches effective_boot_interface_mac so any future fixture drift fails with a
clear diagnostic tied to new_host and pick_boot_interface rather than only at
the final divergence assertion.

In `@crates/api-db/src/retained_boot_interface.rs`:
- Around line 105-116: Update find_records_by_macs in retained_boot_interface to
accept a read-only DbReader<'_> instead of &mut PgConnection, and import
crate::db_read::DbReader. Keep the SELECT query and fetch_all call the same, but
switch the function parameter and any related type usage so the helper works
with a PgPool directly without requiring a mutable transaction.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 00e0da41-658a-40de-9e44-0df95493c3a5

📥 Commits

Reviewing files that changed from the base of the PR and between 0d4e13e and c88ebe3.

📒 Files selected for processing (14)
  • crates/admin-cli/src/machine/boot_interfaces/args.rs
  • crates/admin-cli/src/machine/boot_interfaces/cmd.rs
  • crates/admin-cli/src/machine/boot_interfaces/mod.rs
  • crates/admin-cli/src/machine/mod.rs
  • crates/admin-cli/src/rpc.rs
  • crates/api-core/src/api.rs
  • crates/api-core/src/auth/internal_rbac_rules.rs
  • crates/api-core/src/handlers/machine_boot_interfaces.rs
  • crates/api-core/src/handlers/mod.rs
  • crates/api-core/src/tests/machine_boot_interfaces.rs
  • crates/api-core/src/tests/mod.rs
  • crates/api-db/src/retained_boot_interface.rs
  • crates/rpc/build.rs
  • crates/rpc/proto/forge.proto

Comment thread crates/admin-cli/src/machine/boot_interfaces/args.rs Outdated
Comment thread crates/admin-cli/src/machine/boot_interfaces/cmd.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
crates/admin-cli/src/machine/boot_interfaces/args.rs (1)

29-30: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Replace <MACHINE_ID> with the canonical UUID placeholder in examples.

Use 12345678-1234-5678-90ab-cdef01234567 for consistency and copy-pasteability in both commands.
As per coding guidelines, “Use realistic, consistent placeholder values instead of angle brackets: UUIDs as 12345678-1234-5678-90ab-cdef01234567.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/admin-cli/src/machine/boot_interfaces/args.rs` around lines 29 - 30,
Update the example commands in the boot-interfaces args help text to replace the
generic machine ID placeholder with the canonical UUID placeholder used
elsewhere, so both `nico-admin-cli --output json machine boot-interfaces` and
`nico-admin-cli --output yaml machine boot-interfaces` show the same
copy-pasteable UUID value instead of `<MACHINE_ID>`.

Source: Coding guidelines

🧹 Nitpick comments (2)
crates/api-core/src/handlers/machine_boot_interfaces.rs (1)

132-149: 📐 Maintainability & Code Quality | 🔵 Trivial

Use the stable Display output for network_segment_type.

NetworkSegmentType already implements Display, so format!("{t:?}") / format!("{:?}", p.expected_network_segment_type) bakes in the derived Debug spelling instead of the canonical segment label. Use to_string() (or format!("{t}")) in both mappings so the RPC payload stays consistent with the model’s text form.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-core/src/handlers/machine_boot_interfaces.rs` around lines 132 -
149, The `machine_boot_interfaces` mappings are using `Debug` formatting for
`network_segment_type` instead of the stable `Display` text from
`NetworkSegmentType`. Update both the `machine_interfaces` and
`predicted_interfaces` conversions in `machine_boot_interfaces` to call
`to_string()` or `format!("{t}")` on the segment type values, keeping the RPC
payload aligned with the canonical label.
crates/api-db/src/retained_boot_interface.rs (1)

105-116: 📐 Maintainability & Code Quality | 🔵 Trivial

Accept impl DbReader<'_> for this read-only lookup. This is a single SELECT; using the shared read-only connection abstraction keeps the API consistent with the rest of crates/api-db and lets callers use &PgPool or txn.as_mut() without forcing a transaction-specific signature.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-db/src/retained_boot_interface.rs` around lines 105 - 116, The
read-only lookup in find_records_by_macs is too narrowly typed to PgConnection,
which prevents using the shared database reader abstraction. Update the function
signature to accept impl DbReader<'_> instead of a transaction-specific
connection, and keep the existing SELECT/query_as logic unchanged so callers can
pass either a pool or txn.as_mut() consistently with the rest of crates/api-db.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@crates/admin-cli/src/machine/boot_interfaces/args.rs`:
- Around line 29-30: Update the example commands in the boot-interfaces args
help text to replace the generic machine ID placeholder with the canonical UUID
placeholder used elsewhere, so both `nico-admin-cli --output json machine
boot-interfaces` and `nico-admin-cli --output yaml machine boot-interfaces` show
the same copy-pasteable UUID value instead of `<MACHINE_ID>`.

---

Nitpick comments:
In `@crates/api-core/src/handlers/machine_boot_interfaces.rs`:
- Around line 132-149: The `machine_boot_interfaces` mappings are using `Debug`
formatting for `network_segment_type` instead of the stable `Display` text from
`NetworkSegmentType`. Update both the `machine_interfaces` and
`predicted_interfaces` conversions in `machine_boot_interfaces` to call
`to_string()` or `format!("{t}")` on the segment type values, keeping the RPC
payload aligned with the canonical label.

In `@crates/api-db/src/retained_boot_interface.rs`:
- Around line 105-116: The read-only lookup in find_records_by_macs is too
narrowly typed to PgConnection, which prevents using the shared database reader
abstraction. Update the function signature to accept impl DbReader<'_> instead
of a transaction-specific connection, and keep the existing SELECT/query_as
logic unchanged so callers can pass either a pool or txn.as_mut() consistently
with the rest of crates/api-db.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c29c3d25-c572-4865-ad92-5ecdea49e74a

📥 Commits

Reviewing files that changed from the base of the PR and between 0d4e13e and c88ebe3.

📒 Files selected for processing (14)
  • crates/admin-cli/src/machine/boot_interfaces/args.rs
  • crates/admin-cli/src/machine/boot_interfaces/cmd.rs
  • crates/admin-cli/src/machine/boot_interfaces/mod.rs
  • crates/admin-cli/src/machine/mod.rs
  • crates/admin-cli/src/rpc.rs
  • crates/api-core/src/api.rs
  • crates/api-core/src/auth/internal_rbac_rules.rs
  • crates/api-core/src/handlers/machine_boot_interfaces.rs
  • crates/api-core/src/handlers/mod.rs
  • crates/api-core/src/tests/machine_boot_interfaces.rs
  • crates/api-core/src/tests/mod.rs
  • crates/api-db/src/retained_boot_interface.rs
  • crates/rpc/build.rs
  • crates/rpc/proto/forge.proto

@chet

chet commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai full_review

@chet chet force-pushed the machine-boot-interfaces-cli branch from c88ebe3 to 79d0851 Compare June 24, 2026 23:13
@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

@chet Re-running a full review of this PR now.

✅ Action performed

Full review finished.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
crates/api-core/src/handlers/machine_boot_interfaces.rs (1)

68-72: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Malformed BMC IPs are silently dropped from the troubleshooting view.

ip.parse().ok() discards any unparseable address, so store 3 (explored endpoints) would silently omit data precisely when the underlying record is malformed — the kind of anomaly this diagnostic command exists to expose. Consider emitting a structured tracing::warn!(%ip, ...) for the dropped entries so operators are aware the projection is incomplete.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-core/src/handlers/machine_boot_interfaces.rs` around lines 68 -
72, The BMC IP projection in machine_boot_interfaces is silently dropping
malformed addresses because the `bmc_ips` collection path uses `ip.parse().ok()`
without any visibility. Update the parsing flow in `bmc_ips` to keep filtering
invalid entries but add a `tracing::warn!(%ip, ...)` (or equivalent structured
warning) when parsing fails so the troubleshooting view is explicitly incomplete
rather than silently omitting bad records.
crates/api-db/src/retained_boot_interface.rs (1)

105-108: 📐 Maintainability & Code Quality | 🔵 Trivial

crates/api-db/src/retained_boot_interface.rs:105 — Prefer impl DbReader<'_> for this single-query read path. This helper is pure read-only and can stay callable from txn.as_mut() while also supporting pool-backed readers (&PgPool or PgPoolReader) without forcing a transaction wrapper at read-only call sites.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-db/src/retained_boot_interface.rs` around lines 105 - 108, The
read helper `find_records_by_macs` is currently tied to `PgConnection`, but this
is a pure single-query read path and should accept a generic reader instead.
Update the function signature to use `impl DbReader<'_>` so it remains callable
from `txn.as_mut()` while also working with pool-backed readers like `&PgPool`
or `PgPoolReader`, and keep the query implementation unchanged aside from
adapting the parameter type.

Source: Path instructions

crates/rpc/proto/forge.proto (1)

8767-8803: 🗄️ Data Integrity & Integration | 🔵 Trivial | ⚡ Quick win

Consider optional string over empty-string sentinels for absent values.

MachineInterfaceBootInterface.boot_interface_id (Line 8772), ExploredBootInterface.boot_interface_mac/boot_interface_id (Lines 8792-8793), and RetainedBootInterface.boot_interface_id (Line 8801) encode "not captured yet" as an empty string, whereas network_segment_type (Lines 8774, 8784) already uses optional. For a troubleshooting view the distinction between absent and empty is precisely the signal an operator needs, and the handler currently collapses both via unwrap_or_default(). Proto3 optional is already enabled (--experimental_allow_proto3_optional), so making these fields optional would remove the ambiguity without a wire-compatibility cost.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/rpc/proto/forge.proto` around lines 8767 - 8803, The boot-interface
proto messages are using empty-string sentinels for values that should be
distinguishable as absent, so update the affected fields in
MachineInterfaceBootInterface, ExploredBootInterface, and RetainedBootInterface
to use optional string where appropriate. Keep the schema aligned with the
existing optional pattern used by network_segment_type, then update any
read/write paths and the troubleshooting handler that currently rely on
unwrap_or_default() so they preserve null versus empty instead of collapsing
both.

Source: Path instructions

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/api-core/src/handlers/machine_boot_interfaces.rs`:
- Around line 36-40: The doc comment in machine_boot_interfaces should not claim
a consistent snapshot guarantee because begin_with_location only starts a normal
transaction via db::Transaction::begin_with_location and pool.begin(), which
does not change isolation level. Reword the comment to say the stores are read
within a single transaction, or if the code truly needs a consistent snapshot,
update the transaction setup around begin_with_location to use REPEATABLE READ
before the reads performed by pick_boot_interface.
- Around line 138-147: The `machine_boot_interfaces` handler is serializing
`NetworkSegmentType` using `Debug`, which exposes Rust variant names in the API
contract. Update the mapping in `machine_interfaces` and `predicted_interfaces`
to use an explicit stable wire representation, or switch both fields to the
proto enum, so the response does not depend on `format!("{:?}", ...)` in
`rpc::PredictedBootInterface` and the machine interface mapping logic.

---

Nitpick comments:
In `@crates/api-core/src/handlers/machine_boot_interfaces.rs`:
- Around line 68-72: The BMC IP projection in machine_boot_interfaces is
silently dropping malformed addresses because the `bmc_ips` collection path uses
`ip.parse().ok()` without any visibility. Update the parsing flow in `bmc_ips`
to keep filtering invalid entries but add a `tracing::warn!(%ip, ...)` (or
equivalent structured warning) when parsing fails so the troubleshooting view is
explicitly incomplete rather than silently omitting bad records.

In `@crates/api-db/src/retained_boot_interface.rs`:
- Around line 105-108: The read helper `find_records_by_macs` is currently tied
to `PgConnection`, but this is a pure single-query read path and should accept a
generic reader instead. Update the function signature to use `impl DbReader<'_>`
so it remains callable from `txn.as_mut()` while also working with pool-backed
readers like `&PgPool` or `PgPoolReader`, and keep the query implementation
unchanged aside from adapting the parameter type.

In `@crates/rpc/proto/forge.proto`:
- Around line 8767-8803: The boot-interface proto messages are using
empty-string sentinels for values that should be distinguishable as absent, so
update the affected fields in MachineInterfaceBootInterface,
ExploredBootInterface, and RetainedBootInterface to use optional string where
appropriate. Keep the schema aligned with the existing optional pattern used by
network_segment_type, then update any read/write paths and the troubleshooting
handler that currently rely on unwrap_or_default() so they preserve null versus
empty instead of collapsing both.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6baed1d8-8d59-42e6-97fb-d8c4d41b15e9

📥 Commits

Reviewing files that changed from the base of the PR and between 0d4e13e and 79d0851.

📒 Files selected for processing (14)
  • crates/admin-cli/src/machine/boot_interfaces/args.rs
  • crates/admin-cli/src/machine/boot_interfaces/cmd.rs
  • crates/admin-cli/src/machine/boot_interfaces/mod.rs
  • crates/admin-cli/src/machine/mod.rs
  • crates/admin-cli/src/rpc.rs
  • crates/api-core/src/api.rs
  • crates/api-core/src/auth/internal_rbac_rules.rs
  • crates/api-core/src/handlers/machine_boot_interfaces.rs
  • crates/api-core/src/handlers/mod.rs
  • crates/api-core/src/tests/machine_boot_interfaces.rs
  • crates/api-core/src/tests/mod.rs
  • crates/api-db/src/retained_boot_interface.rs
  • crates/rpc/build.rs
  • crates/rpc/proto/forge.proto

Comment thread crates/api-core/src/handlers/machine_boot_interfaces.rs
Comment thread crates/api-core/src/handlers/machine_boot_interfaces.rs Outdated
@github-actions

Copy link
Copy Markdown

🔍 Container Scan Summary

Service Total Critical High Medium Low Other
boot-artifacts-aarch64 3 0 0 3 0 0
boot-artifacts-x86_64 3 0 0 3 0 0
forge-admin-cli-x86_64 265 6 24 98 7 130
machine-validation-runner 717 32 188 267 36 194
machine_validation 717 32 188 267 36 194
machine_validation-aarch64 717 32 188 267 36 194
nvmetal-carbide 717 32 188 267 36 194
TOTAL 3139 134 776 1172 151 906

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

@chet

chet commented Jun 25, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai full_review

@chet chet force-pushed the machine-boot-interfaces-cli branch from 79d0851 to 907dc51 Compare June 25, 2026 00:10
@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

@chet Re-running a full review of this PR now.

✅ Action performed

Full review finished.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/api-db/src/retained_boot_interface.rs (1)

105-108: 📐 Maintainability & Code Quality | 🔵 Trivial

Accept impl DbReader<'_> here.
find_records_by_macs is a single read-only query, so it should follow the other leaf lookups in crates/api-db and take impl DbReader<'_> instead of &mut PgConnection. That keeps the helper usable from both transactions and pools without narrowing it to one connection type.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-db/src/retained_boot_interface.rs` around lines 105 - 108, Update
find_records_by_macs in retained_boot_interface.rs to take impl DbReader<'_>
instead of &mut PgConnection, matching the other read-only lookup helpers in
crates/api-db. Adjust the function signature and any call sites inside the body
so the query runs through the DbReader interface, keeping the helper usable from
both pooled readers and transactions without tying it to PgConnection.

Source: Path instructions

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/api-core/src/tests/machine_boot_interfaces.rs`:
- Around line 156-161: The effective_boot_interface_id assertion is currently
gated by an Option check, so the contract can be skipped if the primary row is
missing a boot_interface_id. In machine_boot_interfaces.rs, tighten the test
around primary_boot_id by asserting that the primary row must have a value
before comparing it to report.effective_boot_interface_id, using the existing
primary_boot_id and report symbols so the test fails loudly if that precondition
regresses.

---

Nitpick comments:
In `@crates/api-db/src/retained_boot_interface.rs`:
- Around line 105-108: Update find_records_by_macs in retained_boot_interface.rs
to take impl DbReader<'_> instead of &mut PgConnection, matching the other
read-only lookup helpers in crates/api-db. Adjust the function signature and any
call sites inside the body so the query runs through the DbReader interface,
keeping the helper usable from both pooled readers and transactions without
tying it to PgConnection.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e7fd58f8-3be9-4bf3-a4be-b8bfcaa1f3b0

📥 Commits

Reviewing files that changed from the base of the PR and between 0d4e13e and 907dc51.

📒 Files selected for processing (14)
  • crates/admin-cli/src/machine/boot_interfaces/args.rs
  • crates/admin-cli/src/machine/boot_interfaces/cmd.rs
  • crates/admin-cli/src/machine/boot_interfaces/mod.rs
  • crates/admin-cli/src/machine/mod.rs
  • crates/admin-cli/src/rpc.rs
  • crates/api-core/src/api.rs
  • crates/api-core/src/auth/internal_rbac_rules.rs
  • crates/api-core/src/handlers/machine_boot_interfaces.rs
  • crates/api-core/src/handlers/mod.rs
  • crates/api-core/src/tests/machine_boot_interfaces.rs
  • crates/api-core/src/tests/mod.rs
  • crates/api-db/src/retained_boot_interface.rs
  • crates/rpc/build.rs
  • crates/rpc/proto/forge.proto

Comment thread crates/api-core/src/tests/machine_boot_interfaces.rs Outdated
@chet

chet commented Jun 25, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai full_review

@chet chet force-pushed the machine-boot-interfaces-cli branch from 907dc51 to b10f5f9 Compare June 25, 2026 00:26
@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

@chet Re-running a full review of this PR now.

✅ Action performed

Full review finished.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
crates/api-core/src/tests/machine_boot_interfaces.rs (1)

156-160: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

The effective-id assertion still defaults the precondition away.

primary_boot_id.unwrap_or_default() allows the contract to pass silently if the fixture's primary row regresses to None and the response also returns an empty id. For a DPU host whose primary is expected to carry a boot-interface id, assert that precondition explicitly so a regression fails loudly.

💚 Suggested tightening
-    assert_eq!(
-        report.effective_boot_interface_id,
-        primary_boot_id.unwrap_or_default(),
-        "the effective boot interface id is the primary row's captured id (empty when it has none)"
-    );
+    let boot_id = primary_boot_id.expect("the owned primary should carry a boot interface id");
+    assert_eq!(
+        report.effective_boot_interface_id, boot_id,
+        "the effective boot interface id is the primary row's captured id"
+    );
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-core/src/tests/machine_boot_interfaces.rs` around lines 156 - 160,
The assertion in machine_boot_interfaces::tests should not hide a missing
primary boot ID by using unwrap_or_default on primary_boot_id. Make the
precondition explicit in the test by asserting that the primary row’s captured
boot-interface id is present for the DPU host before comparing it to
report.effective_boot_interface_id, so the fixture regresses loudly if that ID
becomes None.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@crates/api-core/src/tests/machine_boot_interfaces.rs`:
- Around line 156-160: The assertion in machine_boot_interfaces::tests should
not hide a missing primary boot ID by using unwrap_or_default on
primary_boot_id. Make the precondition explicit in the test by asserting that
the primary row’s captured boot-interface id is present for the DPU host before
comparing it to report.effective_boot_interface_id, so the fixture regresses
loudly if that ID becomes None.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: bab15065-9895-4807-b3e6-f6c4ede68c37

📥 Commits

Reviewing files that changed from the base of the PR and between 69a490d and b10f5f9.

📒 Files selected for processing (14)
  • crates/admin-cli/src/machine/boot_interfaces/args.rs
  • crates/admin-cli/src/machine/boot_interfaces/cmd.rs
  • crates/admin-cli/src/machine/boot_interfaces/mod.rs
  • crates/admin-cli/src/machine/mod.rs
  • crates/admin-cli/src/rpc.rs
  • crates/api-core/src/api.rs
  • crates/api-core/src/auth/internal_rbac_rules.rs
  • crates/api-core/src/handlers/machine_boot_interfaces.rs
  • crates/api-core/src/handlers/mod.rs
  • crates/api-core/src/tests/machine_boot_interfaces.rs
  • crates/api-core/src/tests/mod.rs
  • crates/api-db/src/retained_boot_interface.rs
  • crates/rpc/build.rs
  • crates/rpc/proto/forge.proto

@chet

chet commented Jun 25, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai full_review

@chet chet force-pushed the machine-boot-interfaces-cli branch from b10f5f9 to 318b013 Compare June 25, 2026 02:00
@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

@chet Re-running a full review of this PR now.

✅ Action performed

Full review finished.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/api-core/src/handlers/machine_boot_interfaces.rs`:
- Around line 82-95: The retained boot-interface lookup in
machine_boot_interfaces::... is only collecting MACs from owned_interfaces and
predicted_interfaces, so retained rows reachable only through explored_endpoints
are missed. Update the macs հավաք construction to also include
explored_endpoints.iter().filter_map(|e| e.boot_interface_mac) before calling
db::retained_boot_interface::find_records_by_macs, keeping the BTreeSet deduping
behavior intact.

In `@crates/api-core/src/tests/machine_boot_interfaces.rs`:
- Around line 106-156: The test in machine_boot_interfaces should explicitly
verify the explored-endpoints store so the new admin-cli behavior is covered.
Add an assertion against report.explored_endpoints using the row seeded by
api_fixtures::site_explorer::new_host, similar to the existing checks for
machine_interfaces, predicted_interfaces, and retained_interfaces. Make sure the
assertion confirms the expected explored endpoint entry is present and, if
applicable, that its key fields match the seeded values.

In `@crates/api-db/src/retained_boot_interface.rs`:
- Around line 106-117: The query in find_records_by_macs currently returns
retained_boot_interfaces rows in unspecified order, so the output can reshuffle
across calls. Update the SQL used by sqlx::query_as in find_records_by_macs to
include an explicit deterministic ORDER BY on the selected columns, or sort the
Vec<RetainedBootInterfaceRecord> before returning it, so CLI table/JSON/YAML
output stays stable. Keep the fix localized to find_records_by_macs and preserve
the existing DatabaseError::query error handling.

In `@crates/rpc/proto/forge.proto`:
- Around line 8767-8820: The boot-interface RPC contract is encoding absent
values as empty strings instead of field presence, so update the protobuf
messages in MachineInterfaceBootInterface, PredictedBootInterface,
ExploredBootInterface, RetainedBootInterface, and
GetMachineBootInterfacesResponse to use optional string for semantically missing
data like boot_interface_id, boot_interface_mac, effective_boot_interface_mac,
and effective_boot_interface_id. Adjust the server-side population logic to
return None for missing values rather than sentinel empties, and keep the CLI
consumer in crates/admin-cli/src/machine/boot_interfaces/cmd.rs aligned so it no
longer needs to normalize these fields with non_empty().
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: dbcf7bbc-170b-4af8-a13e-48f30dac3d43

📥 Commits

Reviewing files that changed from the base of the PR and between e7781e0 and 318b013.

📒 Files selected for processing (14)
  • crates/admin-cli/src/machine/boot_interfaces/args.rs
  • crates/admin-cli/src/machine/boot_interfaces/cmd.rs
  • crates/admin-cli/src/machine/boot_interfaces/mod.rs
  • crates/admin-cli/src/machine/mod.rs
  • crates/admin-cli/src/rpc.rs
  • crates/api-core/src/api.rs
  • crates/api-core/src/auth/internal_rbac_rules.rs
  • crates/api-core/src/handlers/machine_boot_interfaces.rs
  • crates/api-core/src/handlers/mod.rs
  • crates/api-core/src/tests/machine_boot_interfaces.rs
  • crates/api-core/src/tests/mod.rs
  • crates/api-db/src/retained_boot_interface.rs
  • crates/rpc/build.rs
  • crates/rpc/proto/forge.proto

Comment thread crates/api-core/src/handlers/machine_boot_interfaces.rs
Comment thread crates/api-core/src/tests/machine_boot_interfaces.rs
Comment thread crates/api-db/src/retained_boot_interface.rs
Comment thread crates/rpc/proto/forge.proto Outdated
…lifecycle

We formalized boot-interface management around a single MachineBootInterface and
now track it across the machine lifecycle, but that view lives in four stores --
the owned machine_interfaces rows, predicted_machine_interfaces, the
explored_endpoints default, and the retained_boot_interfaces pairs we keep across
a delete and re-ingest. There was no one place to see what each store believes a
machine's boot interface is.

Add `admin-cli machine boot-interfaces <machine-id>`: a read-only command that
gathers all four stores for one machine and prints them together as an ASCII
table, JSON, or YAML, alongside the effective boot interface the system would
select (pick_boot_interface) and a flag for when the stores disagree.

- New read-only GetMachineBootInterfaces Forge RPC + handler that gathers the four
  stores in one read transaction; the report messages include boot_interface_id
  explicitly (the existing MachineInterface message omits it).
- A non-consuming, non-window-filtered retained_boot_interfaces read so the
  troubleshooting view surfaces stale records that the window-aware reads hide.
- The machine boot-interfaces subcommand with the three renderers, the
  effective-pick summary, and the divergence flag.

Tests cover the three renderers and api-core integration tests that seed all four
stores and assert the per-store entries, the effective pick, and divergence.

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
@chet chet force-pushed the machine-boot-interfaces-cli branch from 318b013 to 0f9e155 Compare June 25, 2026 05:39
@chet chet enabled auto-merge (squash) June 25, 2026 06:04
@chet chet merged commit 790a82d into NVIDIA:main Jun 25, 2026
58 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants