fix(dhcp): temporarily block DHCP lease expiry handling.#2877
fix(dhcp): temporarily block DHCP lease expiry handling.#2877abvarshney-nv wants to merge 2 commits into
Conversation
Summary by CodeRabbit
WalkthroughThe PR adds a feature-disabled DHCP expiry status, gates ChangesDHCP lease expiration feature gate
Estimated code review effort🎯 4 (Complex) | ⏱️ ~40 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/api-core/src/tests/dhcp_lease_expiration.rs (1)
160-169: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick winAssert that blocked expiry preserves the same IP.
After switching this path to
NotHandled, the important invariant is that rediscover sees the existing allocation still attached. The current follow-up only checks for a non-empty address, so it would still pass if rediscover assigned a different IP. Please pinresponse2.address == original_ip(or verify the DB row directly) and rename the test to match the blocked behavior. As per coding guidelines, “Verification should exercise the behavior that changed” and “Add or update focused tests for ... API contracts”; as per path instructions, STYLE_GUIDE testing should cover observable input→output behavior.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/api-core/src/tests/dhcp_lease_expiration.rs` around lines 160 - 169, The blocked lease-expiry test is too weak because it only checks that rediscover returns an address, not that it preserves the original allocation. Update the test around expire_dhcp_lease and rediscover to assert response2.address matches original_ip, or verify the database row directly, so the changed NotHandled behavior is exercised. Also rename the test to reflect the blocked-expiry behavior and keep the assertion focused on the observable input→output contract.Sources: Coding guidelines, Path instructions
🧹 Nitpick comments (2)
crates/api-core/src/dhcp/expire.rs (2)
58-58: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winLog the blocked lease with structured fields.
This is now the steady-state outcome, so the bare string loses the IP/MAC needed to correlate skipped expirations with later DHCP/DNS state. Prefer something like
tracing::info!(%ip_address, ?mac_address, "Expired DHCP lease handling blocked");. As per coding guidelines, “All services should emit logs in 'logfmt' syntax” and “prefer placing common fields as attributes passed to tracing functions”; as per path instructions,crates/**/*.rsshould use structured tracing fields.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/api-core/src/dhcp/expire.rs` at line 58, The current `tracing::info!` in `expire.rs` logs only a bare message, so the blocked DHCP lease event cannot be correlated with IP/MAC state. Update the logging in the DHCP lease expiration handling to use structured tracing fields on the relevant variables already in scope, such as `ip_address` and `mac_address`, and keep the message descriptive like the existing `tracing::info!` call.Sources: Coding guidelines, Path instructions
42-69: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick winMove the blocked return ahead of
txn_begin().After
ip_address.parse(), one ofis_ipv4()/is_ipv6()is always true, so Lines 49-69 run for every request. That makes the delete/sync path dead for now and turns each expiry callback into aBEGIN+ lookup +ROLLBACKcycle even though the blocked path is read-only. Make the block explicit before opening a write transaction, and use a read-only lookup only to chooseNotFoundvsNotHandled. As per path instructions,crates/api*/**changes should be reviewed for transaction safety.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/api-core/src/dhcp/expire.rs` around lines 42 - 69, The blocked DHCP expiry path in expire_dhcp_lease currently runs after txn_begin() and always returns because the IP is already validated, so move the early return logic before opening a write transaction. In expire_dhcp_lease, keep only a read-only lookup of db::machine_interface::find_by_ip to decide between NotFound and NotHandled, and avoid starting/rolling back txn when the lease handling is blocked.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/rpc/proto/forge.proto`:
- Line 3931: The flow proto mirror is out of sync with the source enum
definition, so add the missing EXPIRE_DHCP_LEASE_STATUS_NOT_HANDLED value to the
corresponding enum in nico.proto. Keep the enum ordering and numeric value
aligned with the matching enum in forge.proto so the generated flow clients
remain consistent with the source proto.
---
Outside diff comments:
In `@crates/api-core/src/tests/dhcp_lease_expiration.rs`:
- Around line 160-169: The blocked lease-expiry test is too weak because it only
checks that rediscover returns an address, not that it preserves the original
allocation. Update the test around expire_dhcp_lease and rediscover to assert
response2.address matches original_ip, or verify the database row directly, so
the changed NotHandled behavior is exercised. Also rename the test to reflect
the blocked-expiry behavior and keep the assertion focused on the observable
input→output contract.
---
Nitpick comments:
In `@crates/api-core/src/dhcp/expire.rs`:
- Line 58: The current `tracing::info!` in `expire.rs` logs only a bare message,
so the blocked DHCP lease event cannot be correlated with IP/MAC state. Update
the logging in the DHCP lease expiration handling to use structured tracing
fields on the relevant variables already in scope, such as `ip_address` and
`mac_address`, and keep the message descriptive like the existing
`tracing::info!` call.
- Around line 42-69: The blocked DHCP expiry path in expire_dhcp_lease currently
runs after txn_begin() and always returns because the IP is already validated,
so move the early return logic before opening a write transaction. In
expire_dhcp_lease, keep only a read-only lookup of
db::machine_interface::find_by_ip to decide between NotFound and NotHandled, and
avoid starting/rolling back txn when the lease handling is blocked.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: a4d71c84-707f-4775-afca-04064db703da
📒 Files selected for processing (4)
crates/api-core/src/dhcp/expire.rscrates/api-core/src/tests/dhcp_lease_expiration.rscrates/dhcp/src/lease_expiration.rscrates/rpc/proto/forge.proto
🔍 Container Scan Summary
Per-CVE detail lives in the per-service |
2196d81 to
d21c249
Compare
🔐 TruffleHog Secret Scan✅ No secrets or credentials found! Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉 🕐 Last updated: 2026-06-25 10:48:59 UTC | Commit: d21c249 |
| // to still exist, so we must do this before the delete. | ||
| let interface = db::machine_interface::find_by_ip(&mut txn, ip_address).await?; | ||
|
|
||
| if ip_address.is_ipv4() || ip_address.is_ipv6() { |
There was a problem hiding this comment.
Instead of if ip_address.is_ipv4() || ip_address.is_ipv6() {, how about we add a config bool handle_lease_expiration_callback, defaults to false, and then this just becomes:
if !handle_lease_expiration_callback
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
crates/api-core/src/tests/dhcp_lease_expiration.rs (2)
436-445: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueAlign comment with the actual assertion.
The comment references a
n-<mac>placeholder while the assertion verifies the hostname starts withnoip. Update the comment to thenoipprefix to avoid confusion.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/api-core/src/tests/dhcp_lease_expiration.rs` around lines 436 - 445, The inline comment above the expiry assertion is inconsistent with the actual check in dhcp_lease_expiration; update the comment to match the hostname prefix being asserted in iface_after_expiry so it refers to the noip dormant format rather than n-<mac>. Keep the comment aligned with the assertion logic in the same test block to avoid misleading future readers.
34-64: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueRename this test to reflect its inverted assertion.
test_expire_releases_allocationnow runs against the default (disabled) environment and asserts the allocation is explicitly not released (FeatureDisabled, with both rows preserved). The name describes the opposite outcome and will mislead the next maintainer — considertest_expire_is_blocked_when_handling_disabledor similar.
[optional_refactor]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/api-core/src/tests/dhcp_lease_expiration.rs` around lines 34 - 64, The test name is misleading because the assertions in test_expire_releases_allocation now verify the disabled-path behavior instead of a successful lease expiration. Rename the test to reflect that expire_dhcp_lease is blocked by the default environment and that the allocation remains preserved with FeatureDisabled, using a name aligned with the current assertions so future readers understand the intended outcome.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/dhcp/src/lease_expiration.rs`:
- Around line 105-107: The new FeatureDisabled branch in
lease_expiration::expire_dhcp_lease is not reachable from the current tests
because MockAPIServer still returns a hard-coded
ExpireDhcpLeaseStatus::Released. Update the test setup so the mock status is
configurable, or add a focused unit test around the status-to-log/behavior
mapping in MockAPIServer and expire_dhcp_lease, so FeatureDisabled is explicitly
exercised.
---
Nitpick comments:
In `@crates/api-core/src/tests/dhcp_lease_expiration.rs`:
- Around line 436-445: The inline comment above the expiry assertion is
inconsistent with the actual check in dhcp_lease_expiration; update the comment
to match the hostname prefix being asserted in iface_after_expiry so it refers
to the noip dormant format rather than n-<mac>. Keep the comment aligned with
the assertion logic in the same test block to avoid misleading future readers.
- Around line 34-64: The test name is misleading because the assertions in
test_expire_releases_allocation now verify the disabled-path behavior instead of
a successful lease expiration. Rename the test to reflect that expire_dhcp_lease
is blocked by the default environment and that the allocation remains preserved
with FeatureDisabled, using a name aligned with the current assertions so future
readers understand the intended outcome.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 8f4a93bc-6a51-4530-95f9-907d665a8395
📒 Files selected for processing (8)
crates/api-core/src/cfg/file.rscrates/api-core/src/dhcp/expire.rscrates/api-core/src/test_support/default_config.rscrates/api-core/src/tests/common/api_fixtures/mod.rscrates/api-core/src/tests/dhcp_lease_expiration.rscrates/dhcp/src/lease_expiration.rscrates/rpc/proto/forge.protorest-api/flow/internal/nicoapi/nicoproto/nico.proto
| rpc::ExpireDhcpLeaseStatus::FeatureDisabled => { | ||
| log::info!("Feature is disabled at NICo"); | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Inspect the mock API server to see whether the expire-lease status is configurable.
fd -t f 'mock_api_server' crates/dhcp -x sed -n '1,200p' {}
rg -nP 'ExpireDhcpLeaseStatus|ENDPOINT_EXPIRE_DHCP_LEASE|expire_dhcp_lease' crates/dhcp -C3Repository: NVIDIA/infra-controller
Length of output: 12786
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the lease expiration tests and any status configurability in the DHCP mock server.
sed -n '1,240p' crates/dhcp/src/lease_expiration.rs
printf '\n--- SEARCH ---\n'
rg -n 'FeatureDisabled|Released|NotFound|inject_failure|address_overrides|status:' crates/dhcp/src -C 2Repository: NVIDIA/infra-controller
Length of output: 14971
Add a test hook for FeatureDisabled
MockAPIServer still hard-codes ExpireDhcpLeaseStatus::Released, so the new branch cannot be exercised by the existing lease-expiration tests. Make the mock status configurable or add a focused unit test for the status mapping; otherwise this branch remains uncovered.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/dhcp/src/lease_expiration.rs` around lines 105 - 107, The new
FeatureDisabled branch in lease_expiration::expire_dhcp_lease is not reachable
from the current tests because MockAPIServer still returns a hard-coded
ExpireDhcpLeaseStatus::Released. Update the test setup so the mock status is
configurable, or add a focused unit test around the status-to-log/behavior
mapping in MockAPIServer and expire_dhcp_lease, so FeatureDisabled is explicitly
exercised.
Expiring a BMC IP lease causes a mismatch between machine_interface and machine_topologies, and DPF does not yet support BMC IP changes. Block all lease expiry processing until we find a proper solution of cleaning up the IPs and DPF releases the fix.
Related issues
Type of Change
Breaking Changes
Testing
Additional Notes