OSAC-1347: Add BareMetalInstance reconciler for bare-metal-fulfillment-operator#703
OSAC-1347: Add BareMetalInstance reconciler for bare-metal-fulfillment-operator#703carbonin wants to merge 2 commits into
Conversation
|
@carbonin: This pull request references OSAC-1347 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: carbonin The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Added a hold until #683 and the installer PR are merged. |
|
Note Currently processing new changes in this PR. This may take a few minutes, please wait... ⚙️ Run configurationConfiguration used: Repository: osac-project/coderabbit/.coderabbit.yaml Review profile: ASSERTIVE Plan: Enterprise Run ID: 📒 Files selected for processing (2)
WalkthroughIntroduces a new ChangesBareMetalInstance Reconciler Feature
Sequence Diagram(s)sequenceDiagram
participant Controller as Controller Command
participant FunctionBuilder
participant run as function.run
participant hubCache
participant hubK8sClient as Hub K8s Client
participant catalogClient as Catalog Client
participant bareMetalInstancesClient as gRPC BareMetalInstancesClient
Controller->>FunctionBuilder: Build() → ReconcilerFunction
Controller->>run: run(ctx, BareMetalInstance)
run->>run: clone instance
alt deletionTimestamp set
run->>hubCache: getHub(hubID)
run->>hubK8sClient: Get / Delete BareMetalInstance CR
run->>run: removeFinalizer
else provisioning / update
run->>run: addFinalizer, setDefaults, validateTenant
run->>hubCache: selectHub / getHub
run->>hubK8sClient: List by UUID label
alt CR not found
run->>hubK8sClient: Create BareMetalInstance CR
else CR exists
run->>hubK8sClient: Patch Spec
end
run->>catalogClient: Get (template ID, runStrategy)
run->>hubK8sClient: Create user-data Secret (if present)
end
run->>bareMetalInstancesClient: Update(ctx, instance, updateMask)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Suggested reviewers
Poem
Caution Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error)
✅ Passed checks (10 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip CodeRabbit can generate a title for your PR based on the changes.Add |
|
Actually leaving this as a draft until the dependencies are merged at least |
|
Installer PR osac-project/osac-installer#275 |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@go.mod`:
- Line 34: Update the golang.org/x/crypto dependency from v0.51.0 to v0.52.0 in
the go.mod file to address four published CVEs affecting the current version:
GO-2026-5013 (SSH AES-GCM decoder panic), GO-2026-5019 (FIDO/U2F security key
bypass), GO-2026-5021 (SSH knownhosts revoked status bypass), and GO-2026-5018
(RSA/DSA key DoS). Change the version number on the golang.org/x/crypto
dependency line from v0.51.0 to v0.52.0.
In
`@internal/controllers/baremetalinstance/baremetalinstance_reconciler_function_test.go`:
- Around line 113-115: The test file
baremetalinstance_reconciler_function_test.go has formatting issues detected by
gofmt that need to be corrected. Run gofmt on the entire test file to
automatically fix all formatting drift and ensure it conforms to Go formatting
standards. This will resolve the CI/lint failures related to code formatting.
In
`@internal/controllers/baremetalinstance/baremetalinstance_reconciler_function.go`:
- Around line 55-60: The file
internal/controllers/baremetalinstance/baremetalinstance_reconciler_function.go
is not properly formatted according to Go standards and requires gofmt to be
applied. Run gofmt on this file to automatically fix all formatting issues,
which will ensure the code adheres to the standard Go formatting conventions and
prevent CI/lint failures.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: osac-project/coderabbit/.coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 4ab76326-93ab-4d58-aa20-8a1e6e8521d3
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (7)
go.modinternal/cmd/service/start/controller/start_controller_cmd.gointernal/controllers/baremetalinstance/baremetalinstance_reconciler_function.gointernal/controllers/baremetalinstance/baremetalinstance_reconciler_function_test.gointernal/controllers/baremetalinstance/baremetalinstance_suite_test.gointernal/kubernetes/labels/kubernetes_labels.gointernal/kubernetes/scheme/scheme.go
| poweredOn := true | ||
| spec.PoweredOn = &poweredOn |
There was a problem hiding this comment.
| poweredOn := true | |
| spec.PoweredOn = &poweredOn | |
| spec.PoweredOn = new(true) |
nit: this new thing from go 1.26 should do the trick
|
I think there are a few gaps in how this reconciler creates the BareMetalInstance CR, comparing with existing bare-metal-fulfillment-operator:
|
@adriengentil where do you feel like these should come from? We don't have any template overrides defined in the API, right?
Thanks, I'll keep an eye on this PR and update when it merges.
This feels like an implementation detail of the baremetal-fulfilment-operator. I don't think fulfillment-service should know these kinds of details, does it make sense for these two labels to be a sensible default for a new BareMetalInstance without any other labels? |
Yeah, that makes sense. I'll open up an issue to take care of it from the |
…t-operator Add a reconciler that watches BareMetalInstance objects in the fulfillment service database and creates corresponding BareMetalInstance CRs in hub Kubernetes clusters, following the same pattern as the existing ComputeInstance reconciler. The reconciler handles the full lifecycle: - Finalizer management to prevent premature deletion - Hub selection and caching - Catalog item resolution to obtain the template ID - BareMetalInstance CR creation and patching in hub clusters - Run strategy mapping (ALWAYS/HALTED to poweredOn bool) - User data Secret creation with owner references - Deletion flow with decommissioned hub handling The template ID is passed through from the catalog item's template reference, matching the ComputeInstance pattern where the admin sets a meaningful template identifier (e.g. osac.templates.gpu_host) that flows through to the CRD and ultimately to AAP extra vars. Also registers the bare-metal-fulfillment-operator CRD types in the hub scheme so the controller-runtime client can work with them. Resolves https://redhat.atlassian.net/browse/OSAC-1347 Assisted-by: Claude Code <noreply@anthropic.com> Signed-off-by: Nick Carboni <ncarboni@redhat.com>
ba0d5e0 to
8a2ec25
Compare
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@internal/controllers/baremetalinstance/baremetalinstance_reconciler_function_test.go`:
- Around line 511-512: The NestedMap call in the test ignores both the found and
err return values, then directly accesses stringData[userDataSecretKey], which
will panic if stringData is nil when the field is missing or malformed. Check
both the found and err return values from the unstructured.NestedMap call and
add appropriate assertions (such as Expect(found).To(BeTrue()) and
Expect(err).NotTo(HaveOccurred())) before attempting to access
stringData[userDataSecretKey] to ensure the test fails with a clear error
message rather than panicking.
In
`@internal/controllers/baremetalinstance/baremetalinstance_reconciler_function.go`:
- Around line 173-175: The user-data secret handling currently only creates
secrets and ignores updates, and doesn't clean up removed secrets. The logic at
lines 173-175 only sets the secret name when user data is present, leaving stale
secrets behind when user data is removed. Fix this by: (1) updating the secret
creation logic around lines 480-520 to use create-or-patch semantics instead of
returning success on AlreadyExists, so that changes to spec.userData properly
update the existing Secret, and (2) adjusting lines 173-175 to always set the
secret name (regardless of whether current user data is present) so it can be
tracked for cleanup, and adding separate logic to delete the secret when user
data is removed from the spec.
- Around line 44-46: Remove the hardcoded `defaultHostType` constant and update
the `buildSpec()` function to resolve the host type from actual data sources.
Instead of always using `"default"`, derive the host type from the template
proto (once it's modeled) or from the catalog item metadata that is already
being fetched. Add validation to ensure the resolved host type exists in the
hub's inventory before assigning it to HostType in the spec. If the host type
cannot be resolved or does not exist on the hub, fail fast and return an error
rather than emitting an unschedulable CR with a placeholder value.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: osac-project/coderabbit/.coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: ccd4e52e-021f-4f74-957c-9e3c414e006f
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (7)
go.modinternal/cmd/service/start/controller/start_controller_cmd.gointernal/controllers/baremetalinstance/baremetalinstance_reconciler_function.gointernal/controllers/baremetalinstance/baremetalinstance_reconciler_function_test.gointernal/controllers/baremetalinstance/baremetalinstance_suite_test.gointernal/kubernetes/labels/kubernetes_labels.gointernal/kubernetes/scheme/scheme.go
Set HostType to a hardcoded "default" placeholder until host type is modeled in the BareMetalInstanceTemplate proto. Build TemplateParameters as a JSON object containing sshKey (the SSH public key value) and userDataSecret (the name of the user data Secret in the hub namespace) when present. Keys use camelCase to match the existing AAP template parameter convention (imageURL, provisioningNetwork, etc.). The AAP provisioning role parses templateParameters as JSON to access these values. Assisted-by: Claude Code <noreply@anthropic.com> Signed-off-by: Nick Carboni <ncarboni@redhat.com>
|
@coderabbitai review |
Action performedReview triggered.
|
Add a reconciler that watches BareMetalInstance objects in the fulfillment service database and creates corresponding BareMetalInstance CRs in hub Kubernetes clusters, following the same pattern as the existing ComputeInstance reconciler.
The reconciler handles the full lifecycle:
The template ID is passed through from the catalog item's template reference, matching the ComputeInstance pattern where the admin sets a meaningful template identifier (e.g. osac.templates.gpu_host) that flows through to the CRD and ultimately to AAP extra vars.
Also registers the bare-metal-fulfillment-operator CRD types in the hub scheme so the controller-runtime client can work with them.
Test Steps
Prerequisites
Tested on a locally deployed osac instance:
E2E: Create flow
- Valid SSH key
- Cloud-init user data (#cloud-config\npackages:\n - curl)
- run_strategy: BARE_METAL_INSTANCE_RUN_STRATEGY_HALTED
Verified:
E2E: Update flow
Verified:
E2E: Delete flow
Verified:
Resolves https://redhat.atlassian.net/browse/OSAC-1347
Depends on #683
Assisted-by: Claude Code noreply@anthropic.com
Summary by CodeRabbit