🐛 Fix OpenStackServer reconciliation stuck when cluster is unpaused #2833
base: main
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: EmilienM
/ok-to-test
    // Don't handle deleted clusters
    if !c.DeletionTimestamp.IsZero() {
    	log.V(4).Info("Cluster has a deletion timestamp, skipping mapping.")
    	return nil
    }
Don't we need to clean up if it is deleting? I am thinking that there could be a situation where the user paused the cluster, then deleted it (forgetting to unpause first), and then unpaused it.
hmm, to address this edge case, here's what I did:
I've moved the pause check after the deletion timestamp check (similar to OpenStackCluster controller). Now the reconciliation flow is:
- Check if server is being deleted -> proceed with deletion (regardless of pause state)
- If not deleted, check if cluster is paused -> skip reconciliation
- Otherwise, proceed normally
This ensures deletion always proceeds even if the cluster was paused when deletion started, then later unpaused.
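For reference, the ordering described above could be sketched roughly like this. It is only a minimal illustration, not the controller code from this PR: `server`, `cluster`, `action`, and `nextAction` are hypothetical names, and the real reconciler works on the API objects rather than on plain booleans.

```go
package main

import "fmt"

// server and cluster are hypothetical stand-ins for the OpenStackServer and
// its owning Cluster. They carry only the fields the ordering depends on.
type server struct {
	Deleting bool // true when the server has a deletion timestamp
}

type cluster struct {
	Paused bool
}

type action string

const (
	actionDelete    action = "delete"
	actionSkip      action = "skip"
	actionReconcile action = "reconcile"
)

// nextAction applies the checks in the order from the comment above:
// deletion first (regardless of pause state), then the pause check,
// then normal reconciliation.
func nextAction(s server, c cluster) action {
	if s.Deleting {
		return actionDelete
	}
	if c.Paused {
		return actionSkip
	}
	return actionReconcile
}

func main() {
	// Paused cluster, server being deleted: deletion still proceeds.
	fmt.Println(nextAction(server{Deleting: true}, cluster{Paused: true}))
	// Paused cluster, server not being deleted: reconciliation is skipped.
	fmt.Println(nextAction(server{}, cluster{Paused: true}))
	// Neither paused nor deleting: normal reconciliation.
	fmt.Println(nextAction(server{}, cluster{}))
}
```

With the pause check placed first instead, the first case would return early and the server would never be cleaned up while the cluster stays paused.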
@lentzi90 ptal ^^
Signed-off-by: Bharath Nallapeta <[email protected]>
Force-pushed from dd01cc9 to 828af4f.
I am a bit worried that this will break clusterctl move. We will probably need to handle all these cases:
The full test is testing clusterctl move, so let's check
@bnallapeta: The following test failed, say
What this PR does / why we need it:
When a cluster is paused (e.g., during a pivot operation), OpenStackServer resources stop reconciling. However, when the cluster is unpaused, they don't resume because the controller doesn't watch for cluster pause/unpause events.
This PR adds a watch on Cluster resources so OpenStackServers are re-queued when their parent cluster transitions from paused to unpaused state.
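As a rough illustration of the idea (not the code in this PR), the re-queue decision can be modeled as: react only to Cluster updates that go from paused to unpaused, and map that event to the servers belonging to the cluster. Everything in this sketch is hypothetical, including `clusterState`, `unpaused`, `serversToRequeue`, and the `serversByCluster` lookup map. An actual controller would express this through its watch and event-mapping machinery rather than plain maps.

```go
package main

import "fmt"

// clusterState is a hypothetical, trimmed-down view of a Cluster object.
type clusterState struct {
	Name   string
	Paused bool
}

// unpaused reports whether an update event is a paused -> unpaused transition.
func unpaused(oldC, newC clusterState) bool {
	return oldC.Paused && !newC.Paused
}

// serversToRequeue returns the names of the servers that should be re-queued
// for a Cluster update event. It returns nil unless the cluster was just
// unpaused. serversByCluster is an assumed lookup from cluster name to the
// servers it owns.
func serversToRequeue(oldC, newC clusterState, serversByCluster map[string][]string) []string {
	if !unpaused(oldC, newC) {
		return nil
	}
	return serversByCluster[newC.Name]
}

func main() {
	servers := map[string][]string{
		"workload-1": {"server-a", "server-b"},
	}

	paused := clusterState{Name: "workload-1", Paused: true}
	running := clusterState{Name: "workload-1", Paused: false}

	// Paused -> unpaused: both servers are re-queued.
	fmt.Println(serversToRequeue(paused, running, servers))
	// Unpaused -> paused: nothing is re-queued.
	fmt.Println(serversToRequeue(running, paused, servers))
}
```

Filtering on the transition rather than on every Cluster update keeps unrelated Cluster changes from triggering extra reconciles.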
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #2824
TODOs:
/hold