✨ Support Node Auto Placement and Node AF/AAF #3655
Conversation
- Bump VMOP including Node AF/AAF support
- Add NodeAutoPlacement feature gate
(cherry picked from commit 700c8ae)
(cherry picked from commit cfeb862)
Removes the extra cases for VMG creation, such that a VMG is created for:
1. Multiple zones, multiple MDs with no failureDomain
2. Multiple zones, multiple MDs with failureDomain
3. Single zone, existing cluster with no failureDomain MDs
Signed-off-by: Sagar Muchhal <[email protected]>
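For reference, a minimal sketch of the simplified creation rule described in the commit message above; `shouldCreateVMG` and its parameters are hypothetical names, not CAPV's actual code:

```go
package placement

// shouldCreateVMG sketches the three VMG-creation cases listed in the commit
// message above. The exact predicate and all names are illustrative only.
func shouldCreateVMG(zoneCount, totalMDs, mdsWithoutFailureDomain int) bool {
	// Cases 1 and 2: multiple zones and multiple MachineDeployments,
	// with or without failureDomain set.
	if zoneCount > 1 && totalMDs > 1 {
		return true
	}
	// Case 3: single zone, existing cluster whose MDs have no failureDomain.
	return zoneCount == 1 && totalMDs > 0 && mdsWithoutFailureDomain == totalMDs
}
```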
- Updates VMOP API dependency
- Misc VMG fixes:
  - Use namingStrategy to calculate VM names
  - Use MachineDeployment names for the VMG placement label
  - Include all MachineDeployments to generate the node-pool -> zone mapping
- Fixes VMG webhook validation error:
  - Adds cluster-name label to the AF/AAF spec
  - Re-adds the zone topology key to the anti-affinity spec
Signed-off-by: Sagar Muchhal <[email protected]>
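A hedged sketch of how a namingStrategy-style template could compute VM names; the template data shape (e.g. `{{ .machine.name }}`) is an assumption for illustration, not CAPV's actual contract:

```go
package placement

import (
	"bytes"
	"text/template"
)

// calculateVMName renders a Go text/template against the Machine name, which
// is the general mechanism behind a namingStrategy. The data shape passed to
// the template is hypothetical.
func calculateVMName(nameTemplate, machineName string) (string, error) {
	t, err := template.New("vmName").Parse(nameTemplate)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	data := map[string]any{"machine": map[string]string{"name": machineName}}
	if err := t.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}
```

For example, `calculateVMName("{{ .machine.name }}", "cluster-md-0-abcde")` returns `cluster-md-0-abcde` unchanged.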
…gs#71)
- Refine VMG controller when generating per-MD zone labels:
  - Skip legacy already-placed VMs which do not have placement info
  - Skip VMs which do not have zone info
- Apply suggestions from code review
Co-authored-by: Sagar Muchhal <[email protected]>
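A minimal sketch of the skip rules described above; the member/placement types are trimmed stand-ins for the vm-operator API, and mapping a member to its MachineDeployment name is simplified:

```go
package placement

type vmPlacement struct{ Zone string }

type vmgMember struct {
	MDName    string       // simplified: derived from the VM's MachineDeployment label in the real controller
	Placement *vmPlacement // nil for legacy, already-placed VMs
}

// zoneByMD generates the per-MD zone mapping, skipping members per the rules
// in the commit message above.
func zoneByMD(members []vmgMember) map[string]string {
	out := map[string]string{}
	for _, m := range members {
		// Skip legacy already-placed VMs which do not have placement info.
		if m.Placement == nil {
			continue
		}
		// Skip VMs which do not have zone info yet.
		if m.Placement.Zone == "" {
			continue
		}
		out[m.MDName] = m.Placement.Zone
	}
	return out
}
```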
- Sync VSphereMachines during day-2 operations in the VMG controller
- Only wait for all intended VSphereMachines during initial Cluster creation
- Use annotations in the VMG for per-MD zone info
Signed-off-by: Gong Zhang <[email protected]>
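The annotation lookup for day-2 VM creation could look roughly like this; the prefix value is a placeholder, the real constant being `vmoperator.ZoneAnnotationPrefix` (visible later in this PR's diff):

```go
package placement

import "fmt"

// zoneAnnotationPrefix is a placeholder for vmoperator.ZoneAnnotationPrefix.
const zoneAnnotationPrefix = "<zone-annotation-prefix>"

// zoneForMD reads the per-MD zone recorded as a VMG annotation, so that new
// VMs created during day-2 operations reuse the initial placement decision.
func zoneForMD(vmgAnnotations map[string]string, mdName string) (string, bool) {
	zone, ok := vmgAnnotations[fmt.Sprintf("%s/%s", zoneAnnotationPrefix, mdName)]
	return zone, ok
}
```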
- Add VMG reconciler unit test
- Bump VMOP due to API change
- Filter out VSphereMachine events except create/delete events
Signed-off-by: Gong Zhang <[email protected]>
/test ?
@zhanggbj: The following commands are available to trigger required and optional jobs.
/test pull-cluster-api-provider-vsphere-e2e-govmomi-blocking-main
Signed-off-by: Gong Zhang <[email protected]>
zhanggbj left a comment:
All comments are addressed, including the ones I missed by mistake last time.
Signed-off-by: Gong Zhang <[email protected]>
/test pull-cluster-api-provider-vsphere-e2e-vcsim-supervisor-main
/test pull-cluster-api-provider-vsphere-e2e-govmomi-blocking-main
Hey @fabriziopandini @sbueringer, regarding the concern about race conditions: I have implemented checks to mitigate race conditions during initial and post-placement, and also added unit tests to cover the scenarios.
Pushed a small change and am checking the UT failures.
Signed-off-by: Gong Zhang <[email protected]>
/test pull-cluster-api-provider-vsphere-e2e-supervisor-blocking-main
/test pull-cluster-api-provider-vsphere-e2e-govmomi-blocking-main
Since EncryptionClass requires the API in this newer version, bump the vm-operator package
Signed-off-by: Gong Zhang <[email protected]>
```go
			continue
		}
		if member.Placement != nil && member.Placement.Zone != "" {
			log.Info(fmt.Sprintf("MachineDeployment %s has been placed to failure domanin %s", md.Name, member.Placement.Zone), "MachineDeployment", klog.KObj(&md))
```
nit: Suggested change:
```diff
-			log.Info(fmt.Sprintf("MachineDeployment %s has been placed to failure domanin %s", md.Name, member.Placement.Zone), "MachineDeployment", klog.KObj(&md))
+			log.Info(fmt.Sprintf("MachineDeployment %s has been placed to failure domain %s", md.Name, member.Placement.Zone), "MachineDeployment", klog.KObj(&md))
```
```go
		want bool
	}{
		{
			name: "Should create a VMG if all the expected VSphereMachines exists",
```
Let's add a unit test ensuring that a deleting VSphereMachine blocks VMG creation. I would suggest using this test case as a setup and adding an additional deleting machine.
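Something like this hedged sketch, using plain `testing` instead of the suite's table-test harness; `machine` and `allMachinesReady` are hypothetical stand-ins for the reconciler's real types and gate:

```go
package vmg_test

import (
	"testing"
	"time"
)

// machine is a trimmed stand-in for VSphereMachine.
type machine struct{ deletionTimestamp *time.Time }

// allMachinesReady mimics the gate the reconciler applies before creating a
// VMG: a machine with a deletion timestamp must block creation.
func allMachinesReady(machines []machine) bool {
	for _, m := range machines {
		if m.deletionTimestamp != nil {
			return false
		}
	}
	return true
}

func TestDeletingVSphereMachineBlocksVMGCreation(t *testing.T) {
	now := time.Now()
	machines := []machine{{}, {deletionTimestamp: &now}} // one healthy, one deleting
	if allMachinesReady(machines) {
		t.Fatal("expected a deleting VSphereMachine to block VMG creation")
	}
}
```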
```go
				clusterv1.ClusterNameLabel: cluster.Name,
			},
			Annotations: map[string]string{
				fmt.Sprintf("%s/%s", vmoperator.ZoneAnnotationPrefix, "md3"): "zone1", // failureDomain for md3 is explicitly set by the user
```
Let's add an annotation without the ZoneAnnotationPrefix and check that it is preserved.
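An illustrative shape for that check; `mergeAnnotations` is a hypothetical stand-in for the reconciler's annotation update:

```go
package vmg_test

import "testing"

// mergeAnnotations preserves existing annotations and overlays the computed
// per-MD zone annotations — the behavior the test should pin down.
func mergeAnnotations(existing, computed map[string]string) map[string]string {
	out := map[string]string{}
	for k, v := range existing {
		out[k] = v // preserve annotations the controller does not own
	}
	for k, v := range computed {
		out[k] = v
	}
	return out
}

func TestAnnotationWithoutZonePrefixIsPreserved(t *testing.T) {
	existing := map[string]string{"example.com/custom": "keep-me"} // no ZoneAnnotationPrefix
	computed := map[string]string{"<zone-annotation-prefix>/md1": "zone1"}
	if got := mergeAnnotations(existing, computed); got["example.com/custom"] != "keep-me" {
		t.Fatal("expected annotation without the ZoneAnnotationPrefix to be preserved")
	}
}
```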
```go
			Namespace: cluster.Namespace,
			Name:      cluster.Name,
			UID:       types.UID("uid"),
			Labels: map[string]string{
```
Let's add a label different from the ClusterNameLabel and check that it is preserved.
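Same pattern for labels, as a short sketch; `mergeLabels` is hypothetical, while `cluster.x-k8s.io/cluster-name` is the actual value of `clusterv1.ClusterNameLabel`:

```go
package vmg_test

import "testing"

func mergeLabels(existing, managed map[string]string) map[string]string {
	out := map[string]string{}
	for k, v := range existing {
		out[k] = v // preserve labels the controller does not own
	}
	for k, v := range managed {
		out[k] = v
	}
	return out
}

func TestLabelOtherThanClusterNameLabelIsPreserved(t *testing.T) {
	got := mergeLabels(
		map[string]string{"example.com/team": "keep-me"},
		map[string]string{"cluster.x-k8s.io/cluster-name": "c1"},
	)
	if got["example.com/team"] != "keep-me" {
		t.Fatal("expected a label other than ClusterNameLabel to be preserved")
	}
}
```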
```go
				Reason:  infrav1.VSphereMachineVirtualMachineWaitingForVirtualMachineGroupV1Beta2Reason,
				Message: fmt.Sprintf("Waiting for VSphereMachine's VirtualMachineGroup %s to exist", key),
			})
			log.V(4).Info(fmt.Sprintf("Waiting for VirtualMachineGroup %s, requeueing", key.Name), "VirtualMachineGroup", klog.KRef(key.Namespace, key.Name))
```
Suggested change:
```diff
-			log.V(4).Info(fmt.Sprintf("Waiting for VirtualMachineGroup %s, requeueing", key.Name), "VirtualMachineGroup", klog.KRef(key.Namespace, key.Name))
+			log.V(4).Info(fmt.Sprintf("Waiting for VirtualMachineGroup %s to exist, requeueing", key.Name), "VirtualMachineGroup", klog.KRef(key.Namespace, key.Name))
```
```go
	}

	// Set the zone label using the annotation of the per-md zone mapping from VirtualMachineGroup.
	// This is for new VMs created during day-2 operations when Node Auto Placement is enabled.
```
Suggested change:
```diff
-	// This is for new VMs created during day-2 operations when Node Auto Placement is enabled.
+	// This is for new VMs created after initial placement decision/with a failureDomain defined by the user.
```
```go
		client.MatchingLabels{clusterv1.ClusterNameLabel: supervisorMachineCtx.Cluster.Name}); err != nil {
		return false, err
	}
	othermMDNames := []string{}
```
nit: Suggested change:
```diff
-	othermMDNames := []string{}
+	otherMDNames := []string{}
```
```go
	})

	// control plane machine is the machine with the control plane label set
	Specify("Reconcile valid control plane Machine", func() {
```
Let's add a check that the CP machine is placed in the expected zone
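A hedged sketch of the assertion, using plain `testing` rather than the suite's Ginkgo style; reading the zone from the well-known `topology.kubernetes.io/zone` label is an assumption about where it surfaces:

```go
package vmg_test

import "testing"

func TestControlPlaneVMPlacedInExpectedZone(t *testing.T) {
	// In the real suite these labels would be read back from the reconciled VM
	// via the fake client after the "Reconcile valid control plane Machine" step.
	vmLabels := map[string]string{"topology.kubernetes.io/zone": "zone-1"}
	const expectedZone = "zone-1"
	if got := vmLabels["topology.kubernetes.io/zone"]; got != expectedZone {
		t.Fatalf("control plane VM placed in zone %q, want %q", got, expectedZone)
	}
}
```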
```go
		vmGroup *vmoprv1.VirtualMachineGroup
	)

	BeforeEach(func() {
```
Looks like we are missing a test for the case where the VMG has not been created yet.
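For the missing case, a minimal sketch; `shouldWaitForVMG` is a hypothetical stand-in for the reconciler's gate:

```go
package vmg_test

import "testing"

// shouldWaitForVMG mirrors the behavior under test: when the
// VirtualMachineGroup does not exist yet, reconciliation must wait/requeue
// instead of creating the VirtualMachine.
func shouldWaitForVMG(vmgExists bool) bool {
	return !vmgExists
}

func TestReconcileWaitsWhenVMGNotCreatedYet(t *testing.T) {
	if !shouldWaitForVMG(false) {
		t.Fatal("expected reconcile to wait while the VirtualMachineGroup does not exist")
	}
}
```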
```go
	machineNotMember.Spec.Bootstrap.DataSecretName = &secretName

	By("VirtualMachine is not created")
```
Let's check that the VM actually does not exist, not only the reconcile result.
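The stronger assertion could use a NotFound check on an actual Get, roughly like this; the apimachinery calls (`apierrors.IsNotFound`, `apierrors.NewNotFound`) are real, while the fixture values are illustrative:

```go
package vmg_test

import (
	"testing"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// assertVMAbsent takes the error returned by client.Get for the VM and
// requires it to be NotFound, rather than trusting the reconcile result alone.
func assertVMAbsent(t *testing.T, getErr error) {
	t.Helper()
	if getErr == nil {
		t.Fatal("VirtualMachine unexpectedly exists")
	}
	if !apierrors.IsNotFound(getErr) {
		t.Fatalf("expected NotFound, got: %v", getErr)
	}
}

func TestVirtualMachineIsNotCreated(t *testing.T) {
	// Simulate what a fake client returns when the VM was never created.
	gr := schema.GroupResource{Group: "vmoperator.vmware.com", Resource: "virtualmachines"}
	assertVMAbsent(t, apierrors.NewNotFound(gr, "vm-1"))
}
```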
What this PR does / why we need it:
Support Node Auto Placement and Node AF/AAF

Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #