/kind bug
What steps did you take and what happened:
In a “hosted control plane” setup (where the control plane runs outside of OpenStack, and only worker nodes are provisioned in OpenStack), OpenStackCluster.Status.Network can remain nil. Currently, the CAPO code in OpenStackMachineReconciler.getOrCreateMachineServer() assumes openStackCluster.Status.Network is always non-nil. This leads to a nil pointer dereference (panic) when calling:
    machineServerSpec := openStackMachineSpecToOpenStackServerSpec(
        &openStackMachine.Spec,
        identityRef,
        compute.InstanceTags(&openStackMachine.Spec, openStackCluster),
        failureDomain,
        userDataRef,
        getManagedSecurityGroup(openStackCluster, machine),
        openStackCluster.Status.Network.ID, // <- panic if .Network is nil
    )
The controller then crashes, making it impossible to provision worker nodes.
- In hosted control plane (HCP) scenarios, there is no control-plane node running in OpenStack, so CAPO never populates OpenStackCluster.Status.Network.
- The machine reconciliation panics in openstackmachine_controller.go due to a nil pointer dereference on openStackCluster.Status.Network.ID.
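The failure mode can be shown in isolation. The sketch below is not CAPO code: the types are stripped-down stand-ins that mirror only the fields named in this issue, and clusterNetworkID is a hypothetical helper. It shows how a nil check on Status.Network turns the panic into a value the caller can act on.

```go
package main

import "fmt"

// Stand-in types that mirror only the fields discussed in this issue.
// The real CAPO API types carry many more fields.
type NetworkStatus struct {
	ID string
}

type OpenStackClusterStatus struct {
	Network *NetworkStatus // nil when CAPO never populated it
}

type OpenStackCluster struct {
	Status OpenStackClusterStatus
}

// clusterNetworkID is a hypothetical helper. It returns the network ID and
// true when Status.Network is set, or "" and false instead of panicking.
func clusterNetworkID(c *OpenStackCluster) (string, bool) {
	if c == nil || c.Status.Network == nil {
		return "", false
	}
	return c.Status.Network.ID, true
}

func main() {
	// Hosted control plane case: Status.Network was never populated.
	hosted := &OpenStackCluster{}
	if id, ok := clusterNetworkID(hosted); ok {
		fmt.Println("network ID:", id)
	} else {
		fmt.Println("status.network not set; cannot build the server spec yet")
	}
}
```

Accessing hosted.Status.Network.ID directly in this program would trigger the same "invalid memory address or nil pointer dereference" runtime error seen in the logs.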
Logs:
I0116 03:44:55.377796 1 openstackmachine_controller.go:361] "Reconciling Machine" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="kcm-system/openstack-dev-hosted-cp-md-fcpqk-8l4q5" namespace="kcm-system" name="openstack-dev-hosted-cp-md-fcpqk-8l4q5" reconcileID="b00cfcbb-ae39-4bb9-aa87-0bcde7cb350d" openStackMachine="openstack-dev-hosted-cp-md-fcpqk-8l4q5" machine="openstack-dev-hosted-cp-md-fcpqk-8l4q5" cluster="openstack-dev-hosted-cp" openStackCluster="openstack-dev-hosted-cp"
I0116 03:44:55.378942 1 controller.go:110] "Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="kcm-system/openstack-dev-hosted-cp-md-fcpqk-8l4q5" namespace="kcm-system" name="openstack-dev-hosted-cp-md-fcpqk-8l4q5" reconcileID="b00cfcbb-ae39-4bb9-aa87-0bcde7cb350d"
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x1baafba]
goroutine 357 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:111 +0x1e5
panic({0x1dccbe0?, 0x362a670?})
/usr/local/go/src/runtime/panic.go:770 +0x132
sigs.k8s.io/cluster-api-provider-openstack/controllers.(*OpenStackMachineReconciler).getOrCreateMachineServer(0xc00043a2a0, {0x2440550, 0xc0005b7230}, 0xc0004deb08, 0xc0006bc508, 0xc0008fa008)
/workspace/controllers/openstackmachine_controller.go:586 +0x35a
sigs.k8s.io/cluster-api-provider-openstack/controllers.(*OpenStackMachineReconciler).reconcileMachineServer(0x24467c8?, {0x2440550?, 0xc0005b7230?}, 0xc0006d1560, 0x13?, 0x0?, 0x0?)
/workspace/controllers/openstackmachine_controller.go:544 +0x3d
sigs.k8s.io/cluster-api-provider-openstack/controllers.(*OpenStackMachineReconciler).reconcileNormal(0xc00043a2a0, {0x2440550, 0xc0005b7230}, 0xc0006d1560, {0xc000059500, 0x22}, 0xc0004deb08, 0xc0008fa008, 0xc0006bc508)
/workspace/controllers/openstackmachine_controller.go:363 +0x178
sigs.k8s.io/cluster-api-provider-openstack/controllers.(*OpenStackMachineReconciler).Reconcile(0xc00043a2a0, {0x2440550, 0xc0005b7230}, {{{0xc0006b5576?, 0x0?}, {0xc00059d050?, 0xc0008f1d10?}}})
/workspace/controllers/openstackmachine_controller.go:161 +0xbd8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x24467c8?, {0x2440550?, 0xc0005b7230?}, {{{0xc0006b5576?, 0xb?}, {0xc00059d050?, 0x0?}}})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0004f2160, {0x2440588, 0xc00022f810}, {0x1e96420, 0xc00003d920})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311 +0x3bc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0004f2160, {0x2440588, 0xc00022f810})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:261 +0x1be
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:222 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 203
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:218 +0x486
What did you expect to happen:
That CAPO would handle the absence of status.network gracefully (e.g. by marking the OpenStackMachine with a condition or requeueing) rather than panicking.
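A rough sketch of that behaviour, under stated assumptions: every type here is a simplified stand-in (Result only imitates the shape of a controller-runtime reconcile result), and the condition type, reason, and requeue delay are hypothetical names chosen for illustration, not existing CAPO constants.

```go
package main

import (
	"fmt"
	"time"
)

// Simplified stand-ins for the API types; not the real CAPO definitions.
type NetworkStatus struct{ ID string }

type OpenStackCluster struct {
	Status struct{ Network *NetworkStatus }
}

type Condition struct {
	Type, Status, Reason, Message string
}

type OpenStackMachine struct {
	Conditions []Condition
}

// Result imitates the shape of a reconcile result: a non-zero RequeueAfter
// asks for the object to be reconciled again later.
type Result struct {
	RequeueAfter time.Duration
}

// requeueDelay is an assumed value for this sketch.
const requeueDelay = 30 * time.Second

// reconcileServer is a hypothetical stand-in for the step that builds the
// server spec. It records a condition and requeues instead of dereferencing
// a nil Status.Network.
func reconcileServer(cluster *OpenStackCluster, machine *OpenStackMachine) (Result, error) {
	if cluster.Status.Network == nil {
		machine.Conditions = append(machine.Conditions, Condition{
			Type:    "NetworkReady",         // hypothetical condition type
			Status:  "False",
			Reason:  "ClusterNetworkNotSet", // hypothetical reason
			Message: "OpenStackCluster.Status.Network is not populated",
		})
		return Result{RequeueAfter: requeueDelay}, nil
	}
	// Safe to read the ID here; the real code would build the server spec.
	fmt.Println("building server spec for network", cluster.Status.Network.ID)
	return Result{}, nil
}

func main() {
	machine := &OpenStackMachine{}
	res, err := reconcileServer(&OpenStackCluster{}, machine)
	fmt.Println(res.RequeueAfter, err, machine.Conditions[0].Reason)
}
```

In a hosted control plane setup the network may never be populated, so requeueing alone would retry indefinitely; the condition at least makes the cause visible on the OpenStackMachine. Whether the ID should instead come from another source is a design decision for the maintainers.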
Environment:
- Cluster API Provider OpenStack version (or git rev-parse HEAD if manually built):
- Cluster-API version:
- OpenStack version:
- Minikube/KIND version:
- Kubernetes version (use kubectl version):
- OS (e.g. from /etc/os-release):