fix: normalize CSI empty-zone topology value to regional zone "0" #1735

Open. andyzhangx wants to merge 1 commit into Azure:main from andyzhangx:feat/normalize-csi-zone-value

Register a NormalizedLabelValues mapping so that when the Azure Disk CSI topology key (topology.disk.csi.azure.com/zone) is normalized to the well-known topology key (topology.kubernetes.io/zone), the empty string value ("") used by the CSI driver for non-zonal disks is translated to "0" (the fault domain value cloud-provider-azure assigns to regional VMs).

Problem

When the Azure Disk CSI driver creates a PV for a non-zonal (regional) disk, it sets:

nodeAffinity:
  required:
    nodeSelectorTerms:
    - matchExpressions:
      - key: topology.disk.csi.azure.com/zone
        operator: In
        values:
        - ""

Karpenter normalizes this key to topology.kubernetes.io/zone but preserves the value "". Regional nodes in Karpenter offerings, however, use the zone value "0" (from cloud-provider-azure). Because "" is not in the set {"0"}, no offering matches, and pods with non-zonal PVCs remain pending indefinitely.
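
The mismatch can be illustrated with a minimal, self-contained sketch. It does not use Karpenter's actual scheduling code; `offeringMatches` and the offering set are hypothetical names introduced only to show why an `In [""]` requirement cannot be satisfied by an offering whose zone is "0".

```go
package main

import "fmt"

// offeringMatches is a hypothetical helper: it reports whether any value
// allowed by the PV's zone requirement appears among the zones that the
// available offerings advertise.
func offeringMatches(allowed []string, offeringZones map[string]bool) bool {
	for _, v := range allowed {
		if offeringZones[v] {
			return true
		}
	}
	return false
}

func main() {
	// Zone value carried over unchanged from the CSI key for a non-zonal disk.
	pvZones := []string{""}
	// Assumed offering set containing only a regional offering (zone "0").
	offerings := map[string]bool{"0": true}

	fmt.Println(offeringMatches(pvZones, offerings)) // false
}
```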

Fix

Add value normalization in init() using the new NormalizedLabelValues API:

karpv1.NormalizedLabelValues[corev1.LabelTopologyZone] = map[string]string{"": zones.Regional}
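
As a rough sketch of the intended effect, the snippet below models key normalization followed by value normalization with plain maps. It is not the implementation from kubernetes-sigs/karpenter: `normalizedLabels`, `normalizedLabelValues`, `normalize`, and `regionalZone` are stand-ins invented for illustration, and `regionalZone` is assumed to equal `zones.Regional` ("0"), as the description above implies.

```go
package main

import "fmt"

const (
	csiZoneKey       = "topology.disk.csi.azure.com/zone"
	wellKnownZoneKey = "topology.kubernetes.io/zone"
	regionalZone     = "0" // assumed stand-in for zones.Regional
)

// Hypothetical stand-ins for the key and value normalization tables.
var normalizedLabels = map[string]string{csiZoneKey: wellKnownZoneKey}

var normalizedLabelValues = map[string]map[string]string{
	wellKnownZoneKey: {"": regionalZone},
}

// normalize maps a key to its well-known form, then rewrites the value
// if a value mapping is registered for the normalized key.
func normalize(key, value string) (string, string) {
	if k, ok := normalizedLabels[key]; ok {
		key = k
	}
	// Indexing a missing (nil) inner map is safe and yields ok == false.
	if v, ok := normalizedLabelValues[key][value]; ok {
		value = v
	}
	return key, value
}

func main() {
	// Non-zonal disk: the empty value is translated to the regional zone.
	k, v := normalize(csiZoneKey, "")
	fmt.Printf("%s=%q\n", k, v)

	// Zonal disk: values without a registered mapping pass through unchanged.
	k, v = normalize(csiZoneKey, "eastus-1")
	fmt.Printf("%s=%q\n", k, v)
}
```

In this sketch only the empty string is remapped, so zonal values keep their original form and continue to match zonal offerings.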

Dependencies

Depends on: kubernetes-sigs/karpenter#3104 (NormalizedLabelValues support)