After node restart multus fails to create interface #1361

Open
EdwardCooke opened this issue Nov 25, 2024 · 1 comment

@EdwardCooke

What happened:
After a node reboot, Multus doesn't call the OVS CNI to create the interface in a VM created by KubeVirt.

What you expected to happen:
The interface is created.

How to reproduce it (as minimally and precisely as possible):
Reboot the node

Anything else we need to know?:
I installed Multus using the cluster network addons operator. After restarting the Multus pods, everything started working again.
The network addons config file:

apiVersion: networkaddonsoperator.network.kubevirt.io/v1
kind: NetworkAddonsConfig
metadata:
  name: cluster
spec:
  imagePullPolicy: IfNotPresent
  multus: {}
  ovs: {}

Environment:
kubeadm 1.31, Ubuntu 24.04, Cilium is the primary CNI

  • Multus version
    image path and image ID (from 'docker images') ghcr.io/k8snetworkplumbingwg/multus-cni@sha256:c8bfe5bad3b5371a5677feb9e8e162da91b61bcac409c244f6f1b18c801ad006
  • Kubernetes version (use kubectl version):
Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.31.3
  • Primary CNI for Kubernetes cluster: Cilium
  • OS (e.g. from /etc/os-release):
PRETTY_NAME="Ubuntu 24.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.1 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo
  • File of '/etc/cni/net.d/'
file: 00-multus.conf
{"cniVersion":"0.3.1","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","clusterNetwork":"/host/etc/cni/net.d/05-cilium.conflist","type":"multus-shim"}
file: 05-cilium.conflist
{
  "cniVersion": "0.3.1",
  "name": "cilium",
  "plugins": [
    {
       "type": "cilium-cni",
       "enable-debug": false,
       "log-file": "/var/run/cilium/cilium-cni.log"
    }
  ]
}
  • File of '/etc/cni/multus/net.d': Non existent
  • NetworkAttachment info (use kubectl get net-attach-def -o yaml)
apiVersion: v1
items:
- apiVersion: k8s.cni.cncf.io/v1
  kind: NetworkAttachmentDefinition
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"k8s.cni.cncf.io/v1","kind":"NetworkAttachmentDefinition","metadata":{"annotations":{},"name":"kube-test","namespace":"virtual-machines"},"spec":{"config":"{\n  \"cniVersion\": \"0.3.1\",\n  \"type\": \"ovs\",\n  \"bridge\": \"br0\",\n  \"vlan\": 11\n}\n"}}
    creationTimestamp: "2024-11-22T05:44:45Z"
    generation: 2
    name: kube-test
    namespace: virtual-machines
    resourceVersion: "1025311"
    uid: 36992f9b-7e88-47ff-813d-1c8cc20536a9
  spec:
    config: |
      {
        "cniVersion": "0.4.0",
        "type": "ovs",
        "bridge": "br0",
        "vlan": 11
      }
kind: List
metadata:
  resourceVersion: ""
  • Target pod yaml info (with annotation, use kubectl get pod <podname> -o yaml)
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: ubuntu
  namespace: virtual-machines
spec:
  runStrategy: Always
  template:
    metadata:
      creationTimestamp: null
    spec:
      nodeSelector:
        kubernetes.io/hostname: kubevirt1-01
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: containerdisk
          - disk:
              bus: virtio
            name: cloudinit
          rng: {}
          interfaces:
          # - bridge: {}
          #   name: default
          - bridge: {}
            name: multus
        resources:
          requests:
            memory: 1Gi
      networks:
      # - name: default
      #   pod: {}
      - name: multus
        multus:
          networkName: kube-test
      terminationGracePeriodSeconds: 180
      volumes:
      - containerDisk:
          image: quay.io/containerdisks/ubuntu:24.04
        name: containerdisk
      - cloudInitNoCloud:
          networkData: |
            ethernets:
              enp1s0:
                dhcp4: false
                dhcp6: false
                addresses:
                - 10.3.0.2/24
                gateway4: 10.3.0.1
                nameservers:
                  addresses:
                  - 10.0.0.2
                  - 10.0.0.3
                  search:
                  - cookes.io
            version: 2
          userData: |-
            #cloud-config
            # The default username is: ubuntu
            passwd: $6$ZCyTJ6px$wRmt8SvGuMLr2GiFnGcHVD/viAVADapuwsUlhtYRw2c/nwHOT3KnTKRbPjGy8by0bj5bQn8U7scPn.jLpqL.h/=
            ssh_authorized_keys:
            - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIME7wWH/vuPP8EfWufGy47HR/aTRK1g4anbBEEUw8DPE veccsolutions\\edward@desktop
        name: cloudinit
  • Other log outputs (if you use multus logging)
    Multus logs:
2024-11-25T16:43:41Z [verbose] multus-daemon started
2024-11-25T16:43:41Z [verbose] server configured with chroot: /hostroot
2024-11-25T16:43:41Z [verbose] Filtering pod watch for node "kubevirt1-01"
2024-11-25T16:44:01Z [error] failed to sync pod informer cache
W1125 16:44:02.369229    4366 reflector.go:539] k8s.io/client-go/informers/factory.go:159: failed to list *v1.Pod: Get "https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dkubevirt1-01&limit=500&resourceVersion=0": dial tcp 10.96.0.1:443: connect: connection timed out
W1125 16:44:02.369234    4366 reflector.go:539] github.com/k8snetworkplumbingwg/network-attachment-definition-client/pkg/client/informers/externalversions/factory.go:117: failed to list *v1.NetworkAttachmentDefinition: Get "https://10.96.0.1:443/apis/k8s.cni.cncf.io/v1/network-attachment-definitions?limit=500&resourceVersion=0": dial tcp 10.96.0.1:443: connect: connection timed out
I1125 16:44:02.369472    4366 trace.go:236] Trace[1744452170]: "Reflector ListAndWatch" name:k8s.io/client-go/informers/factory.go:159 (25-Nov-2024 16:43:41.116) (total time: 21252ms):
Trace[1744452170]: ---"Objects listed" error:Get "https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dkubevirt1-01&limit=500&resourceVersion=0": dial tcp 10.96.0.1:443: connect: connection timed out 21252ms (16:44:02.369)
Trace[1744452170]: [21.252982142s] [21.252982142s] END
I1125 16:44:02.369474    4366 trace.go:236] Trace[866754195]: "Reflector ListAndWatch" name:github.com/k8snetworkplumbingwg/network-attachment-definition-client/pkg/client/informers/externalversions/factory.go:117 (25-Nov-2024 16:43:41.116) (total time: 21252ms):
Trace[866754195]: ---"Objects listed" error:Get "https://10.96.0.1:443/apis/k8s.cni.cncf.io/v1/network-attachment-definitions?limit=500&resourceVersion=0": dial tcp 10.96.0.1:443: connect: connection timed out 21252ms (16:44:02.369)
Trace[866754195]: [21.252563162s] [21.252563162s] END
E1125 16:44:02.369519    4366 reflector.go:147] k8s.io/client-go/informers/factory.go:159: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dkubevirt1-01&limit=500&resourceVersion=0": dial tcp 10.96.0.1:443: connect: connection timed out
E1125 16:44:02.369521    4366 reflector.go:147] github.com/k8snetworkplumbingwg/network-attachment-definition-client/pkg/client/informers/externalversions/factory.go:117: Failed to watch *v1.NetworkAttachmentDefinition: failed to list *v1.NetworkAttachmentDefinition: Get "https://10.96.0.1:443/apis/k8s.cni.cncf.io/v1/network-attachment-definitions?limit=500&resourceVersion=0": dial tcp 10.96.0.1:443: connect: connection timed out
2024-11-25T16:44:03Z [verbose] API readiness check
2024-11-25T16:44:03Z [verbose] API readiness check done!
2024-11-25T16:44:03Z [verbose] Generated MultusCNI config: {"cniVersion":"0.3.1","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","clusterNetwork":"/host/etc/cni/net.d/05-cilium.conflist","type":"multus-shim"}
2024-11-25T16:44:03Z [verbose] started to watch file /host/etc/cni/net.d/05-cilium.conflist
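
To check whether other nodes show the same startup pattern, the multus-daemon log can be scanned for the informer sync failure and the API server timeouts. This is a minimal sketch, not part of Multus: `summarize_startup` is a hypothetical helper, and it only matches the literal strings that appear in the log above. It assumes the log text has already been collected, for example from the pod's logs.

```python
import re

# Literal message taken from the multus-daemon log in this report.
SYNC_FAILURE = "failed to sync pod informer cache"
# Matches the "dial tcp <host:port>: connect: connection timed out" fragment.
TIMEOUT_RE = re.compile(r"dial tcp (\S+): connect: connection timed out")


def summarize_startup(log_text):
    """Summarize API connectivity problems found in a multus-daemon log.

    Hypothetical helper for illustration; it only knows the strings
    seen in this issue and does not interpret other log formats.
    """
    sync_failed = False
    timed_out_endpoints = set()
    for line in log_text.splitlines():
        if SYNC_FAILURE in line:
            sync_failed = True
        match = TIMEOUT_RE.search(line)
        if match:
            timed_out_endpoints.add(match.group(1))
    return {
        "informer_sync_failed": sync_failed,
        "timed_out_endpoints": sorted(timed_out_endpoints),
    }
```

On the log above this reports a failed informer sync and a single timed-out endpoint, `10.96.0.1:443`. Whether that failed sync causes the later "pod not found" errors is not established in this report.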

Container statuses of VM:

  Warning  FailedCreatePodSandBox  39s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "2abdfc9c59747f33d233f9e46e5c7423f4cf4973f295198e67de801199ebc528": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: 'ContainerID:"2abdfc9c59747f33d233f9e46e5c7423f4cf4973f295198e67de801199ebc528" Netns:"/var/run/netns/cni-d90b1e48-d6d2-305c-4b72-389daf47ce99" IfName:"eth0" Args:"IgnoreUnknown=1;K8S_POD_NAMESPACE=virtual-machines;K8S_POD_NAME=virt-launcher-ubuntu-nn5hj;K8S_POD_INFRA_CONTAINER_ID=2abdfc9c59747f33d233f9e46e5c7423f4cf4973f295198e67de801199ebc528;K8S_POD_UID=608c92b4-3a0a-4224-a746-bb3a077519b3" Path:"" ERRORED: error configuring pod [virtual-machines/virt-launcher-ubuntu-nn5hj] networking: Multus: [virtual-machines/virt-launcher-ubuntu-nn5hj/608c92b4-3a0a-4224-a746-bb3a077519b3]: error waiting for pod: pod "virt-launcher-ubuntu-nn5hj" not found
': StdinData: {"clusterNetwork":"/host/etc/cni/net.d/05-cilium.conflist","cniVersion":"0.3.1","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","type":"multus-shim"}
  Warning  FailedCreatePodSandBox  21s  kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "315940a94105f5b6818a771fd0176aaa19e01b7be2cbc1bcebbe597d5664e154": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: 'ContainerID:"315940a94105f5b6818a771fd0176aaa19e01b7be2cbc1bcebbe597d5664e154" Netns:"/var/run/netns/cni-77180a48-a948-d6fb-2acd-fc5772c37526" IfName:"eth0" Args:"IgnoreUnknown=1;K8S_POD_NAMESPACE=virtual-machines;K8S_POD_NAME=virt-launcher-ubuntu-nn5hj;K8S_POD_INFRA_CONTAINER_ID=315940a94105f5b6818a771fd0176aaa19e01b7be2cbc1bcebbe597d5664e154;K8S_POD_UID=608c92b4-3a0a-4224-a746-bb3a077519b3" Path:"" ERRORED: error configuring pod [virtual-machines/virt-launcher-ubuntu-nn5hj] networking: Multus: [virtual-machines/virt-launcher-ubuntu-nn5hj/608c92b4-3a0a-4224-a746-bb3a077519b3]: error waiting for pod: pod "virt-launcher-ubuntu-nn5hj" not found
': StdinData: {"clusterNetwork":"/host/etc/cni/net.d/05-cilium.conflist","cniVersion":"0.3.1","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","type":"multus-shim"}
  Warning  FailedCreatePodSandBox  8s  kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "7e5353483e6ebb121dc639acb7145248d8e7f713b7298c301be8592420e7ec1f": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: 'ContainerID:"7e5353483e6ebb121dc639acb7145248d8e7f713b7298c301be8592420e7ec1f" Netns:"/var/run/netns/cni-6baff57f-f2d0-714d-762b-bd73161eab3b" IfName:"eth0" Args:"K8S_POD_INFRA_CONTAINER_ID=7e5353483e6ebb121dc639acb7145248d8e7f713b7298c301be8592420e7ec1f;K8S_POD_UID=608c92b4-3a0a-4224-a746-bb3a077519b3;IgnoreUnknown=1;K8S_POD_NAMESPACE=virtual-machines;K8S_POD_NAME=virt-launcher-ubuntu-nn5hj" Path:"" ERRORED: error configuring pod [virtual-machines/virt-launcher-ubuntu-nn5hj] networking: Multus: [virtual-machines/virt-launcher-ubuntu-nn5hj/608c92b4-3a0a-4224-a746-bb3a077519b3]: error waiting for pod: pod "virt-launcher-ubuntu-nn5hj" not found
': StdinData: {"clusterNetwork":"/host/etc/cni/net.d/05-cilium.conflist","cniVersion":"0.3.1","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","type":"multus-shim"}
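
To compare the repeated sandbox failures, the pod identity and the final Multus error can be pulled out of each event message. This is a minimal sketch that assumes the message format shown above; `parse_cni_failure` is a hypothetical helper, not a Multus or kubelet API.

```python
import re

# Pod identity arguments as they appear in the CNI Args string of the event.
ARG_RE = re.compile(r"(K8S_POD_NAMESPACE|K8S_POD_NAME|K8S_POD_UID)=([^;\"]+)")
# The Multus error text runs to the end of the line it appears on.
ERROR_RE = re.compile(r"error waiting for pod: (.+)$", re.MULTILINE)


def parse_cni_failure(message):
    """Extract the pod namespace, name, UID, and trailing error from one event.

    Hypothetical helper for illustration; the field names are the CNI
    argument keys seen in the events of this report.
    """
    fields = {key: value for key, value in ARG_RE.findall(message)}
    match = ERROR_RE.search(message)
    fields["error"] = match.group(1).strip() if match else None
    return fields
```

In the three events above, the pod name and UID are identical while the sandbox container ID changes on every retry, so the parsed fields make it easy to see that the same pod is being retried.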
@dougbtv
Member

dougbtv commented Dec 5, 2024

@EdwardCooke -- make sure you open an issue on the kubevirt side as well if you haven't! They'll probably be interested.

We're also interested, but it's a situation that's hard both to verify and to build tests for, so we'd definitely welcome any contributions in that regard.
