tcrawley-xilinx/kubernetes-onload

 
 

Onload Operator and Onload Device Plugin for Kubernetes and OpenShift

NOTE: The project is under development.

Deploy Onload Operator

The deployment instructions below are split into three groups for clarity. They use the example.com image registry as a placeholder.

Build Onload images

The required Onload images are: source, userland and kernel.

  1. Clone OpenOnload from https://github.com/Xilinx-CNS/onload
  2. Create an Onload source tarball:
scripts/onload_mkdist
  3. Create the Onload source and userland images, to be pushed later to the image registry of your choice:
scripts/onload_mkcontainer --source example.com:5000/onload-source:latest --user example.com:5000/onload-user:latest *.tgz

By default, the Onload userland uses UBI libc. Check the Dockerfile and patch it if this is incompatible with your application's environment.

  4. Push them to the image registry:
docker push example.com:5000/onload-source:latest
docker push example.com:5000/onload-user:latest
  5. In this repository, build an Onload kernel image:
make onload-module-dtk ONLOAD_SOURCE_IMAGE_REPO=example.com:5000/onload-source ONLOAD_SOURCE_IMAGE_TAG=latest ONLOAD_MODULE_IMAGE_REPO=example.com:5000/onload-module

Note that the onload-module-dtk target is currently tailored to OpenShift. Edit build/onload-module/Makefile to accommodate non-OpenShift kernels.

  6. Push the Onload kernel image (copy the autogenerated hash and kernel version into the tag):
docker push example.com:5000/onload-module:deccdb8d036a4c7794dea2cda106b3c112b374a9-4.18.0-372.49.1.el8_6.x86_64
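
The kernel image tag in the last step joins the autogenerated hash and the target kernel version with a hyphen. A minimal sketch of composing that reference by hand is shown here. It assumes you have already copied the hash from the build, and `module_image_ref` is a hypothetical helper that is not part of this repository:

```shell
#!/bin/sh
# Hypothetical helper: prints "<repo>:<hash>-<kernel version>".
# The hash is passed in as-is; this sketch does not compute it.
module_image_ref() {
    repo=$1
    hash=$2
    kernel=$3
    printf '%s:%s-%s\n' "$repo" "$hash" "$kernel"
}

# Example using the values shown in this README.
module_image_ref example.com:5000/onload-module \
    deccdb8d036a4c7794dea2cda106b3c112b374a9 \
    4.18.0-372.49.1.el8_6.x86_64
```

The printed reference can then be passed to docker push and reused as kernelModuleImage in the Onload CR.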

Prepare Kubernetes cluster

The Onload Operator v3 depends on third-party software:

  1. Kernel Module Management (KMM) Operator v1.1.1.
  2. Multus CNI. A sample configuration using macvlan is available at https://github.com/k8snetworkplumbingwg/multus-cni/blob/master/examples/macvlan-pod.yml.

Please note that NetworkAttachmentDefinition is used later in the definition of Onloaded applications.
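
For orientation, a NetworkAttachmentDefinition for Multus might look like the sketch below. It is not taken from this repository. The name ipvlan-sf0 matches the annotation used in the Pod example later on, while the master interface sf0, the choice of the ipvlan plugin and the IPAM subnet are assumptions to adapt to your cluster.

```shell
#!/bin/sh
# Write a sample NetworkAttachmentDefinition to a file for review.
# Assumptions (not from this repository): master interface "sf0",
# ipvlan plugin in l2 mode, host-local IPAM on the 192.0.2.0/24 range.
nad_file=${NAD_FILE:-ipvlan-sf0.yaml}
cat > "$nad_file" <<'EOF'
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: ipvlan-sf0
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "ipvlan",
      "master": "sf0",
      "mode": "l2",
      "ipam": {
        "type": "host-local",
        "subnet": "192.0.2.0/24"
      }
    }
EOF
echo "wrote $nad_file"
```

After reviewing the generated file, it can be applied with oc apply -f in the namespace where the Onloaded pods will run.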

Build and deploy Onload Operator v3

Install Go 1.21+ on the machines where you run make, then follow these instructions in this repository:

  1. Create and push the Onload Operator controller image:
make docker-build docker-push IMG=example.com:5000/operator:latest
  2. Create and push the Onload Device Plugin image:
make device-plugin-docker-build device-plugin-docker-push DEVICE_IMG=example.com:5000/deviceplugin:latest
  3. Deploy the Onload Operator v3:
make deploy IMG=example.com:5000/operator:latest
  4. Patch the Onload CR sample accordingly:
diff --git a/config/samples/onload_v1alpha1_onload.yaml b/config/samples/onload_v1alpha1_onload.yaml
index 0ffba0c..2768bdc 100644
--- a/config/samples/onload_v1alpha1_onload.yaml
+++ b/config/samples/onload_v1alpha1_onload.yaml
@@ -55,9 +55,9 @@ spec:
     # Example image locations using openshift local image registry.
     kernelMappings:
       - regexp: '^.*\.x86_64$'
-        kernelModuleImage: image-registry.openshift-image-registry.svc:5000/onload-clusterlocal/onload-module:v8.1.0-${KERNEL_FULL_VERSION}
+        kernelModuleImage: example.com:5000/onload-module:deccdb8d036a4c7794dea2cda106b3c112b374a9-4.18.0-372.49.1.el8_6.x86_64
         sfc: {}
-    userImage: image-registry.openshift-image-registry.svc:5000/onload-clusterlocal/onload-user:v8.1.0
+    userImage: example.com:5000/onload-user:latest
     version: 8.1.0
     imagePullPolicy: Always
   devicePlugin:

Also make sure devicePluginImage is correct. Another important field is the node selector, which tells the Onload Operator where to deploy the Onload Device Plugin DaemonSet and Module kind (KMM).

  5. Finally, deploy the Onload CR:
oc apply -k config/samples/
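
The regexp under kernelMappings in the patched sample decides which kernel versions the module image applies to. A quick local sanity check of such a pattern can be sketched with grep -E. This assumes the pattern is compared against a kernel version string such as the output of uname -r, and that extended regular expressions behave closely enough to the matcher KMM uses for a simple pattern like this one. `kernel_matches` is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if kernel version $1 matches pattern $2.
kernel_matches() {
    printf '%s\n' "$1" | grep -Eq "$2"
}

# Pattern copied from the sample CR above.
pattern='^.*\.x86_64$'

# Try one x86_64 kernel and one kernel for another architecture.
for kver in 4.18.0-372.49.1.el8_6.x86_64 4.18.0-372.49.1.el8_6.aarch64; do
    if kernel_matches "$kver" "$pattern"; then
        echo "$kver: matches"
    else
        echo "$kver: no match"
    fi
done
```

If the nodes in your cluster run several kernel versions, each one needs a mapping whose pattern matches it and whose image was built for that kernel.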

Run Onloaded application

At this point, Onload is deployed on the cluster and users can run Onloaded applications, for example:

apiVersion: v1
kind: Pod
metadata:
  name: test
  annotations:
    k8s.v1.cni.cncf.io/networks: ipvlan-sf0
spec:
  restartPolicy: Never
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: test
    image: test:latest
    imagePullPolicy: Always
    command:
    - /test
    resources:
      limits:
        amd.com/onload: 1
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
  nodeName: compute-0

There are two key fields in the example Pod above:

  1. metadata.annotations adds the acceleratable interfaces to the pods.
  2. spec.containers[].resources.limits injects Onload, i.e. devfs and *.so files, and also sets LD_PRELOAD.

No further modifications are required to enable Onloaded applications.


Copyright (c) 2023 Advanced Micro Devices, Inc.
