In this document we will walk through a "hello world"-style example of how to create and use a KServe custom storage initializer for the OCI protocol.
- Install Kind (Kubernetes in Docker) to run a local Kubernetes cluster with Docker container nodes.
- Install the Kubernetes CLI (kubectl), which allows you to run commands against Kubernetes clusters.
- Install Kustomize, which allows you to customize application configuration.
Note: You can skip this step if you want to use an existing container image.
Set the image version to build:

```shell
export VERSION=<replace>
```
Build the container image:

```shell
make VERSION=${VERSION} image-build
```

This will generate a new container image named `quay.io/${USER}/oci-storage-initializer:${VERSION}`.
Note: If testing locally using Podman, you might need to ensure the `--load` parameter is passed to the `docker build` command so that you can later load the locally built image into KinD.
We assume all prerequisites are satisfied at this point.
- After installing Kind, create a Kind cluster with:

  ```shell
  kind create cluster
  ```
- Configure `kubectl` to use the kind context:

  ```shell
  kubectl config use-context kind-kind
  ```
- Set up a local deployment of KServe using the provided KServe quick installation script:

  ```shell
  curl -s "https://raw.githubusercontent.com/kserve/kserve/release-0.12/hack/quick_install.sh" | bash
  ```
- Load the local oci-storage-initializer image into Kind (skip this step if you are using a publicly available image):

  ```shell
  kind load docker-image quay.io/${USER}/oci-storage-initializer:${VERSION}
  ```
- Apply the `ClusterStorageContainer` resource:

  ```shell
  kubectl apply -f - <<EOF
  apiVersion: "serving.kserve.io/v1alpha1"
  kind: ClusterStorageContainer
  metadata:
    name: oci-storage-initializer
  spec:
    container:
      name: storage-initializer
      image: quay.io/$USER/oci-storage-initializer:${VERSION}
      imagePullPolicy: IfNotPresent # NOT FOR PROD; allows easier testing of local images with KinD (remove for prod)
      resources:
        requests:
          memory: 100Mi
          cpu: 100m
        limits:
          memory: 1Gi
          cpu: "1"
    supportedUriFormats:
      - prefix: oci-artifact://
  EOF
  ```
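KServe selects a storage container by matching the `storageUri` of an `InferenceService` against the `supportedUriFormats` prefixes declared in the `ClusterStorageContainer`. A minimal sketch of that prefix matching, using a hypothetical `matches_prefix` helper (illustrative only, not part of KServe):

```shell
# Hypothetical helper illustrating how a storageUri is matched against a
# supportedUriFormats prefix; not part of KServe itself.
matches_prefix() {
  uri="$1"
  prefix="$2"
  case "$uri" in
    "$prefix"*) echo "match" ;;
    *)          echo "no match" ;;
  esac
}

matches_prefix "oci-artifact://quay.io/mmortari/demo20240606-orascsi-ociartifactrepo:latest" "oci-artifact://"  # match
matches_prefix "s3://my-bucket/model" "oci-artifact://"                                                         # no match
```

Any URI starting with `oci-artifact://` will therefore be handed to the custom storage initializer defined above.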
- Create a user namespace:

  ```shell
  kubectl create namespace kserve-test
  ```
- Create the `InferenceService` custom resource:

  ```shell
  kubectl apply -n kserve-test -f - <<EOF
  apiVersion: "serving.kserve.io/v1beta1"
  kind: "InferenceService"
  metadata:
    name: "sklearn-iris"
  spec:
    predictor:
      model:
        modelFormat:
          name: sklearn
        storageUri: "oci-artifact://quay.io/mmortari/demo20240606-orascsi-ociartifactrepo:latest"
  EOF
  ```
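When this `InferenceService` is created, KServe injects the matched storage container as an init container in the predictor pod and, to the best of our understanding of the contract, invokes it with two positional arguments: the source URI and the destination path. A local stand-in sketch of that contract (the `initializer` function is hypothetical, not the real entrypoint):

```shell
# Hypothetical stand-in for the storage-initializer entrypoint; the real
# container is assumed to receive <src-uri> <dest-path> as arguments.
initializer() {
  src_uri="$1"
  dest_path="$2"
  echo "downloading ${src_uri} into ${dest_path}"
}

# KServe would invoke it roughly like:
initializer "oci-artifact://quay.io/mmortari/demo20240606-orascsi-ociartifactrepo:latest" "/mnt/models"
```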
- Check the `InferenceService` status:

  ```shell
  kubectl get inferenceservices sklearn-iris -n kserve-test
  ```
- Determine the ingress IP and ports:

  ```shell
  kubectl get svc istio-ingressgateway -n istio-system
  ```

  And then:

  ```shell
  INGRESS_GATEWAY_SERVICE=$(kubectl get svc --namespace istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}')
  kubectl port-forward --namespace istio-system svc/${INGRESS_GATEWAY_SERVICE} 8081:80
  ```

  After that (in another terminal):

  ```shell
  export INGRESS_HOST=localhost
  export INGRESS_PORT=8081
  ```
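The pieces above combine into the V1 inference URL used for the prediction request; a small sketch of how it is assembled (the values mirror the port-forward above):

```shell
# Assemble the V1 predict endpoint from the ingress host/port and model name.
INGRESS_HOST=localhost
INGRESS_PORT=8081
MODEL_NAME=sklearn-iris
PREDICT_URL="http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict"
echo "${PREDICT_URL}"  # http://localhost:8081/v1/models/sklearn-iris:predict
```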
- Perform the inference request. Prepare the input data:

  ```shell
  cat <<EOF > "/tmp/iris-input.json"
  {
    "instances": [
      [6.8, 2.8, 4.8, 1.4],
      [6.0, 3.4, 4.5, 1.6]
    ]
  }
  EOF
  ```

  If you do not have DNS, you can still curl with the ingress gateway external IP using the Host header:

  ```shell
  SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-iris -n kserve-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)
  curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" "http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/sklearn-iris:predict" -d @/tmp/iris-input.json
  ```
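The predict call returns a JSON body of the form `{"predictions": [...]}` (KServe V1 protocol). A sketch of extracting the predictions array, using an illustrative response saved to a file (the values below are made up for the example, not real model output):

```shell
# Illustrative response shape only; actual values come from the model.
cat <<'EOF' > /tmp/iris-output.json
{"predictions": [1, 1]}
EOF

# Extract the predictions array (python3 is used here to avoid a jq dependency):
python3 -c 'import json; print(json.load(open("/tmp/iris-output.json"))["predictions"])'
```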