crik is a project that aims to provide checkpoint and restore functionality for Kubernetes pods, mainly targeting
node shutdown and restart scenarios. It is a command wrapper that, under the hood, utilizes
criu to checkpoint and restore process trees in a Pod.
crik was first revealed at KubeCon EU 2024: The Party Must Go On - Resume Pods After Spot Instance Shut Down - Muvaffak Onuş, QA Wolf
It is a work in progress and is not ready for production use.
crik has two components:
- `crik` - a command wrapper that executes the given command, checkpoints it when SIGTERM is received, and restores from the checkpoint when the image directory contains one (see the sketch after this list).
- `manager` - a Kubernetes controller that watches `Node` objects and updates its internal map of node states so that `crik` can check whether it should checkpoint or restore depending on its node's state.
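To make that flow concrete, here is a minimal shell sketch of the pattern the `crik` wrapper implements. This is an illustration under assumptions, not crik's actual code; it uses criu's standard `dump`/`restore` flags and an assumed image directory:

```shell
#!/bin/sh
# Illustrative only: restore if a checkpoint exists, otherwise run the
# wrapped command and dump its process tree with criu when SIGTERM arrives.
IMAGE_DIR=/etc/checkpoint   # assumed image directory

if [ -f "$IMAGE_DIR/pstree.img" ]; then
  # A previous checkpoint exists: resume it instead of starting fresh.
  exec criu restore --images-dir "$IMAGE_DIR" --shell-job
fi

"$@" &                      # run the wrapped command, e.g. app-binary
PID=$!
trap 'criu dump --tree "$PID" --images-dir "$IMAGE_DIR" --shell-job' TERM
wait "$PID"
```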
The only prerequisite is to have a Kubernetes cluster running. You can use kind to create a local cluster.
```shell
kind create cluster
```

Then, you can deploy the simple-loop example, where a counter increases every second; delete the pod and see that the counter continues from where it left off in the new pod.
```shell
kubectl apply -f examples/simple-loop.yaml
```

Watch logs:
```shell
kubectl logs -f simple-loop-0
```

In another terminal, delete the pod:
```shell
kubectl delete pod simple-loop-0
```

Now, a new pod is created. See that it continues from where it left off:
```shell
kubectl logs -f simple-loop-0
```

The application you want to checkpoint and restore should be run with the `crik` command, like the following:
```shell
crik run -- app-binary
```

The following is an example Dockerfile for your application that installs crik and runs your application. It assumes
your application's entrypoint is `entrypoint.sh`.
```dockerfile
FROM ubuntu:22.04
RUN apt-get update && apt-get install --no-install-recommends --yes gnupg curl ca-certificates
# crik requires criu to be available.
RUN curl "https://keyserver.ubuntu.com/pks/lookup?op=get&search=0x4E2A48715C45AEEC077B48169B29EEC9246B6CE2" | gpg --dearmor > /usr/share/keyrings/criu-ppa.gpg \
&& echo "deb [signed-by=/usr/share/keyrings/criu-ppa.gpg] https://ppa.launchpadcontent.net/criu/ppa/ubuntu jammy main" > /etc/apt/sources.list.d/criu.list \
&& apt-get update \
&& apt-get install --no-install-recommends --yes criu iptables
# Install crik
COPY --from=ghcr.io/qawolf/crik/crik:v0.1.2 /usr/local/bin/crik /usr/local/bin/crik
# Copy your application
COPY entrypoint.sh /entrypoint.sh
# Run your application with crik
ENTRYPOINT ["crik", "run", "--", "/entrypoint.sh"]Not all apps can be checkpointed and restored and for many of them, criu may need additional configurations. crik
provides a high level configuration interface that you can use to configure crik for your application. The following
is the minimum configuration you need to provide for your application and by default crik looks for config.yaml in
/etc/crik directory.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: crik-simple-loop
data:
  config.yaml: |-
    imageDir: /etc/checkpoint
```
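For this ConfigMap to be picked up, it needs to be mounted at `/etc/crik` so that crik finds `config.yaml` there. A minimal sketch of the relevant pod spec fragment, where the container and volume names are placeholders:

```yaml
# Illustrative fragment; adjust names to your Pod.
spec:
  containers:
  - name: app
    volumeMounts:
    - name: crik-config
      mountPath: /etc/crik
  volumes:
  - name: crik-config
    configMap:
      name: crik-simple-loop
```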
Configuration options:

- `imageDir` - the directory where `crik` will store the checkpoint images. It needs to be available at the same path in the new `Pod` as well (a fuller example follows this list).
- `additionalPaths` - additional paths that `crik` will include in the checkpoint and copy back in the new `Pod`. Populate this list if you get `file not found` errors in the restore logs. The paths are relative to root `/` and can be directories or files.
- `inotifyIncompatiblePaths` - paths that `crik` will delete before taking the checkpoint. Populate this list if you get `fsnotify: Handle 0x278:0x2ffb5b cannot be opened` errors in the restore logs. You need to find the inode of the file by converting `0x2ffb5b` to an integer, then find the path of the file by running `find / -inum <inode>`, and add that path to this list. See this comment for more details.
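For illustration, a fuller config using all three options might look like the following; the `additionalPaths` and `inotifyIncompatiblePaths` entries here are hypothetical and should come from your own restore logs:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: crik-simple-loop
data:
  config.yaml: |-
    imageDir: /etc/checkpoint
    # Hypothetical entries, relative to root /.
    additionalPaths:
    - tmp/app-scratch
    - var/lib/myapp/state.db
    # Hypothetical entry, found via `find / -inum <inode>`.
    inotifyIncompatiblePaths:
    - /var/log/myapp/current.log
```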
Alpha feature. Not ready for production use.
You can optionally configure crik to take checkpoint only if the node it's running on is going to be shut down. This is
achieved by deploying a Kubernetes controller that watches Node events and updates its internal map of states so that
crik can check whether it should checkpoint or restore depending on its node's state. In the future, this may also
include direct calls to the cloud provider's API to check the node's state.
Deploy the controller:
```shell
helm upgrade --install node-state-server oci://ghcr.io/qawolf/crik/charts/node-state-server --version 0.1.2
```

Make sure to include the URL of the server in crik's configuration mounted to your Pod.
```yaml
# Assuming the chart is deployed to the default namespace.
apiVersion: v1
kind: ConfigMap
metadata:
  name: crik-simple-loop
data:
  config.yaml: |-
    imageDir: /etc/checkpoint
    nodeStateServerURL: http://crik-node-state-server.default.svc.cluster.local:9376
```

crik will hit the `/node-state` endpoint of the server to get the state of the node it's running on when it receives
SIGTERM, and take a checkpoint only if it returns `shutting-down` as the node's state. However, crik needs to provide the
node name to the server, so make sure to add the following environment variable to your container spec in your Pod:
```yaml
env:
- name: KUBERNETES_NODE_NAME
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName
```
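With that variable set, you can probe the server the same way crik does. The request shape below is an assumption for illustration (how crik passes the node name, here a `node` query parameter, is not specified in this document):

```shell
# Hypothetical probe; expects the node state, e.g. "shutting-down", in the response.
curl "http://crik-node-state-server.default.svc.cluster.local:9376/node-state?node=$KUBERNETES_NODE_NAME"
```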
Build crik:

```shell
go build -o crik cmd/crik/main.go
```

Taking checkpoints of processes and restoring them from within the container requires quite a few privileges to be given
to the container. The best approach is to execute these operations at the container runtime level, and today container
engines such as CRI-O and Podman have native support for using criu to checkpoint and restore whole containers.
There is an ongoing effort to bring this functionality to Kubernetes as well, with the first use case being forensic
analysis via checkpoints, as described here.
While that is the better approach, it is such a low-level change that it's expected to take a while to become available
in mainstream Kubernetes in an easily consumable way. For example, while taking a checkpoint is possible through the
kubelet API if you're using CRI-O, restoring it as another Pod on a different Node is not natively supported yet.
crik allows you to use criu to checkpoint and restore a Pod to another Node today without waiting for the native
support in Kubernetes. Once the native support is available, crik will utilize it under the hood.
This project is licensed under the Apache License, Version 2.0 - see the LICENSE file for details.