418 changes: 11 additions & 407 deletions README.md

Large diffs are not rendered by default.

10 changes: 10 additions & 0 deletions docs/README.md
@@ -0,0 +1,10 @@
# zeropod Documentation

* [Getting Started](./getting_started.md)
* [Configuration](./configuration/README.md)
  * [Migration](./configuration/migration.md)
  * [Grouping](./configuration/grouping.md)
* [Architecture](./architecture/README.md)
  * [Node](./architecture/node.md)
* [Development](./development.md)
* [Metrics](./metrics.md)
3 changes: 3 additions & 0 deletions docs/architecture/README.md
@@ -0,0 +1,3 @@
# Architecture

This directory documents the architecture of zeropod.
88 changes: 88 additions & 0 deletions docs/architecture/node.md
@@ -0,0 +1,88 @@
# Node

The zeropod-node DaemonSet is scheduled on every node labelled
`zeropod.ctrox.dev/node=true`. The individual components of the node daemon
are documented in this section.

## Installer

The installer runs as an init-container. It executes the binary built from
`cmd/installer/main.go` with some distro-specific options to install the
runtime binaries, configure containerd and register the `RuntimeClass`.

## Manager

The manager component starts after the installer init-container has succeeded.
It provides functionality that is needed at the node level and would otherwise
bloat the shim. For example, loading eBPF programs can be quite memory
intensive, so this has been moved from the shim to the manager to keep the
shim's memory usage as low as possible.

These are the responsibilities of the manager:

- Loading eBPF programs that the shim(s) rely on.
- Collecting metrics from all shim processes and exposing them over HTTP for scraping.
- Subscribing to shim scaling events and adjusting Pod requests.

### In-place Resource scaling

This makes use of the Kubernetes feature
[InPlacePodVerticalScaling](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources)
to automatically reduce the pod resource requests to a minimum on scale-down
events and revert them on scale-up. Once the Kubernetes feature flag is
enabled, in-place scaling also needs to be enabled with the manager flag
`-in-place-scaling=true`, and the node driver needs additional permissions to
patch pods. To deploy this, uncomment the `in-place-scaling` component in
`config/production/kustomization.yaml`. This adds the flag and the required
permissions when building the kustomization.

### Status Labels

To reflect the container scaling status in the k8s API, the manager can set
status labels on a pod. This requires the flag `-status-labels=true`, which is
set by default in the production deployment.

The resulting labels have the following structure:

```yaml
status.zeropod.ctrox.dev/<container name>: <container status>
```

So if our pod has two containers, one of them running and one in scaled-down
state, the labels would be set like this:

```yaml
labels:
  status.zeropod.ctrox.dev/container1: RUNNING
  status.zeropod.ctrox.dev/container2: SCALED_DOWN
```
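
Because the status is exposed as ordinary pod labels, a client can derive the
state of each container from the label map alone. The following is a minimal
sketch of that idea, not code from zeropod: `containersWithStatus` is a
hypothetical helper, and the only things taken from the text above are the
label prefix and the two status values.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// statusLabelPrefix follows the label structure documented above.
const statusLabelPrefix = "status.zeropod.ctrox.dev/"

// containersWithStatus is a hypothetical helper (not part of zeropod). It
// returns the sorted names of all containers whose status label equals status.
func containersWithStatus(podLabels map[string]string, status string) []string {
	var names []string
	for key, value := range podLabels {
		if strings.HasPrefix(key, statusLabelPrefix) && value == status {
			names = append(names, strings.TrimPrefix(key, statusLabelPrefix))
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// Labels as they would appear on the example pod, plus an unrelated label.
	podLabels := map[string]string{
		"app":                                 "nginx",
		"status.zeropod.ctrox.dev/container1": "RUNNING",
		"status.zeropod.ctrox.dev/container2": "SCALED_DOWN",
	}
	fmt.Println(containersWithStatus(podLabels, "SCALED_DOWN"))
}
```

In a real client the label map would come from the pod's metadata as returned
by the Kubernetes API.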

### Status Events

The manager can also be configured to emit Kubernetes events on scaling events
of a pod. This requires the flag `-status-events=true`, which is set by default
in the production deployment.

### All Flags

```
  -debug
        enable debug logs
  -in-place-scaling
        enable in-place resource scaling, requires InPlacePodVerticalScaling feature flag
  -kubeconfig string
        Paths to a kubeconfig. Only required if out-of-cluster.
  -metrics-addr string
        address of the metrics server (default ":8080")
  -node-server-addr string
        address of the node server (default ":8090")
  -probe-binary-name string
        set the probe binary name for probe detection (default "kubelet")
  -status-events
        create status events to reflect container status
  -status-labels
        update pod labels to reflect container status
  -version
        output version and exit
```
144 changes: 144 additions & 0 deletions docs/configuration/README.md
@@ -0,0 +1,144 @@
# Configuration

A pod can make use of zeropod only if the `runtimeClassName` is set to
`zeropod`. See this minimal example of a pod:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  runtimeClassName: zeropod
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
```

## Probes

Zeropod is able to intercept liveness probes while the container process is
scaled down to ensure the application is not restored for probes. This only
works for HTTP and TCP probes; gRPC and exec probes will wake the container up.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  annotations:
    zeropod.ctrox.dev/scaledown-duration: 10s
spec:
  runtimeClassName: zeropod
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
      livenessProbe:
        httpGet:
          port: 80
```

In this example, the container will be scaled down 10 seconds after starting
even though we have defined a probe. Zeropod will take care of replying to the
probe when the container is scaled down. Whenever the container is running, the
probe traffic will be forwarded to the app just like normal traffic. You can
also customize the path and the headers of the probe, but be mindful of their
size. To reduce memory usage, zeropod by default only reads the first `1024`
bytes of each request to detect an HTTP probe. If the probe is larger than
that, traffic is passed through and the app is restored on each probe request.
In that case, the buffer can be increased with the
[probe buffer size](#probe-buffer-size) annotation.
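
To illustrate why the size of the probe matters, the sketch below parses an
HTTP request from a truncated buffer using Go's standard library. It is not
zeropod's detector, and `headFitsBuffer` and `probeRequest` are hypothetical
names. It only shows the general effect: once the request head is larger than
the buffer, the truncated bytes no longer form a parseable HTTP request.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"net/http"
	"strings"
)

// headFitsBuffer reports whether an HTTP request head can be parsed from at
// most bufSize bytes of raw. Hypothetical helper for illustration only.
func headFitsBuffer(raw []byte, bufSize int) bool {
	if len(raw) > bufSize {
		raw = raw[:bufSize] // bytes past the buffer are never inspected
	}
	_, err := http.ReadRequest(bufio.NewReader(bytes.NewReader(raw)))
	return err == nil
}

// probeRequest builds a GET request with one custom header whose value is
// headerBytes long, similar to a probe configured with custom headers.
func probeRequest(headerBytes int) []byte {
	return []byte("GET /healthz HTTP/1.1\r\n" +
		"Host: 10.0.0.1:80\r\n" +
		"X-Custom: " + strings.Repeat("a", headerBytes) + "\r\n" +
		"\r\n")
}

func main() {
	fmt.Println(headFitsBuffer(probeRequest(16), 1024))   // small probe, default buffer
	fmt.Println(headFitsBuffer(probeRequest(2000), 1024)) // head exceeds the buffer
	fmt.Println(headFitsBuffer(probeRequest(2000), 4096)) // larger buffer
}
```

With a 2000 byte header value, parsing fails at 1024 bytes and succeeds at
4096 bytes, which mirrors the case where the buffer size needs to be raised.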

## Annotations

The behaviour of zeropod can be adjusted with a number of pod annotations.

### Container Names

```yaml
zeropod.ctrox.dev/container-names: "nginx,sidecar"
```

A comma-separated list of container-names in the pod that should be considered
for scaling to zero. If unset or empty, all containers will be considered.

### Ports Map

```yaml
zeropod.ctrox.dev/ports-map: "nginx=80,81;sidecar=8080"
```

Ports-map configures the ports that the application(s) to be scaled down are
listening on. As ports have to be matched with containers in a pod, the key is
the container name and the value is a comma-delimited list of ports. Any TCP
connection on one of these ports will restore the application. If this
annotation is not specified, zeropod will try to find the listening ports
automatically. Use this option in case this fails for your application.
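
The value format can be read as `container=port[,port...]` entries separated
by `;`. As a rough sketch of how such a value breaks down (this is not
zeropod's own parser, and `parsePortsMap` is a hypothetical name), the
following splits the annotation into a map from container name to ports:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePortsMap is a hypothetical helper (not zeropod's parser). It splits a
// ports-map value such as "nginx=80,81;sidecar=8080" into container -> ports.
func parsePortsMap(value string) (map[string][]uint16, error) {
	result := map[string][]uint16{}
	for _, entry := range strings.Split(value, ";") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue // tolerate a trailing ";"
		}
		name, ports, ok := strings.Cut(entry, "=")
		if !ok || name == "" {
			return nil, fmt.Errorf("invalid entry %q, want container=port[,port...]", entry)
		}
		for _, p := range strings.Split(ports, ",") {
			// bitSize 16 rejects values outside the valid TCP port range.
			port, err := strconv.ParseUint(strings.TrimSpace(p), 10, 16)
			if err != nil {
				return nil, fmt.Errorf("invalid port %q for container %q: %w", p, name, err)
			}
			result[name] = append(result[name], uint16(port))
		}
	}
	return result, nil
}

func main() {
	ports, err := parsePortsMap("nginx=80,81;sidecar=8080")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("nginx:", ports["nginx"])
	fmt.Println("sidecar:", ports["sidecar"])
}
```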

### Scale Down Duration

```yaml
zeropod.ctrox.dev/scaledown-duration: 10s
```

Configures how long to wait after the last connection before scaling down
again. The timer is reset whenever a new connection is detected. Setting it to
0 disables scaling down. If unset, it defaults to 1 minute.

### Pre-dump

```yaml
zeropod.ctrox.dev/pre-dump: "true"
```

Execute a pre-dump before the full checkpoint and process stop. This can reduce
the checkpoint time in some cases, but testing has shown that it also has a
small impact on restore time, so results may vary. The default is false. See
[the CRIU docs](https://criu.org/Memory_changes_tracking) for details on what
this does.

### Disable Checkpointing

```yaml
zeropod.ctrox.dev/disable-checkpointing: "true"
```

Disable checkpointing completely when scaling down. This option was introduced
for testing purposes to measure how fast some applications can be restored from
a complete restart instead of from memory images. If enabled, the process will
be killed on scale-down and all state is lost. This might be useful for
use-cases where the application is stateless and starts up very quickly.

### Disable Probe Detection

```yaml
zeropod.ctrox.dev/disable-probe-detection: "true"
```

Disables the [probe detection mechanism](#probes). If there are probes defined
on a container, they will be forwarded to the container just like any traffic
and will wake it up.


### Probe Buffer Size

```yaml
zeropod.ctrox.dev/probe-buffer-size: "1024"
```

Configure the buffer size of the probe detector. To be able to detect an HTTP
liveness/readiness probe, zeropod needs to read a certain number of bytes from
the TCP stream of incoming connections. The default should fit most probes, so
this only needs to be increased if the probe contains a lot of header data.
Defaults to `1024` if unset.

## Experimental Features

Features that are marked as experimental might change form in the future or
could be removed entirely in future releases depending on the stability and
need.
52 changes: 52 additions & 0 deletions docs/configuration/diagrams/live-migration.drawio
@@ -0,0 +1,52 @@
<mxfile host="app.diagrams.net" agent="Mozilla/5.0 (X11; Linux i686; rv:132.0) Gecko/20100101 Firefox/132.0" version="26.0.16">
<diagram name="Page-1" id="S3Romfl9XJvJNEUoPPOx">
<mxGraphModel dx="1418" dy="1131" grid="0" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="0" pageScale="1" pageWidth="700" pageHeight="500" math="0" shadow="0">
<root>
<mxCell id="0" />
<mxCell id="1" parent="0" />
<mxCell id="nSVk_hhTxszxVaa_JyYo-2" value="Node a" style="swimlane;whiteSpace=wrap;html=1;labelBackgroundColor=none;fillColor=#A8DADC;strokeColor=#457B9D;fontColor=#1D3557;rounded=1;" parent="1" vertex="1">
<mxGeometry x="-242" y="-23" width="200" height="265" as="geometry" />
</mxCell>
<mxCell id="nSVk_hhTxszxVaa_JyYo-1" value="deleted Pod" style="rounded=1;whiteSpace=wrap;html=1;labelBackgroundColor=none;fillColor=#A8DADC;strokeColor=#457B9D;fontColor=#1D3557;" parent="nSVk_hhTxszxVaa_JyYo-2" vertex="1">
<mxGeometry x="40" y="61" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="nSVk_hhTxszxVaa_JyYo-13" value="&lt;span style=&quot;background-color: light-dark(#ffffff, var(--ge-dark-color, #121212));&quot;&gt;shim&lt;/span&gt;" style="edgeStyle=none;curved=1;rounded=1;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=1;entryDx=0;entryDy=0;fontSize=12;startSize=8;endSize=8;endArrow=none;startFill=0;labelBackgroundColor=none;strokeColor=#457B9D;fontColor=default;" parent="nSVk_hhTxszxVaa_JyYo-2" source="nSVk_hhTxszxVaa_JyYo-5" target="nSVk_hhTxszxVaa_JyYo-1" edge="1">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="nSVk_hhTxszxVaa_JyYo-5" value="zeropod-node" style="rounded=1;whiteSpace=wrap;html=1;labelBackgroundColor=none;fillColor=#A8DADC;strokeColor=#457B9D;fontColor=#1D3557;" parent="nSVk_hhTxszxVaa_JyYo-2" vertex="1">
<mxGeometry x="40" y="157" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="nSVk_hhTxszxVaa_JyYo-6" value="Node b" style="swimlane;whiteSpace=wrap;html=1;labelBackgroundColor=none;fillColor=#A8DADC;strokeColor=#457B9D;fontColor=#1D3557;rounded=1;" parent="1" vertex="1">
<mxGeometry x="190" y="-23" width="200" height="265" as="geometry" />
</mxCell>
<mxCell id="nSVk_hhTxszxVaa_JyYo-14" value="&lt;span style=&quot;background-color: light-dark(#ffffff, var(--ge-dark-color, #121212));&quot;&gt;shim&lt;/span&gt;" style="edgeStyle=none;curved=1;rounded=1;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;fontSize=12;startSize=8;endSize=8;endArrow=none;startFill=0;labelBackgroundColor=none;strokeColor=#457B9D;fontColor=default;" parent="nSVk_hhTxszxVaa_JyYo-6" source="nSVk_hhTxszxVaa_JyYo-7" target="nSVk_hhTxszxVaa_JyYo-8" edge="1">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="nSVk_hhTxszxVaa_JyYo-7" value="fresh Pod" style="rounded=1;whiteSpace=wrap;html=1;labelBackgroundColor=none;fillColor=#A8DADC;strokeColor=#457B9D;fontColor=#1D3557;" parent="nSVk_hhTxszxVaa_JyYo-6" vertex="1">
<mxGeometry x="40" y="61" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="nSVk_hhTxszxVaa_JyYo-8" value="zeropod-node" style="rounded=1;whiteSpace=wrap;html=1;labelBackgroundColor=none;fillColor=#A8DADC;strokeColor=#457B9D;fontColor=#1D3557;" parent="nSVk_hhTxszxVaa_JyYo-6" vertex="1">
<mxGeometry x="40" y="157" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="nSVk_hhTxszxVaa_JyYo-11" value="Migration CR" style="rounded=1;whiteSpace=wrap;html=1;labelBackgroundColor=none;fillColor=#A8DADC;strokeColor=#457B9D;fontColor=#1D3557;" parent="1" vertex="1">
<mxGeometry x="15" y="256" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="nSVk_hhTxszxVaa_JyYo-12" value="&lt;span style=&quot;background-color: light-dark(#ffffff, var(--ge-dark-color, #121212));&quot;&gt;creates&lt;/span&gt;" style="edgeStyle=orthogonalEdgeStyle;rounded=1;orthogonalLoop=1;jettySize=auto;html=1;entryX=0;entryY=0.5;entryDx=0;entryDy=0;fontSize=12;startSize=8;endSize=8;exitX=0.5;exitY=1;exitDx=0;exitDy=0;labelBackgroundColor=none;strokeColor=#457B9D;fontColor=default;curved=0;" parent="1" source="nSVk_hhTxszxVaa_JyYo-5" target="nSVk_hhTxszxVaa_JyYo-11" edge="1">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="nSVk_hhTxszxVaa_JyYo-16" value="&lt;span style=&quot;background-color: light-dark(#ffffff, var(--ge-dark-color, #121212));&quot;&gt;claims&lt;/span&gt;" style="edgeStyle=orthogonalEdgeStyle;rounded=1;orthogonalLoop=1;jettySize=auto;html=1;exitX=0.5;exitY=1;exitDx=0;exitDy=0;entryX=1;entryY=0.5;entryDx=0;entryDy=0;fontSize=12;startSize=8;endSize=8;labelBackgroundColor=none;strokeColor=#457B9D;fontColor=default;curved=0;" parent="1" source="nSVk_hhTxszxVaa_JyYo-8" target="nSVk_hhTxszxVaa_JyYo-11" edge="1">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="nSVk_hhTxszxVaa_JyYo-17" value="&lt;div&gt;&lt;span style=&quot;background-color: light-dark(#ffffff, var(--ge-dark-color, #121212));&quot;&gt;checkpoint pull&lt;/span&gt;&lt;/div&gt;" style="edgeStyle=none;curved=1;rounded=1;orthogonalLoop=1;jettySize=auto;html=1;exitX=1;exitY=0.25;exitDx=0;exitDy=0;entryX=0;entryY=0.25;entryDx=0;entryDy=0;fontSize=12;startSize=8;endSize=8;labelBackgroundColor=none;strokeColor=#457B9D;fontColor=default;" parent="1" source="nSVk_hhTxszxVaa_JyYo-5" target="nSVk_hhTxszxVaa_JyYo-8" edge="1">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="fjhrUzI2nnMYbL1CwRdW-4" value="&lt;span style=&quot;background-color: light-dark(#ffffff, var(--ge-dark-color, #121212));&quot;&gt;memory pages&lt;/span&gt;" style="edgeStyle=none;curved=1;rounded=1;orthogonalLoop=1;jettySize=auto;html=1;exitX=1;exitY=0.75;exitDx=0;exitDy=0;entryX=0;entryY=0.75;entryDx=0;entryDy=0;fontSize=12;startSize=8;endSize=8;labelBackgroundColor=none;strokeColor=#457B9D;fontColor=default;" edge="1" parent="1" source="nSVk_hhTxszxVaa_JyYo-5" target="nSVk_hhTxszxVaa_JyYo-8">
<mxGeometry relative="1" as="geometry">
<mxPoint x="-72" y="159" as="sourcePoint" />
<mxPoint x="240" y="159" as="targetPoint" />
</mxGeometry>
</mxCell>
</root>
</mxGraphModel>
</diagram>
</mxfile>