
Commit 6df8240

update workload autoscaler docs
Signed-off-by: Vacant2333 <[email protected]>
1 parent 6500fb8 commit 6df8240

File tree: 4 files changed, +70 −19 lines changed

src/content/guide/workload_autoscaler/autoscaling_policy.mdx

Lines changed: 27 additions & 3 deletions
@@ -7,7 +7,7 @@ title: AutoscalingPolicy
 **AutoscalingPolicy** defines **which Workloads** should have their **Requests and Limits** automatically adjusted, **when** these adjustments should occur, and **how** they should be applied.
 By properly configuring an AutoscalingPolicy, you can continuously adjust the Requests of a group of Workloads to a reasonable and efficient level.

-> **Note:** Fields marked with * are required.
+> **Note:** If a Pod contains sidecar containers (e.g., Istio), we won’t modify them, and they will be excluded from recommendation calculations. We detect sidecars by diffing the container names between the workload’s Pod template and the actual Pod; any names that exist only in the Pod are treated as injected sidecars.

 ## Enable*

@@ -49,6 +49,19 @@ You can configure multiple TargetRefs to cover a broader set of Workloads.
 | **Name** | Any valid Workload name \| *empty* | No | Name of the Workload. If left empty, it matches **all Workloads** within the namespace or cluster (depending on `Namespace`). |
 | **Namespace** | Any valid namespace \| *empty* | No | Namespace of the Workload. If left empty, it matches **all namespaces** in the cluster. |

+**Name and Namespace support shell-style glob patterns** (`*`, `?`, and character classes like `[a-z]`); patterns match the entire value, and an empty field (or `*`) matches all.
+
+| Pattern | Meaning | Matches | Doesn’t match |
+|-------------------|-------------------------------------|--------------------------|--------------------|
+| `*` | Any value | `web`, `ns-1`, `default` ||
+| `web-*` | Values starting with `web-` | `web-1`, `web-prod-a` | `api-web-1` |
+| `*-prod` | Values ending with `-prod` | `core-prod`, `a-prod` | `prod-core` |
+| `front?` | `front` + exactly 1 char | `front1`, `fronta` | `front10`, `front` |
+| `job-??` | `job-` + exactly 2 chars | `job-01`, `job-ab` | `job-1`, `job-001` |
+| `ns-[0-9][0-9]-*` | `ns-` + two digits + `-` + anything | `ns-01-a`, `ns-99-x` | `ns-1-a` |
+| `db[0-2]` | `db0`, `db1`, or `db2` only | `db0`, `db2` | `db3`, `db-2` |
+| `[^0-9]*` | Does **not** start with a digit | `app1`, `ns-x` | `9-app` |
+
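For reference, a minimal sketch of how these glob patterns could appear when selecting Workloads. The YAML field names (`targetRefs`, `name`, `namespace`) are assumptions for illustration only; consult the actual AutoscalingPolicy schema for the exact spelling.

```yaml
# Hypothetical sketch; field names are assumed, not the authoritative schema.
targetRefs:
  - name: "web-*"        # matches web-1, web-prod-a, but not api-web-1
    namespace: "*-prod"  # matches core-prod, a-prod, but not prod-core
  - name: ""             # empty name matches all Workloads ...
    namespace: "team-a"  # ... within the team-a namespace
```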
 ## Update Schedule

**UpdateSchedule** defines **when** a Workload should use a particular update mode.
@@ -88,6 +101,8 @@ You can visit [here](https://crontab.cronhub.io/) to refer to how the Cron synta

 When the `UpdateMode` is set to either `ReCreate` or `InPlace`, the `OnCreate` mode will also be applied automatically. This ensures that when a Pod restarts normally, the newly created Pod will always receive the latest recommendations, regardless of the Drift Thresholds.

+For `ReCreate` operations, when attempting to evict a **single-replica** Deployment **without PVCs**, we perform a **rolling update** to avoid service interruption during the update.
+
 > **Note:** The `InPlace` mode has certain limitations and may automatically fall back to `ReCreate` in some cases. For details, see [InPlace Limitations](./best_practices_and_limitations#inplace-update-mode-limitations).

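As an illustration of the schedule behavior described above, a hedged sketch of a single schedule entry. The field names (`updateSchedule`, `cron`, `updateMode`) are assumptions; only the Cron syntax itself follows the linked reference.

```yaml
# Hypothetical sketch; field names are assumed.
updateSchedule:
  - cron: "0 2 * * *"    # 02:00 every day (see https://crontab.cronhub.io/)
    updateMode: InPlace  # ReCreate/InPlace also implies OnCreate for restarted Pods
```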
## Update Resources*
@@ -101,9 +116,9 @@ Available resources: `CPU` / `Memory`.
 - Only the selected resources will be actively updated.
 - This setting does **not** affect how recommendations are calculated.

-If you don’t have specific requirements or if you already use HPA, we recommend allowing **both CPU and Memory** to be managed.
+If you don’t have specific requirements or if you already use `HPA`, we recommend allowing **both `CPU` and `Memory`** to be managed.

-> **Note:** After optimizations have been applied, changing Update Resources will not roll back modifications that are already in effect. By default, we do not recommend updating this field. Instead, create a new AutoscalingPolicy and gradually replace the existing configuration.
+> **Note:** When you modify the `Update Resources`, an update operation may be triggered based on the deviation between the recommended value and the current value. This operation will take effect immediately once the conditions of the `Update Schedule` are met.

 ## Drift Thresholds

@@ -132,14 +147,23 @@ If the deviation for **any resource** in a Pod exceeds the threshold, the Pod wi
 | `ReCreate` | Roll back to the pre-policy **requests** by **recreating** target Workloads (rolling replace). | Restarts, brief downtime | Cluster does not support in-place vertical changes; require scheduler to reassign resources. | Ensure safe rolling strategy. **Limits** typically remain unchanged unless your controller handles them. |
 | `InPlace` | Roll back to the pre-policy **requests** via **in-place** Pod updates (no recreate). | Usually zero/low disruption | Cluster supports in-place vertical resizing; prioritize minimal disturbance. | Requires cluster/runtime support for in-place updates. **Limits** unchanged unless otherwise implemented. |

+For `ReCreate` operations, when attempting to evict a **single-replica** Deployment **without PVCs**, we perform a **rolling update** to avoid service interruption during the update.
+
 > **Note:** The `InPlace` mode has certain limitations and may automatically fall back to `ReCreate` in some cases. For details, see [InPlace Limitations](./best_practices_and_limitations#inplace-update-mode-limitations). When unexpected situations prevent us from restoring the Pod Request for 10 minutes, we will allow the configuration to be deleted directly without restoring the Pod Request.

 ## Limit Policy*

 **LimitPolicy** defines how Pod limits should be reset.
+By default, **we recommend using `RemoveLimit`** to ensure that a Workload can occasionally preempt more resources when needed.
+
+When using `Multiplier`, we suggest setting a reasonable lower bound for `CPU`/`Memory` recommendations. In rare cases (e.g., in testing environments where actual usage is extremely low), the recommended values may not be sufficient for stable Pod startup or handling sudden traffic spikes.

 | Field | Behavior |
 |---------------|----------------------------------------------------------------------------|
 | `RemoveLimit` | Remove Pod `limits` (no CPU/Memory caps). |
 | `KeepLimit` | Keep existing Pod `limits` unchanged. |
 | `Multiplier` | Recalculate `limits` by multiplying with the recommended request value. |
+
+When you modify the `Limit Policy`, an update operation may be triggered. This decision is based on the deviation between the current values and the recommended values, as well as whether existing Pods have their limits set according to the expected configuration. Once the conditions of the `Update Schedule` are met, the update will take effect immediately.
+
+> **Note:** When using `KeepLimit`, the final recommended values will never exceed your configured Limits. If you want Pods to be able to use more resources in certain cases, consider using `RemoveLimit` or `Multiplier` instead.
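To make the `Multiplier` behavior concrete, a hedged sketch with illustrative numbers; the field names (`limitPolicy`, `policy`, `multiplier`) are assumptions, not the exact schema.

```yaml
# Hypothetical sketch; field names and values are illustrative only.
# With a recommended request of cpu: 250m / memory: 256Mi and a multiplier of 2,
# the resulting limits would be cpu: 500m / memory: 512Mi.
limitPolicy:
  policy: Multiplier
  multiplier: 2
```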

src/content/guide/workload_autoscaler/best_practices_and_limitations.mdx

Lines changed: 26 additions & 14 deletions
@@ -12,28 +12,40 @@ This document describes best practices for the Workload Autoscaler and the limit

 Your Kubernetes cluster version must be **1.33 or higher**.

-### Memory limits (decrease vs. increase)
+### Memory Limits: Decrease vs Increase

-**Decreasing Memory Limit is not allowed in place.**
-In InPlace mode, the Workload Autoscaler will **not** proactively reduce a Pod’s Memory Limit. Memory Limits are only reassigned when the Pod is **recreated normally**.
+#### 🔻 Decreasing Memory Limit

-**Increasing Memory Limit may require container restarts.**
-If a workload (e.g., **Java** applications) cannot dynamically adapt to Memory Limit changes, configure the container’s **`ResizePolicy`** so that the **memory** resource is set to **`RestartContainer`**. Attempts to increase the Memory Limit will then automatically **restart** the corresponding container to apply the new limit.
+- **Not supported in `InPlace`.**
+  `InPlace` resizing does not allow lowering the Memory limit.
+
+- **Fallback behavior:**
+  When a new recommendation would reduce an existing Pod’s Memory limit, the Workload Autoscaler automatically falls back to `ReCreate` mode and recreates the Pod.
+
+#### 🔺 Increasing Memory Limit
+
+- **May require container restarts.**
+  Some workloads (e.g., **Java applications**) cannot dynamically adapt to Memory limit changes.
+
+- **Best practice:**
+  Configure the container’s **`ResizePolicy`** so that the **Memory** resource is set to **`RestartContainer`**.
+  In this case, attempts to increase the Memory limit will automatically **restart the container** to apply the new limit.
+
+#### Notes

-**Notes**
 - By default, the Workload Autoscaler sets the **ResizePolicy** for **all resources** of **all containers** to **`NotRequired`**.
 - If you have **manually configured** a container’s ResizePolicy for any resource, the Workload Autoscaler **will not overwrite** it. For details, see the
 [Kubernetes documentation example](https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/#example-1-resizing-cpu-without-restart).

 ### Pod QoS class must not change

-A Pod’s QoS class is determined at creation time (one of **Guaranteed**, **Burstable**, or **BestEffort**). InPlace updates must **not** cause a change in QoS class:
+A Pod’s QoS class is determined at creation time (one of **Guaranteed**, **Burstable**, or **BestEffort**). `InPlace` updates must **not** cause a change in QoS class:

-- **BestEffort Pods** (no CPU/memory requests or limits at startup): You **cannot** add any CPU/memory requests or limits, because adding requests would convert the Pod to **Burstable**, which is **not allowed** in InPlace updates. Therefore, BestEffort Pods **cannot** use in-place vertical scaling. If you need scaling, specify requests at creation time so the Pod is at least Burstable.
-- **Guaranteed Pods** (for every container, CPU and memory **requests equal limits**): After InPlace adjustments, each container must still satisfy **`requests == limits`**. To increase or decrease CPU/memory, you must update **both** request **and** limit to the **same** value. For example, going from 2 CPU to 3 CPU requires setting **both** request and limit to 3. You cannot change only one of them, or the Pod will no longer be Guaranteed.
-- **Burstable Pods** (have requests, but not all equal to limits, or some containers may have no requests): You may adjust CPU/memory, but **must not** turn the Pod into Guaranteed. It is forbidden to make **both CPU and memory requests equal to their limits** across all containers after the update; otherwise the Pod would become Guaranteed. You also must not clear all requests and turn the Pod into BestEffort. In short, the Pod **must keep its original QoS class** unchanged.
+- **BestEffort Pods** (no CPU/Memory requests or limits at startup): You **cannot** add any CPU/Memory requests or limits, because adding requests would convert the Pod to **Burstable**, which is **not allowed** in `InPlace` updates. Therefore, BestEffort Pods **cannot** use in-place vertical scaling. If you need scaling, specify requests at creation time so the Pod is at least Burstable.
+- **Guaranteed Pods** (for every container, CPU and Memory **requests equal limits**): After `InPlace` adjustments, each container must still satisfy **`requests == limits`**. To increase or decrease CPU/Memory, you must update **both** request **and** limit to the **same** value. For example, going from 2 CPU to 3 CPU requires setting **both** request and limit to 3. You cannot change only one of them, or the Pod will no longer be Guaranteed.
+- **Burstable Pods** (have requests, but not all equal to limits, or some containers may have no requests): You may adjust CPU/Memory, but **must not** turn the Pod into Guaranteed. It is forbidden to make **both CPU and Memory requests equal to their limits** across all containers after the update; otherwise the Pod would become Guaranteed. You also must not clear all requests and turn the Pod into BestEffort. In short, the Pod **must keep its original QoS class** unchanged.

-If an InPlace operation would violate any of the above QoS rules, the Workload Autoscaler **falls back to `ReCreate` mode** and explicitly recreates (re-schedules) the target Pod.
+If an `InPlace` operation violates any of the above QoS rules, the Workload Autoscaler **falls back to `ReCreate` mode** and explicitly recreates (re-schedules) the target Pod.

 > **Note:** Such fallback events are expected to occur only when a Workload is first configured with an AutoscalingPolicy or when certain related configurations of the AutoscalingPolicy are modified. They should not occur during normal operation.
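The `ResizePolicy` referenced above is the standard Kubernetes container field. A minimal sketch of a container that resizes CPU in place but restarts when its Memory limit changes; the Pod, container, and image names are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resize-demo                           # placeholder name
spec:
  containers:
    - name: app
      image: registry.example.com/app:latest  # placeholder image
      resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired          # CPU can change without a restart
        - resourceName: memory
          restartPolicy: RestartContainer     # Memory changes restart this container
      resources:
        requests:
          cpu: 500m
          memory: 512Mi
```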
@@ -45,7 +57,7 @@ In this scenario, the Workload Autoscaler will **fall back to `ReCreate` mode**

 ### Coexisting with HPA

-Using the Workload Autoscaler **together** with **HPA (Horizontal Pod Autoscaler)** can produce unexpected behavior. If you need both, configure them to manage **different resources**—for example, let HPA scale by **CPU usage**, while the Workload Autoscaler adjusts only **memory**.
+Using the Workload Autoscaler **together** with **HPA (Horizontal Pod Autoscaler)** can produce unexpected behavior. If you need both, configure them to manage **different resources**—for example, let HPA scale by **CPU usage**, while the Workload Autoscaler adjusts only **Memory**.
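A sketch of the split suggested above: an HPA that scales replicas on CPU only, while the Workload Autoscaler's Update Resources would be restricted to Memory. The target Deployment name is a placeholder.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa              # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                # placeholder target Deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu            # HPA owns CPU scaling; leave Memory to the Workload Autoscaler
        target:
          type: Utilization
          averageUtilization: 70
```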

 ## Best Practices

@@ -57,6 +69,6 @@ Whenever possible, set **resource requests** for every container in all workload

 Avoid specifying **limits** whenever feasible. Instead, set **requests** to place Pods in the **Burstable** QoS class.

-### Set a restart policy for workloads that cannot adapt memory InPlace
+### Set a restart policy for workloads that cannot adapt Memory InPlace

-For workloads like **Java** that cannot adjust to Memory Limit changes dynamically, manually configure the container’s **`ResizePolicy`** so that when the InPlace update modifies the Memory Limit, the **container will restart** to apply the new limit (set memory ResizePolicy to **`RestartContainer`**).
+For workloads like **Java** that cannot adjust to Memory Limit changes dynamically, manually configure the container’s **`ResizePolicy`** so that when the `InPlace` update modifies the Memory Limit, the **container will restart** to apply the new limit (set Memory ResizePolicy to **`RestartContainer`**).
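A minimal sketch of the requests-only pattern recommended above (names and values are placeholders); setting requests without limits keeps the Pod in the Burstable QoS class and eligible for in-place vertical scaling.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: burstable-demo                        # placeholder name
spec:
  containers:
    - name: app
      image: registry.example.com/app:latest  # placeholder image
      resources:
        requests:                             # requests only, no limits -> Burstable QoS
          cpu: 250m
          memory: 256Mi
```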

src/content/guide/workload_autoscaler/installation.mdx

Lines changed: 14 additions & 0 deletions
@@ -74,3 +74,17 @@ Similarly, the `Workload Autoscaler` component is uninstalled together with the
 to uninstall the `Workload Autoscaler` component independently, please contact our technical support team for assistance.

 > **Note:** Before uninstalling the Workload Autoscaler, please make sure that all AutoscalingPolicies have been deleted or disabled, and confirm that all Workloads have been restored to their original state.
+
+## Configure the Update/Evict Limiter
+
+By default, the **Workload Autoscaler** enables a **Limiter** that throttles the number of **in-place updates** and **Pod evictions**. This helps prevent large clusters from becoming unstable when many Pods are updated or evicted in a short period.
+
+You can tune the Limiter with the environment variables below. If not set, the defaults apply.
+
+| ENV var | Default | What it controls |
+|----------------------------|---------|--------------------------------------------------------------------------------|
+| `LIMITER_QUOTA_PER_WINDOW` | **5** | Tokens added to the bucket each window. |
+| `LIMITER_BURST` | **10** | Maximum tokens allowed in the bucket (peak operations within a window). |
+| `LIMITER_WINDOW_SECONDS` | **30** | Window length in seconds; every window adds `LIMITER_QUOTA_PER_WINDOW` tokens. |
+
+> **Note:** For eviction operations, when attempting to evict a **single-replica** Deployment **without PVCs**, we perform a **rolling update** to avoid service interruption during the update.
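A sketch of how these variables might be overridden. The snippet below is only the `env` fragment; where it goes (which Deployment and container run the Workload Autoscaler controller) depends on your installation.

```yaml
# Hypothetical fragment; attach to the controller container of your install.
env:
  - name: LIMITER_QUOTA_PER_WINDOW
    value: "3"    # 3 tokens added per window (default 5)
  - name: LIMITER_BURST
    value: "6"    # bucket holds at most 6 tokens (default 10)
  - name: LIMITER_WINDOW_SECONDS
    value: "60"   # window length in seconds (default 30)
```

With the defaults (5/10/30), the bucket allows bursts of up to 10 update or evict operations, with 5 tokens added back every 30 seconds.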

src/content/guide/workload_autoscaler/recommendation_policy.mdx

Lines changed: 3 additions & 2 deletions
@@ -9,7 +9,7 @@ It allows you to define the range of recommendation values, enabling more flexib

 This document explains the meaning and valid range of each field in the `Recommendation Policy`.

-> **Note:** Fields marked with * are required.
+> **Note:** For all containers, if the recommended values are below the minimums, the system automatically raises them to: CPU `20m` and Memory `20Mi`. This ensures that resource requests never fall below safe operational thresholds.

 ## Strategy Type*

@@ -86,7 +86,6 @@ For critical workloads, you can set it to 7 days to ensure recommendations accou
 ![resource_limits](./img/recommendation_policy/resource_limits.png)

 You can set both `Min` and `Max` limits for `CPU` and `Memory`. This ensures that the recommended values will not fall below or exceed the range you define.
-
 The `Max limit` is applied after the `Buffer` is calculated, meaning the final recommended value (including the `Buffer`) will not exceed the `Max limit`.

 For `Resource Limits`, you can use either percentages or absolute values:
@@ -100,6 +99,8 @@ In most cases, we recommend ***using percentages so the system can adjust based
 For example, if you set `CPU` to `30%` ~ `200%`, the final recommended value will never be lower than `30%` of the original Request,
 nor higher than `200%` of the original Request.

+We strongly recommend that you configure Min limits for both `CPU` and `Memory` resources to prevent recommended values from being too low in certain cases, which could cause Pods to fail to run properly.
+
 > **Note:** When using percentages for `Resource Limits`, you must ensure that all containers within the workloads governed by this `Recommendation Policy` have defined Request values for the corresponding resource. Otherwise, the system will not be able to calculate a recommendation.

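A small worked example of how the `Buffer` and the percentage `Max limit` compose; the numbers, the 50% buffer, and the field layout below are illustrative assumptions, not the exact schema.

```yaml
# Illustrative only; field names and the buffer value are assumed.
# Original container Request:          cpu: 1000m
# Raw recommendation from usage:       cpu: 1400m
# Buffer of 50% (assumed):             1400m * 1.5 = 2100m
# Max limit 200% of original Request:  cap at 2000m -> final recommendation 2000m
# Min limit 30% of original Request:   floor at 300m (not triggered here)
resourceLimits:
  cpu:
    min: "30%"
    max: "200%"
```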
## Evaluation Period*
