src/content/guide/workload_autoscaler/autoscaling_policy.mdx (27 additions, 3 deletions)
@@ -7,7 +7,7 @@ title: AutoscalingPolicy
**AutoscalingPolicy** defines **which Workloads** should have their **Requests and Limits** automatically adjusted, **when** these adjustments should occur, and **how** they should be applied.
By properly configuring an AutoscalingPolicy, you can continuously adjust the Requests of a group of Workloads to a reasonable and efficient level.
-> **Note:** Fields marked with * are required.
+> **Note:** If a Pod contains sidecar containers (e.g., Istio), we won’t modify them, and they will be excluded from recommendation calculations. We detect sidecars by diffing the container names between the workload’s Pod template and the actual Pod; any names that exist only in the Pod are treated as injected sidecars.
## Enable*
@@ -49,6 +49,19 @@ You can configure multiple TargetRefs to cover a broader set of Workloads.
|**Name**| Any valid Workload name \|*empty*| No | Name of the Workload. If left empty, it matches **all Workloads** within the namespace or cluster (depending on `Namespace`). |
|**Namespace**| Any valid namespace \|*empty*| No | Namespace of the Workload. If left empty, it matches **all namespaces** in the cluster. |
+
+**Name and Namespace support shell-style glob patterns** (`*`, `?`, and character classes like `[a-z]`); patterns match the entire value, and an empty field (or `*`) matches all.
+
+| Pattern | Meaning | Matches | Does not match |
+|---|---|---|---|
+|`ns-[0-9][0-9]-*`|`ns-` + two digits + `-` + anything |`ns-01-a`, `ns-99-x`|`ns-1-a`|
+|`db[0-2]`|`db0`, `db1`, or `db2` only |`db0`, `db2`|`db3`, `db-2`|
+|`[^0-9]*`| Does **not** start with a digit |`app1`, `ns-x`|`9-app`|
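For illustration, here is a minimal sketch of how glob-based targeting could look in a policy manifest. The `apiVersion`, the exact field paths (`spec.targetRefs`, `name`, `namespace`), and the object names are assumptions for this example, not the documented schema:

```yaml
# Hypothetical AutoscalingPolicy manifest -- field names are illustrative assumptions.
apiVersion: autoscaling.example.io/v1alpha1   # assumed API group/version
kind: AutoscalingPolicy
metadata:
  name: optimize-db-workloads
spec:
  targetRefs:
    # Matches db0, db1, db2 in any namespace named "ns-" + two digits + "-..."
    - name: "db[0-2]"
      namespace: "ns-[0-9][0-9]-*"
    # An empty Name matches all Workloads in the given namespace.
    - name: ""
      namespace: staging
```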
## Update Schedule
**UpdateSchedule** defines **when** a Workload should use a particular update mode.
@@ -88,6 +101,8 @@ You can visit [here](https://crontab.cronhub.io/) to refer to how the Cron synta
When the `UpdateMode` is set to either `ReCreate` or `InPlace`, the `OnCreate` mode will also be applied automatically. This ensures that when a Pod restarts normally, the newly created Pod will always receive the latest recommendations, regardless of the Drift Thresholds.
+
+For `ReCreate` operations, when attempting to evict a **single-replica** Deployment **without PVCs**, we perform a **rolling update** to avoid service interruption during the update.
+
> **Note:** The `InPlace` mode has certain limitations and may automatically fall back to `ReCreate` in some cases. For details, see [InPlace Limitations](./best_practices_and_limitations#inplace-update-mode-limitations).
## Update Resources*
@@ -101,9 +116,9 @@ Available resources: `CPU` / `Memory`.
- Only the selected resources will be actively updated.
- This setting does **not** affect how recommendations are calculated.
-If you don’t have specific requirements or if you already use HPA, we recommend allowing **both CPU and Memory** to be managed.
+If you don’t have specific requirements or if you already use `HPA`, we recommend allowing **both `CPU` and `Memory`** to be managed.
-> **Note:** After optimizations have been applied, changing Update Resources will not roll back modifications that are already in effect. By default, we do not recommend updating this field. Instead, create a new AutoscalingPolicy and gradually replace the existing configuration.
+> **Note:** When you modify the `Update Resources`, an update operation may be triggered based on the deviation between the recommended value and the current value. This operation will take effect immediately once the conditions of the `Update Schedule` are met.
## Drift Thresholds
@@ -132,14 +147,23 @@ If the deviation for **any resource** in a Pod exceeds the threshold, the Pod wi
|`ReCreate`| Roll back to the pre-policy **requests** by **recreating** target Workloads (rolling replace). | Restarts, brief downtime | Cluster does not support in-place vertical changes; require scheduler to reassign resources. | Ensure safe rolling strategy. **Limits** typically remain unchanged unless your controller handles them. |
|`InPlace`| Roll back to the pre-policy **requests** via **in-place** Pod updates (no recreate). | Usually zero/low disruption | Cluster supports in-place vertical resizing; prioritize minimal disturbance. | Requires cluster/runtime support for in-place updates. **Limits** unchanged unless otherwise implemented. |
+
+For `ReCreate` operations, when attempting to evict a **single-replica** Deployment **without PVCs**, we perform a **rolling update** to avoid service interruption during the update.
+
> **Note:** The `InPlace` mode has certain limitations and may automatically fall back to `ReCreate` in some cases. For details, see [InPlace Limitations](./best_practices_and_limitations#inplace-update-mode-limitations). If unexpected circumstances prevent us from restoring the Pod Request for 10 minutes, we allow the configuration to be deleted directly without restoring the Pod Request.
## Limit Policy*
**LimitPolicy** defines how Pod limits should be reset.
+
+By default, **we recommend using `RemoveLimit`** to ensure that a Workload can occasionally preempt more resources when needed.
+
+When using `Multiplier`, we suggest setting a reasonable lower bound for `CPU`/`Memory` recommendations. In rare cases (e.g., in testing environments where actual usage is extremely low), the recommended values may not be sufficient for stable Pod startup or handling sudden traffic spikes.
+
| Policy | Description |
|---|---|
|`RemoveLimit`| Remove Pod `limits` (no CPU/Memory caps). |
|`KeepLimit`| Keep existing Pod `limits` unchanged. |
|`Multiplier`| Recalculate `limits` by multiplying the recommended request value by the configured multiplier. |
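As a rough sketch of the arithmetic behind `Multiplier` (the field names `limitPolicy` and `multiplier` below are assumptions, not the documented schema): with a recommended CPU request of `250m` and a multiplier of `2`, the resulting limit would be `500m`.

```yaml
# Hypothetical limit-policy snippet -- field names are illustrative assumptions.
limitPolicy:
  type: Multiplier
  multiplier: 2        # limit = recommended request x 2
# Example: recommended CPU request 250m   -> limit 500m
#          recommended Memory request 256Mi -> limit 512Mi
```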
+
+When you modify the `Limit Policy`, an update operation may be triggered. This decision is based on the deviation between the current values and the recommended values, as well as whether existing Pods have their limits set according to the expected configuration. Once the conditions of the `Update Schedule` are met, the update will take effect immediately.
+
+> **Note:** When using `KeepLimit`, the final recommended values will never exceed your configured Limits. If you want Pods to be able to use more resources in certain cases, consider using `RemoveLimit` or `Multiplier` instead.
src/content/guide/workload_autoscaler/best_practices_and_limitations.mdx (26 additions, 14 deletions)
@@ -12,28 +12,40 @@ This document describes best practices for the Workload Autoscaler and the limit
Your Kubernetes cluster version must be **1.33 or higher**.
-### Memory limits (decrease vs. increase)
+### Memory Limits: Decrease vs Increase
-**Decreasing Memory Limit is not allowed in place.**
-In InPlace mode, the Workload Autoscaler will **not** proactively reduce a Pod’s Memory Limit. Memory Limits are only reassigned when the Pod is **recreated normally**.
-**Increasing Memory Limit may require container restarts.**
-If a workload (e.g., **Java** applications) cannot dynamically adapt to Memory Limit changes, configure the container’s **`ResizePolicy`** so that the **memory** resource is set to **`RestartContainer`**. Attempts to increase the Memory Limit will then automatically **restart** the corresponding container to apply the new limit.
+#### 🔻 Decreasing Memory Limit
+
+- **Not supported in `InPlace`.**
+  `InPlace` resizing does not allow lowering the Memory limit.
+- **Fallback behavior:**
+  When a new recommendation would reduce an existing Pod’s Memory limit, the Workload Autoscaler automatically falls back to `ReCreate` mode and recreates the Pod.
+
+#### 🔺 Increasing Memory Limit
+
+- **May require container restarts.**
+  Some workloads (e.g., **Java applications**) cannot dynamically adapt to Memory limit changes.
+- **Best practice:**
+  Configure the container’s **`ResizePolicy`** so that the **Memory** resource is set to **`RestartContainer`**.
+  In this case, attempts to increase the Memory limit will automatically **restart the container** to apply the new limit (see the snippet after this list).
+
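A minimal container spec showing this setup; the container name, image, and resource figures are placeholders:

```yaml
# Pod template fragment: restart the container when its Memory limit changes,
# while allowing CPU to be resized in place without a restart.
spec:
  containers:
    - name: java-app                          # placeholder name
      image: example/java-app:1.0             # placeholder image
      resizePolicy:
        - resourceName: memory
          restartPolicy: RestartContainer     # restart to apply a new Memory limit
        - resourceName: cpu
          restartPolicy: NotRequired          # CPU can change in place
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
        limits:
          memory: 2Gi
```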
-**Notes**
+#### Notes
- By default, the Workload Autoscaler sets the **ResizePolicy** for **all resources** of **all containers** to **`NotRequired`**.
- If you have **manually configured** a container’s ResizePolicy for any resource, the Workload Autoscaler **will not overwrite** it. For details, see the
-A Pod’s QoS class is determined at creation time (one of **Guaranteed**, **Burstable**, or **BestEffort**). InPlace updates must **not** cause a change in QoS class:
+A Pod’s QoS class is determined at creation time (one of **Guaranteed**, **Burstable**, or **BestEffort**). `InPlace` updates must **not** cause a change in QoS class:
-- **BestEffort Pods** (no CPU/memory requests or limits at startup): You **cannot** add any CPU/memory requests or limits, because adding requests would convert the Pod to **Burstable**, which is **not allowed** in InPlace updates. Therefore, BestEffort Pods **cannot** use in-place vertical scaling. If you need scaling, specify requests at creation time so the Pod is at least Burstable.
-- **Guaranteed Pods** (for every container, CPU and memory **requests equal limits**): After InPlace adjustments, each container must still satisfy **`requests == limits`**. To increase or decrease CPU/memory, you must update **both** request **and** limit to the **same** value. For example, going from 2 CPU to 3 CPU requires setting **both** request and limit to 3. You cannot change only one of them, or the Pod will no longer be Guaranteed.
-- **Burstable Pods** (have requests, but not all equal to limits, or some containers may have no requests): You may adjust CPU/memory, but **must not** turn the Pod into Guaranteed. It is forbidden to make **both CPU and memory requests equal to their limits** across all containers after the update; otherwise the Pod would become Guaranteed. You also must not clear all requests and turn the Pod into BestEffort. In short, the Pod **must keep its original QoS class** unchanged.
+- **BestEffort Pods** (no CPU/Memory requests or limits at startup): You **cannot** add any CPU/Memory requests or limits, because adding requests would convert the Pod to **Burstable**, which is **not allowed** in `InPlace` updates. Therefore, BestEffort Pods **cannot** use in-place vertical scaling. If you need scaling, specify requests at creation time so the Pod is at least Burstable.
+- **Guaranteed Pods** (for every container, CPU and Memory **requests equal limits**): After `InPlace` adjustments, each container must still satisfy **`requests == limits`**. To increase or decrease CPU/Memory, you must update **both** request **and** limit to the **same** value. For example, going from 2 CPU to 3 CPU requires setting **both** request and limit to 3 (see the sketch after this list). You cannot change only one of them, or the Pod will no longer be Guaranteed.
+- **Burstable Pods** (have requests, but not all equal to limits, or some containers may have no requests): You may adjust CPU/Memory, but **must not** turn the Pod into Guaranteed. It is forbidden to make **both CPU and Memory requests equal to their limits** across all containers after the update; otherwise the Pod would become Guaranteed. You also must not clear all requests and turn the Pod into BestEffort. In short, the Pod **must keep its original QoS class** unchanged.
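A minimal sketch of the Guaranteed case from the list above, resizing CPU from 2 to 3 while keeping `requests == limits` (the Memory values are placeholders):

```yaml
# Before the in-place resize (Guaranteed: requests == limits for every resource)
resources:
  requests:
    cpu: "2"
    memory: 4Gi
  limits:
    cpu: "2"
    memory: 4Gi

# After the in-place resize: both request and limit move to 3 CPU together,
# so the Pod stays Guaranteed. Changing only one of them would break
# requests == limits and is not allowed.
resources:
  requests:
    cpu: "3"
    memory: 4Gi
  limits:
    cpu: "3"
    memory: 4Gi
```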
-If an InPlace operation would violate any of the above QoS rules, the Workload Autoscaler **falls back to `ReCreate` mode** and explicitly recreates (re-schedules) the target Pod.
+If an `InPlace` operation violates any of the above QoS rules, the Workload Autoscaler **falls back to `ReCreate` mode** and explicitly recreates (re-schedules) the target Pod.
> **Note:** Such fallback events are expected to occur only when a Workload is first configured with an AutoscalingPolicy or when certain related configurations of the AutoscalingPolicy are modified. They should not occur during normal operation.
@@ -45,7 +57,7 @@ In this scenario, the Workload Autoscaler will **fall back to `ReCreate` mode**
### Coexisting with HPA
-Using the Workload Autoscaler **together** with **HPA (Horizontal Pod Autoscaler)** can produce unexpected behavior. If you need both, configure them to manage **different resources**—for example, let HPA scale by **CPU usage**, while the Workload Autoscaler adjusts only **memory**.
+Using the Workload Autoscaler **together** with **HPA (Horizontal Pod Autoscaler)** can produce unexpected behavior. If you need both, configure them to manage **different resources**—for example, let HPA scale by **CPU usage**, while the Workload Autoscaler adjusts only **Memory**.
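For example, an HPA that scales replicas only on CPU utilization is the standard `autoscaling/v2` resource below; pairing it with an AutoscalingPolicy whose Update Resources includes only Memory keeps the two controllers from adjusting the same resource (the target Deployment name and thresholds are placeholders):

```yaml
# Standard HPA scaling on CPU usage only; Memory is left to the Workload Autoscaler.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                     # placeholder workload name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # placeholder threshold
```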
## Best Practices
@@ -57,6 +69,6 @@ Whenever possible, set **resource requests** for every container in all workload
Avoid specifying **limits** whenever feasible. Instead, set **requests** to place Pods in the **Burstable** QoS class.
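A container spec following both practices, with requests set and no limits, which places the Pod in the Burstable QoS class (the names and values are placeholders):

```yaml
# Requests only, no limits: the Pod is Burstable and can use more than its
# requests when the node has spare capacity.
spec:
  containers:
    - name: api                   # placeholder name
      image: example/api:1.0      # placeholder image
      resources:
        requests:
          cpu: 200m
          memory: 256Mi
        # no limits section
```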
-### Set a restart policy for workloads that cannot adapt memory InPlace
+### Set a restart policy for workloads that cannot adapt Memory InPlace

-For workloads like **Java** that cannot adjust to Memory Limit changes dynamically, manually configure the container’s **`ResizePolicy`** so that when the InPlace update modifies the Memory Limit, the **container will restart** to apply the new limit (set memory ResizePolicy to **`RestartContainer`**).
+For workloads like **Java** that cannot adjust to Memory Limit changes dynamically, manually configure the container’s **`ResizePolicy`** so that when the `InPlace` update modifies the Memory Limit, the **container will restart** to apply the new limit (set Memory ResizePolicy to **`RestartContainer`**).
src/content/guide/workload_autoscaler/installation.mdx (14 additions, 0 deletions)
@@ -74,3 +74,17 @@ Similarly, the `Workload Autoscaler` component is uninstalled together with the
to uninstall the `Workload Autoscaler` component independently, please contact our technical support team for assistance.
> **Note:** Before uninstalling the Workload Autoscaler, please make sure that all AutoscalingPolicies have been deleted or disabled, and confirm that all Workloads have been restored to their original state.
+
+## Configure the Update/Evict Limiter
+
+By default, the **Workload Autoscaler** enables a **Limiter** that throttles the number of **in-place updates** and **Pod evictions**. This helps prevent large clusters from becoming unstable when many Pods are updated or evicted in a short period.
+
+You can tune the Limiter with the environment variables below. If not set, the defaults apply.
+
+| Variable | Default | Description |
+|---|---|---|
+|`LIMITER_QUOTA_PER_WINDOW`|**5**| Tokens added to the bucket each window. |
+|`LIMITER_BURST`|**10**| Maximum tokens allowed in the bucket (peak operations within a window). |
+|`LIMITER_WINDOW_SECONDS`|**30**| Window length in seconds; every window adds `LIMITER_QUOTA_PER_WINDOW` tokens. |
+
+> **Note:** For eviction operations, when attempting to evict a **single-replica** Deployment **without PVCs**, we perform a **rolling update** to avoid service interruption during the update.
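As a sketch, the variables can be set as environment variables on the component that performs the updates and evictions; which object to patch depends on your installation, and everything below other than the variable names is a placeholder:

```yaml
# Env fragment: roughly 10 operations every 60 seconds, with bursts of up to 20.
env:
  - name: LIMITER_QUOTA_PER_WINDOW
    value: "10"
  - name: LIMITER_BURST
    value: "20"
  - name: LIMITER_WINDOW_SECONDS
    value: "60"
```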
src/content/guide/workload_autoscaler/recommendation_policy.mdx (3 additions, 2 deletions)
@@ -9,7 +9,7 @@ It allows you to define the range of recommendation values, enabling more flexib
This document explains the meaning and valid range of each field in the `Recommendation Policy`.
-> **Note:** Fields marked with * are required.
+> **Note:** For all containers, if the recommended values are below the minimums, the system automatically raises them to: CPU `20m` and Memory `20Mi`. This ensures that resource requests never fall below safe operational thresholds.
## Strategy Type*
@@ -86,7 +86,6 @@ For critical workloads, you can set it to 7 days to ensure recommendations accou
You can set both `Min` and `Max` limits for `CPU` and `Memory`. This ensures that the recommended values will not fall below or exceed the range you define.
The `Max limit` is applied after the `Buffer` is calculated, meaning the final recommended value (including the `Buffer`) will not exceed the `Max limit`.
For `Resource Limits`, you can use either percentages or absolute values:
@@ -100,6 +99,8 @@ In most cases, we recommend ***using percentages so the system can adjust based
For example, if you set `CPU` to `30%` ~ `200%`, the final recommended value will never be lower than `30%` of the original Request,
nor higher than `200%` of the original Request.
+
+We strongly recommend that you configure Min limits for both `CPU` and `Memory` resources to prevent recommended values from being too low in certain cases, which could cause Pods to fail to run properly.
+
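As an illustration only, a `30%`–`200%` CPU range with an absolute Memory floor might be expressed as below; the field names are assumptions, not the documented schema:

```yaml
# Hypothetical Recommendation Policy fragment -- field names are illustrative.
resourceLimits:
  cpu:
    min: "30%"     # never recommend below 30% of the original Request
    max: "200%"    # never recommend above 200% (applied after the Buffer)
  memory:
    min: 128Mi     # absolute floor, so Pods can still start under very low usage
    max: "200%"
```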
> **Note:** When using percentages for `Resource Limits`, you must ensure that all containers within the workloads governed by this `Recommendation Policy` have defined Request values for the corresponding resource. Otherwise, the system will not be able to calculate a recommendation.
0 commit comments