content/en/docs/architecture/timeouts.md
32 additions & 4 deletions
@@ -64,13 +64,14 @@ In OpenShift CI, this timeout and grace period apply to the `ci-operator` orches

 ```yaml
 plank: # Prow's controller to launch Pods for jobs
-  default_decoration_configs:
-    '*':
-      grace_period: 30m0s
+  default_decoration_config_entries:
+  - config:
+      grace_period: 1h0m0s
       timeout: 4h0m0s
-    'org/repo': # overwrite the job timeout at repo level
+  - config:
       grace_period: 45m0s
       timeout: 6h0m0s
+    repo: org1/repo1 # overwrite the job timeout at repo level
 ```

 In special cases, long-running, generated jobs can raise the cap with job-specific configuration [like][generated-timeout-example]:
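The relationship between the default entry and the repo-level entry in the new `default_decoration_config_entries` list can be sketched as follows. This is only an illustrative model, not Prow's implementation: it assumes that an entry without a `repo` key applies to every job, that a later matching entry overrides fields set by an earlier one, and that `repo` is matched exactly. The helper names `parse_duration` and `effective_config` are hypothetical.

```python
import re
from datetime import timedelta

# Only the h/m/s units that appear in the configuration above are handled.
_UNITS = {"h": "hours", "m": "minutes", "s": "seconds"}


def parse_duration(text):
    """Parse a Go-style duration such as '4h0m0s' into a timedelta."""
    parts = re.findall(r"(\d+)([hms])", text)
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    total = timedelta()
    for number, unit in parts:
        total += timedelta(**{_UNITS[unit]: int(number)})
    return total


def effective_config(entries, repo):
    """Merge matching entries in order; later entries win (assumed semantics)."""
    merged = {}
    for entry in entries:
        target = entry.get("repo")  # a missing 'repo' is treated as "applies to all"
        if target is None or target == repo:
            merged.update(entry["config"])
    return {key: parse_duration(value) for key, value in merged.items()}


# Mirrors the entries shown in the diff above.
ENTRIES = [
    {"config": {"grace_period": "1h0m0s", "timeout": "4h0m0s"}},
    {"config": {"grace_period": "45m0s", "timeout": "6h0m0s"}, "repo": "org1/repo1"},
]

if __name__ == "__main__":
    print(effective_config(ENTRIES, "org1/repo1"))
    print(effective_config(ENTRIES, "some-other-org/repo"))
```

Under these assumptions, a job in `org1/repo1` resolves to the repo-level values, while any other repository falls back to the first entry.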
@@ -148,6 +149,33 @@ ref:
 The `pod.spec.activeDeadlineSeconds` setting on a `Pod` only implicitly bounds the amount of time that a `Pod` executes for on a Kubernetes cluster. The active deadline begins at the first moment that a `kubelet` acknowledges the `Pod`, which is after it is scheduled to a specific node but before it pulls images, sets up a container sandbox, _etc_. It is therefore possible to exceed the active deadline without ever having a container in the `Pod` execute. Please see the [API documentation](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#podspec-v1-core) for more details. For these reasons, no timeout configured in the system makes use of this setting, instead relying on a thin wrapper around the executing code that's injected by Prow itself.
 {{< /alert >}}

+#### How to configure a customized timeout
+
+If you need a longer timeout than the default 24 hours, but no more than 72 hours, set it at one of two levels.
+At [repository level](https://github.com/openshift/release/blob/6a5999d35c9bedca66a608cf5a9a2ad6bff49712/core-services/prow/02_config/_config.yaml#L442), add a `config` section for your repo as below:
+```yaml
+plank:
+  default_decoration_config_entries:
+  ...
+  - config:
+      grace_period: 1h30m0s
+      timeout: 36h0m0s
+    repo: org2/repo2 # overwrite the job timeout at repo level
+```
+At [job level](https://github.com/openshift/release/blob/5f3a72424aeee5027525e6dd471235139ef77108/ci-operator/config/openshift/release/openshift-release-master__ci-4.21.yaml#L88), add a `timeout` field for your job as below:
+```yaml
+- as: any-job-name-you-have
+  interval: 4h
+  steps:
+    cluster_profile: aws-2
+    workflow: openshift-upgrade-aws-ovn
+  timeout: 36h0m0s
+```
+
+{{< alert title="Note" color="info" >}}
+If you use a longer timeout, you might also need to reach out to the [DPP team](https://devservices.dpp.openshift.com/support/) to make sure your cloud account allows running OCP clusters longer than this timeout.
+{{< /alert >}}
+
 ## How Interruptions May Be Handled

 Two main approaches exist to handling interruptions for a test process: first, the test process itself may listen for and handle `SIGTERM`; second, `post` steps may be declared in a test `workflow` to be run after an interruption occurs. The first approach is most useful when relevant state for responding to the interrupt exists only in the test process itself, and the response is fairly short. This approach has the downside of requiring complex test process code and signal handling implementation. The second approach is suggested as it is more robust and tunable. In this approach, state needed to respond to the interrupt should be stored in the [`${SHARED_DIR}`](/docs/architecture/step-registry/#sharing-data-between-steps) for use by the `post` step. The `post` step may be marked as [best-effort](/docs/architecture/step-registry/#marking-post-steps-best-effort) if it only gathers artifacts or cleans up resources. Examples of both approaches follow.
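As a rough illustration of the first approach only, a test process written in Python might handle `SIGTERM` along these lines. This is a minimal sketch, not code from the documented file: `run_steps`, `steps`, and `cleanup` are hypothetical names, and it assumes the cleanup is short enough to finish within the configured grace period.

```python
import signal
import threading

# Set by the signal handler; checked between units of test work.
stop_requested = threading.Event()


def _handle_sigterm(signum, frame):
    """Record the interruption; the main loop decides when to stop."""
    stop_requested.set()


def run_steps(steps, cleanup):
    """Run callables in order, stopping early once SIGTERM has been received.

    `cleanup` always runs, so state held only by this process can still be
    released or saved after an interruption.
    """
    stop_requested.clear()
    previous = signal.signal(signal.SIGTERM, _handle_sigterm)
    completed = []
    try:
        for step in steps:
            if stop_requested.is_set():
                break  # interrupted: skip the remaining steps
            step()
            completed.append(step.__name__)
    finally:
        cleanup()
        signal.signal(signal.SIGTERM, previous)  # restore the earlier handler
    return completed
```

The handler only sets a flag, which keeps the signal-handling code small; the trade-off is that a long-running step is not interrupted midway, so each step should be short relative to the grace period.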