Skip to content

Conversation

@Xaseron
Copy link

@Xaseron Xaseron commented Nov 12, 2018

See JENKINS-54607.

This PR enables the intended behaviour that the timeout step sleeps for a predefined GRACE_PERIOD till the body will be forcefully canceled.
Otherwise a timeout(10) results in waiting 10 additional minutes till the execution will be forcefully canceled

@abayer
Copy link
Member

abayer commented Nov 12, 2018

Can you add a test reproducing the original problem this fixes? Thanks!

@Xaseron
Copy link
Author

Xaseron commented Nov 13, 2018

I've created a test case that failes with the original code and passes with the patch.

Copy link
Member

@dwnusbaum dwnusbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I'll check for an open Jira and create one if needed.

WorkflowJob p = story.j.jenkins.createProject(WorkflowJob.class, "p");
p.setDefinition(new CpsFlowDefinition("timeout(time: 15, unit: 'SECONDS') {unkillable()}", true));
story.j.assertBuildStatus(Result.ABORTED, p.scheduleBuild2(0).get());
assert p.getLastBuild().getDuration() < 30_000; // 30s
Copy link
Member

@dwnusbaum dwnusbaum Nov 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: You should use one of JUnit's assertion methods instead of a bare Java-level assert for better error messages in the event of failure, and so that the tests fail even if run without -ea:

Suggested change
assert p.getLastBuild().getDuration() < 30_000; // 30s
// Need to add `import static org.hamcrest.Matchers.lessThan;`
assertThat(p.getLastBuild().getDuration(), lessThan(30_000L)); // 30s

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, maybe clearer to use 22-23 seconds, since the grace period is 5 seconds in unit tests, we expect things to be canceled after 15 seconds + 5 seconds. Without the fix the original code would finish after around 31-32 seconds, so it is very close to passing in both scenarios.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The duration takes the startup time into account. 23s would not be enough.
I've printed the duration and it is 23110 ms. I changed the expected duration to be less than 29 seconds.

[INFO] Running org.jenkinsci.plugins.workflow.steps.TimeoutStepTest
=== Starting gracePeriod(org.jenkinsci.plugins.workflow.steps.TimeoutStepTest)
   0.154 [id=16]	INFO	o.jvnet.hudson.test.WarExploder#explode: Picking up existing exploded jenkins.war at /home/mcwr/work/workflow-basic-steps-plugin/target/jenkins-for-test
   1.050 [id=16]	INFO	o.jvnet.hudson.test.JenkinsRule#createWebServer: Running on http://localhost:37759/jenkins/
   2.069 [id=23]	INFO	jenkins.InitReactorRunner$1#onAttained: Started initialization
   3.487 [id=33]	WARNING	hudson.ClassicPluginStrategy#createClassJarFromWebInfClasses: Created /tmp/jenkins1976389466714786762tmp/ant/WEB-INF/lib/classes.jar; update plugin to a version created with a newer harness
   3.555 [id=35]	INFO	jenkins.InitReactorRunner$1#onAttained: Listed all plugins
   3.608 [id=37]	INFO	h.plugins.ansicolor.PluginImpl#start: AnsiColor: eliminating boring output (https://github.com/dblock/jenkins-ansicolor-plugin)
   7.143 [id=36]	INFO	jenkins.InitReactorRunner$1#onAttained: Prepared all plugins
   7.145 [id=29]	INFO	jenkins.InitReactorRunner$1#onAttained: Started all plugins
   7.154 [id=29]	INFO	jenkins.InitReactorRunner$1#onAttained: Augmented all extensions
   9.209 [id=37]	INFO	jenkins.InitReactorRunner$1#onAttained: Loaded all jobs
   9.863 [id=31]	INFO	jenkins.InitReactorRunner$1#onAttained: Completed initialization
  10.216 [p #1] Started
  13.393 [p #1] [Pipeline] timeout
  13.393 [p #1] Timeout set to expire in 15 sec
  13.394 [p #1] [Pipeline] {
  13.446 [p #1] [Pipeline] unkillable
  28.577 [p #1] Cancelling nested steps due to timeout
  28.578 [p #1] ignoring org.jenkinsci.plugins.workflow.steps.FlowInterruptedException
  33.185 [id=65]	INFO	o.j.p.workflow.job.WorkflowRun#finish: p #1 completed: ABORTED
  33.195 [p #1] Body did not finish within grace period; terminating with extreme prejudice
  33.195 [p #1] [Pipeline] }
  33.196 [p #1] [Pipeline] // timeout
  33.196 [p #1] [Pipeline] End of Pipeline
  33.229 [p #1] Timeout has been exceeded
  33.229 [p #1] Finished: ABORTED
Duration: 23110 ms

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a few runs and got durations of 21.518s, 21.217s, 20.881 and 20.882s, so it seems like the timing is pretty variable, maybe just based on OS differences or current machine load. Thanks for making the assertion change!

body.cancel(new ExceededTimeout());
forcible = true;
long now = System.currentTimeMillis();
end = now + GRACE_PERIOD;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, the previous code would set end = now + GRACE_PERIOD, and then the call to resetTimer would set end = now + timeout, so this was effectively dead code and the grace period was the timeout.

@dwnusbaum dwnusbaum changed the title Use Grace_Period for timeout [JENKINS-54607] Use Grace_Period for timeout Nov 13, 2018
@dwnusbaum dwnusbaum merged commit 8fa1262 into jenkinsci:stable Nov 14, 2018
@dwnusbaum
Copy link
Member

dwnusbaum commented Nov 14, 2018

Hmm, there might be some merge conflicts when merging from stable into master because of #74, but hopefully only minor changes will be necessary. Maybe time for us to get rid of the stable branch entirely?

dwnusbaum added a commit that referenced this pull request Nov 30, 2018
[JENKINS-54607] Merge #76 from stable into master
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants