Be quieter handling CpsFlowExecution.owner == null in suspendAll
#788
@Restricted(DoNotUse.class)
@Terminator(attains = FlowExecutionList.EXECUTIONS_SUSPENDED)
public static void suspendAll() {
CpsFlowExecution exec = null;
Outermost catch clause seemed redundant. (Timeout.close does not throw exceptions, so it was not for that.)
try {
if (execution instanceof CpsFlowExecution) {
CpsFlowExecution cpsExec = (CpsFlowExecution)execution;
if (execution instanceof CpsFlowExecution) {
CpsFlowExecution cpsExec = (CpsFlowExecution) execution;
try {
cpsExec.checkAndAbortNonresumableBuild();

LOGGER.log(Level.FINE, "waiting to suspend {0}", execution);
exec = (CpsFlowExecution) execution;
// Like waitForSuspension but with a timeout:
if (exec.programPromise != null) {
if (cpsExec.programPromise != null) {
simplifying
});
}
cpsExec.getOwner().getListener().getLogger().close();
if (cpsExec.owner != null) {
main fix
Any idea how this would be reachable? Corruption during resumption or something? Perhaps jenkinsci/workflow-api-plugin#304 has made it possible? If I understand right, in this case FlowExecutionList.runningTasks contains a FlowExecutionOwner which successfully returns a FlowExecution from .get(), but for which FlowExecution.owner on the returned object is null, which seems problematic and unexpected.
I am afraid I do not know; I just saw this in a log.
I suppose [Workflow]Run.reload deserializes execution, but then WorkflowRun.onLoad fails before calling getExecution, or getExecution fails before calling fetchedExecution.onLoad(new Owner(this)), or unmarshal fails partway through, etc. Presumably the build was badly corrupted somehow. The point of this PR is just to avoid unnecessary stack traces after that.
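For illustration only, the guard pattern discussed here can be sketched in isolation. This is not the plugin's code: `Owner`, `Execution`, and this `suspendAll` are hypothetical stand-ins, assuming only that an execution's `owner` field may be left null when loading fails partway and that the log should be closed only when an owner exists.

```java
import java.util.ArrayList;
import java.util.List;

public class SuspendAllSketch {
    /** Hypothetical stand-in for an execution owner; only tracks whether its log was closed. */
    static class Owner {
        boolean logClosed = false;

        void closeLog() {
            logClosed = true;
        }
    }

    /** Hypothetical stand-in for an execution whose owner may be null after a failed load. */
    static class Execution {
        final Owner owner;

        Execution(Owner owner) {
            this.owner = owner;
        }
    }

    /**
     * Closes the log of every execution that has an owner and quietly skips the rest,
     * instead of dereferencing a null owner. Returns the number of skipped executions.
     */
    static int suspendAll(List<Execution> executions) {
        int skipped = 0;
        for (Execution execution : executions) {
            if (execution.owner != null) {
                execution.owner.closeLog();
            } else {
                skipped++; // assumption: a corrupted build, nothing to close
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        Owner healthy = new Owner();
        List<Execution> executions = new ArrayList<>();
        executions.add(new Execution(healthy));
        executions.add(new Execution(null)); // simulates the null-owner case
        int skipped = suspendAll(executions);
        System.out.println("skipped=" + skipped + " logClosed=" + healthy.logClosed);
    }
}
```

In this sketch the null-owner execution is counted and skipped rather than producing a `NullPointerException`, which mirrors the stated goal of avoiding unnecessary stack traces for an already corrupted build.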
For what it's worth, though, I recently saw another case of a CpsFlowExecution with a null owner, which is why I am wondering whether something has changed:
java.lang.IllegalStateException: List of flow heads unset for CpsFlowExecution[null]
at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.getCurrentHeads(CpsFlowExecution.java:1018)
... insignificant, just a user loading some build page that accessed the heads for an execution ...
I looked at the XML for that execution on disk (based on the thread name) and head was non-null and pointed to a FlowEndNode that did exist in workflow/, so it wasn't obvious what might have gone wrong.
@jtnord I believe it is unrelated.
See #669 (comment). I did in fact observe