@jglick jglick commented Jan 17, 2025

Revealed by a CloudBees CI test failing after jenkinsci/workflow-api-plugin#368:

java.lang.AssertionError: Synchronizing on WorkflowRun before metadataGuard may cause deadlocks
	at PluginClassLoader for workflow-job//org.jenkinsci.plugins.workflow.job.WorkflowRun.getMetadataGuard(WorkflowRun.java:219)
	at PluginClassLoader for workflow-job//org.jenkinsci.plugins.workflow.job.WorkflowRun.doKill(WorkflowRun.java:495)
	at PluginClassLoader for workflow-job//org.jenkinsci.plugins.workflow.job.WorkflowRun.httpKill(WorkflowRun.java:513)
	at PluginClassLoader for workflow-job//org.jenkinsci.plugins.workflow.job.WorkflowRun.doStop(WorkflowRun.java:892)

Here getOneOffExecutor() is null immediately after startup, before the build has had a chance to resume.

I am not sure whether this could have happened previously under realistic timing conditions, or perhaps only under more exotic conditions involving unloadable builds. Either way, code inspection shows that doStop was indeed acquiring locks in the wrong order whenever this clause was hit, so this code path could not have worked (for years at least; I did not look up exactly how long).
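
To illustrate the kind of lock-order check that the assertion message describes, here is a minimal, self-contained sketch. It is not the actual WorkflowRun code: the `Run` class, `stopWrongOrder`, `stopCorrectOrder`, and `kill` are hypothetical names, and it assumes the guard accessor rejects callers that already hold the object's own monitor (checked here with `Thread.holdsLock`). How the real doStop came to hold the monitor is not shown in this conversation.

```java
public class LockOrderSketch {
    /** Hypothetical stand-in for a build object; not the real WorkflowRun. */
    static class Run {
        private final Object metadataGuard = new Object();
        private boolean killed;

        /** Returns the guard, refusing callers that already hold this object's monitor. */
        Object getMetadataGuard() {
            if (Thread.holdsLock(this)) {
                throw new AssertionError(
                        "Synchronizing on Run before metadataGuard may cause deadlocks");
            }
            return metadataGuard;
        }

        /** Takes the guard first, then the object's monitor: the intended order. */
        void kill() {
            synchronized (getMetadataGuard()) {
                synchronized (this) {
                    killed = true;
                }
            }
        }

        /** Wrong order: holds the object's monitor while asking for the guard. */
        synchronized void stopWrongOrder() {
            kill();
        }

        /** Correct order: calls kill() without holding the object's monitor. */
        void stopCorrectOrder() {
            kill();
        }

        synchronized boolean isKilled() {
            return killed;
        }
    }

    public static void main(String[] args) {
        Run run = new Run();
        try {
            run.stopWrongOrder();
        } catch (AssertionError e) {
            System.out.println("wrong order rejected: " + e.getMessage());
        }
        run.stopCorrectOrder();
        System.out.println("killed = " + run.isKilled());
    }
}
```

The point of failing fast in the accessor is that a wrong-order caller is reported on every call, rather than only when another thread happens to take the two locks in the opposite order at the same moment.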

@jglick jglick requested a review from a team as a code owner January 17, 2025 12:49
@jglick jglick added the bug label Jan 17, 2025
@jglick jglick merged commit 33a0c6f into jenkinsci:master Jan 17, 2025
17 checks passed
@jglick jglick deleted the WorkflowRun.doStop branch January 17, 2025 19:18