Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance, eliminate ConcurrentModificationException and eliminate stucks #558

Merged
merged 99 commits into from
Jan 17, 2024

Conversation

mPokornyETM
Copy link
Contributor

@mPokornyETM mPokornyETM commented Sep 7, 2023

fix #550
fix #503
fix #149
fix #566
fix #597
close #151

In our company we use many resources (more the 400) and many jobs (100 ~ 1000) in one Jenkins instance.
We try to upgrade to latest Jenkins LTS and we recognize, that our jobs are often frozen. Also allocating existing resources takes really long time (sometimes minutes)
After the first analysis, we recognized, that it got stuck by lockable-resources-plugin. Because (and this is the paradox) the Jenkins is faster, then before.
Therefore it was necessary to refactor synchronization in this plugin.
As first step we provided a pressure-test, that can simulate many jobs, many nodes, many resources in many parallel stages.
This was a good idea, because we can vary time (and also with older Jenkins versions) simulate the ConcurrentModificationException raises and also we can simulate, when plugin get stuck.

Implementing of synchronization solve both issues, but the plugin being very slowly. Therefore we tried to improve performance. Currently (in this PR) it looks pretty fine.

We will be happy, when somebody else can review this changes. Probably it will be hard, because there are many.

PS: due many changes is the review hard. Therefore I decide to split it to smaller PRs. There are many other fixes as well.
After them it shall be much more easy.

Testing done

Added new pressure-test. All other test are OK on Windows and Unix

Proposed upgrade guidelines

This change may break your pipelines, when you access LockableResourcesManager or LockableResource from groovy, because all this operation can not be synchronized correctly. Please check your pipeline (or shared library) code, before you upgrade to this version.

TODO

still open

  • trim spaces from resource name, otherwise it will stuck
  • throw error when count of requiered resources is bigger then current count
  • unqueue all items contains the buildm, when the build has been canceled (to improve performance)
  • sort itmes in the queue to eliminate inverse precedense issues and make the search algoritmus faster.
  • check whay the searching takes still too long. When the build has over 100 items it takes also on good server over 3 minutes

Submitter checklist

  • The Jira / Github issue, if it exists, is well-described.
  • The changelog entries and upgrade guidelines are appropriate for the audience affected by the change (users or developers, depending on the change) and are in the imperative mood (see examples).
    • The changelog generator for plugins uses the pull request title as the changelog entry.
    • Fill in the Proposed upgrade guidelines section only if there are breaking changes or changes that may require extra steps from users during the upgrade.
  • There is automated testing or an explanation that explains why this change has no tests.
  • New public functions for internal use only are annotated with @NoExternalUse. In case it is used by non java code the Used by {@code <panel>.jelly} Javadocs are annotated.
  • [ ] New or substantially changed JavaScript is not defined inline and does not call eval to ease the future introduction of Content Security Policy (CSP) directives (see documentation).
  • [ ] For dependency updates, there are links to external changelogs and, if possible, full differentials.
  • [ ] For new APIs and extension points, there is a link to at least one consumer.
  • Any localizations are transferred to *.properties files.
  • [ ] Changes in the interface are documented also as examples.

Maintainer checklist

Before the changes are marked as ready-for-merge:

  • There is at least one (1) approval for the pull request and no outstanding requests for change.
  • Conversations in the pull request are over, or it is explicit that a reviewer is not blocking the change.
  • Changelog entries in the pull request title and/or Proposed changelog entries are accurate, human-readable, and in the imperative mood.
  • Proper changelog labels are set so that the changelog can be generated automatically. See also release-drafter-labels.
  • If the change needs additional upgrade steps from users, the upgrade-guide-needed label is set and there is a Proposed upgrade guidelines section in the pull request title (see example).
  • java code changes are tested by automated test.

@lhoehnETM many thx for support ;-)

Copy link
Contributor

@offa offa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the latest release ends in a deadlock within hours, this PR works fine so far! 👍

@mPokornyETM mPokornyETM marked this pull request as ready for review January 17, 2024 21:06
@mPokornyETM mPokornyETM added this to the Feature committed milestone Jan 17, 2024
@mPokornyETM
Copy link
Contributor Author

While the latest release ends in a deadlock within hours, this PR works fine so far! 👍

@offa thx for your confirmation, I will merge this changes, when the last commit has been checked.

@mPokornyETM mPokornyETM disabled auto-merge January 17, 2024 22:19
@mPokornyETM mPokornyETM merged commit 6a4cfee into jenkinsci:master Jan 17, 2024
15 checks passed
@offa offa mentioned this pull request Jan 19, 2024
15 tasks
@clockrun
Copy link

Hi mPokornyETM,
After update to version 1132, I'm still experiencing this issue on Jenkins startup.

java.util.ConcurrentModificationException
	at java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1095)
	at java.base/java.util.ArrayList$Itr.next(ArrayList.java:1049)
	at org.jenkins.plugins.lockableresources.FreeDeadJobs.freePostMortemResources(FreeDeadJobs.java:33)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
Caused: java.lang.reflect.InvocationTargetException
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:118)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:109)
Caused: java.lang.Error
	at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:115)
	at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:185)
	at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:305)
	at jenkins.model.Jenkins$5.runTask(Jenkins.java:1170)
	at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:221)
	at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:120)
	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused: org.jvnet.hudson.reactor.ReactorException
	at org.jvnet.hudson.reactor.Reactor.execute(Reactor.java:290)
	at jenkins.InitReactorRunner.run(InitReactorRunner.java:49)
	at jenkins.model.Jenkins.executeReactor(Jenkins.java:1205)
	at jenkins.model.Jenkins.<init>(Jenkins.java:992)
	at hudson.model.Hudson.<init>(Hudson.java:86)
	at hudson.model.Hudson.<init>(Hudson.java:82)
	at hudson.WebAppMain$3.run(WebAppMain.java:247)
Caused: hudson.util.HudsonFailedToLoad
	at hudson.WebAppMain$3.run(WebAppMain.java:264)

@clockrun
Copy link

I created #623 for this

@mPokornyETM
Copy link
Contributor Author

Hi
in all version bevore 1232.v512d6c434eb_d it might happens.
The current release (1232.v512d6c434eb_d) shall fixed it,
BR

@mPokornyETM mPokornyETM deleted the perf-optimize branch March 19, 2024 23:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment