Skip to content

Conversation

@jglick
Copy link
Member

@jglick jglick commented Dec 12, 2018

Recent builds seem to be running this test over and over, finally timing out. I want to know if there is something about this test, or if there is some infrastructure change, etc. (I have not been able to track the issue down to a specific source commit.)

@jglick jglick added the work-in-progress The PR is under active development, not ready to the final review label Dec 12, 2018
@jglick jglick requested a review from batmat December 12, 2018 03:29
@batmat batmat closed this Dec 12, 2018
@batmat batmat reopened this Dec 12, 2018
@batmat
Copy link
Member

batmat commented Dec 12, 2018

@jglick seems like the first run succeeded. I just closed/reopened to retrigger and see if this behaves differently.

How about use the way I used in #3788 and run only this test multiple times and see what it gives?

@jglick
Copy link
Member Author

jglick commented Dec 12, 2018

Well build 1 passed indeed, but wound up running JenkinsBuildsAndWorkspacesDirectoriesTest a dozen times. So something is weird with the Surefire runner it seems.

https://ci.jenkins.io/job/Core/job/jenkins/job/master/1179/execution/node/38/log/?consoleFull
[WARNING] Corrupted stdin stream in forked JVM 2. See the dump file /home/jenkins/workspace/Core_jenkins_master-CZB5BLBJFXZE2BKR63DSXQKXRZJAOWBX73QP55GU5KMX7AHAF6CQ/test/target/surefire-reports/2018-12-12T10-57-04_077-jvmRun2.dumpstream
@jglick
Copy link
Member Author

jglick commented Dec 12, 2018

#3772 suggests that tests calling Jenkins.restart are the issue, as JenkinsBuildsAndWorkspacesDirectoriesTest does. I think I know this could be fixed.

stage("${buildType} Publishing") {
if (runTests) {
junit healthScaleFactor: 20.0, testResults: '*/target/surefire-reports/*.xml'
archiveArtifacts allowEmptyArchive: true, artifacts: '**/target/surefire-reports/*.dumpstream'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting, wasn''t aware of this feature. NIT then reading https://maven.apache.org/surefire/maven-surefire-plugin/faq.html#dumpfiles don't we actually want .dump in case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what *.dump is exactly, but anyway an actual *.dumpstream file led me to a source code issue. Can always improve as needed.

@jglick
Copy link
Member Author

jglick commented Dec 12, 2018

I think I know this could be fixed.

Hmm, my idea was reuseForks=false as in jenkinsci/workflow-cps-plugin#178, but test-pom already does that!

@jglick
Copy link
Member Author

jglick commented Dec 12, 2018

I cannot reproduce the error noted in #3772 running that single test, but weird behavior appears once you start running other tests too:

mvn -f test surefire:test -Dtest=DisablePluginCommandTest,JenkinsBuildsAndWorkspacesDirectoriesTest,NodesTest,TransientActionFactoryTest -Pall-tests
[INFO] Running hudson.cli.DisablePluginCommandTest
[INFO] Running jenkins.model.JenkinsBuildsAndWorkspacesDirectoriesTest
[INFO] Running hudson.cli.DisablePluginCommandTest
..
[INFO] Results:
[INFO] 
[WARNING] Flakes: 
[WARNING] hudson.cli.DisablePluginCommandTest.canDisableDependentPluginWrongOrderStrategyAll(hudson.cli.DisablePluginCommandTest)
[INFO]   Run 1: PASS
[ERROR]   Run 2: DisablePluginCommandTest.canDisableDependentPluginWrongOrderStrategyAll
[INFO]   Run 3: PASS
[INFO] 
[WARNING] hudson.cli.DisablePluginCommandTest.disablePluginsStrategyAll(hudson.cli.DisablePluginCommandTest)
[ERROR]   Run 1: DisablePluginCommandTest.disablePluginsStrategyAll » Reactor java.lang.Error: ...
[INFO]   Run 2: PASS
[INFO]   Run 3: PASS
[INFO] 
[WARNING] jenkins.model.JenkinsBuildsAndWorkspacesDirectoriesTest.externalBuildDirectorySymlinks(jenkins.model.JenkinsBuildsAndWorkspacesDirectoriesTest)
[ERROR]   Run 1: JenkinsBuildsAndWorkspacesDirectoriesTest.externalBuildDirectorySymlinks
[INFO]   Run 2: PASS
[INFO] 
[INFO] 
[WARNING] Tests run: 65, Failures: 0, Errors: 0, Skipped: 0, Flakes: 3

@jglick
Copy link
Member Author

jglick commented Dec 12, 2018

I think the problem is that the test JVM is getting a UnixLifecycle. Using RestartableJenkinsRule is fine since the restarts are purely in application code.

@jglick jglick changed the title [DO NOT MERGE] Timeout diagnosis Fixing tests which attempted to restart the Surefire booter JVM Dec 12, 2018
@jglick jglick removed the work-in-progress The PR is under active development, not ready to the final review label Dec 12, 2018
@jglick
Copy link
Member Author

jglick commented Dec 12, 2018

As an aside, I just noticed that a59814e puts the burden of calling cleanUp on each Lifecycle implementation, which seems silly—you would expect this to be part of Jenkins.restart/safeRestart.

Copy link
Contributor

@MRamonLeon MRamonLeon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean use RestartableJenkinsRule in the DisablePluginCommandTest and remove the Ignores @jglick? I didn't use it because EnablePluginCommandTest was so and I didn't need to do things on Jenkins after the restart.

@jglick
Copy link
Member Author

jglick commented Dec 12, 2018

Do you mean use RestartableJenkinsRule in the DisablePluginCommandTest

No, because RestartableJenkinsRule is used when the restart is initiated by the test directly (by means of separating it into blocks), whereas these tests are calling some code which in turn calls Jenkins.restart outside of the test’s direct control.

…ducible locally).

Amends code added in jenkinsci#2082.
java.lang.RuntimeException: Unexpected issues encountered during cleanUp: offset 0, count -1, length 5
	at jenkins.model.Jenkins.cleanUp(Jenkins.java:3319)
	at org.jvnet.hudson.test.JenkinsRule.after(JenkinsRule.java:495)
	at org.jvnet.hudson.test.JenkinsRule$1.evaluate(JenkinsRule.java:565)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.StringIndexOutOfBoundsException: offset 0, count -1, length 5
	at java.base/java.lang.String.checkBoundsOffCount(String.java:3304)
	at java.base/java.lang.String.<init>(String.java:465)
	at hudson.TcpSlaveAgentListener$PingAgentProtocol.connect(TcpSlaveAgentListener.java:416)
	at hudson.TcpSlaveAgentListener.shutdown(TcpSlaveAgentListener.java:196)
	at jenkins.model.Jenkins._cleanUpShutdownTcpSlaveAgent(Jenkins.java:3548)
	at jenkins.model.Jenkins.cleanUp(Jenkins.java:3294)
	... 6 more
Copy link
Member

@batmat batmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's merge once we get a green build.

I'll see to be if we can analyze all these through https://issues.jenkins-ci.org/browse/JENKINS-55122 once we get the situation back in control.

stage("${buildType} Publishing") {
if (runTests) {
junit healthScaleFactor: 20.0, testResults: '*/target/surefire-reports/*.xml'
archiveArtifacts allowEmptyArchive: true, artifacts: '**/target/surefire-reports/*.dumpstream'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting, wasn''t aware of this feature. NIT then reading https://maven.apache.org/surefire/maven-surefire-plugin/faq.html#dumpfiles don't we actually want .dump in case?

socket.getRemoteSocketAddress(),
new String(ping, "UTF-8"),
new String(response, 0, responseLength, "UTF-8")
responseLength > 0 && responseLength <= response.length ?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR this fix is for UsageStatisticsTest last failure:

image

@batmat batmat added the ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback label Dec 13, 2018
@jglick jglick merged commit b826edb into jenkinsci:master Dec 13, 2018
@jglick jglick deleted the timeout-diagnosis branch December 13, 2018 16:39
@batmat
Copy link
Member

batmat commented Dec 29, 2018

@LinuxSuRen in short, if you need that very fix, there's none.
Where are you seeing it? Thanks

@LinuxSuRen
Copy link
Member

During test junit test of all my plugins. I wonder that shall we should fix this in the older LTS? @batmat

@batmat
Copy link
Member

batmat commented Dec 29, 2018

You mean when running your plugin's builds using Java 11? Sorry I am trying to make sure I get it.

Anyway if you have a case for backporting this, this will need a JIRA labelled 'lts-candidate' and explanarions as to why it would be backported.

Cheers

@LinuxSuRen
Copy link
Member

I run my junit test from the IDEA. Java version is 8.

@batmat
Copy link
Member

batmat commented Dec 30, 2018 via email

@LinuxSuRen
Copy link
Member

They're the same results. You can test it by using my repo. The link is here https://github.com/LinuxSuRen/alauda-devops-pipeline-plugin/tree/add-test-case.

@batmat
Copy link
Member

batmat commented Dec 30, 2018

@LinuxSuRen the failure I see locally seems totally unrelated, just the typical issue with a wrong version of a plugin that prevents Jenkins from starting up.

[INFO] Running InjectedTest
[ERROR] Tests run: 23, Failures: 0, Errors: 5, Skipped: 0, Time elapsed: 7.448 s <<< FAILURE! - in InjectedTest
[ERROR] testPluginActive(org.jvnet.hudson.test.PluginAutomaticTestBuilder$OtherTests)  Time elapsed: 0.006 s  <<< ERROR!
java.lang.Error: Plugin workflow-basic-steps failed to start
...

I would recommend the following:

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants