
GH-2809: Integration tests for JenaSystem#init #3203

Open: wants to merge 1 commit into base: main

Conversation

arne-bdt
Contributor

The test code triggered a "Deadlock in JenaSystem.init()" (GH-2787 #2787) in Jena 5.2.0.
--> It seems to be stable now.

GitHub issue resolved #2809

Pull request Description:
The tests in #2809 reliably reproduced the deadlock in JenaSystem.init() from GH-2787.
--> Currently, these tests don't seem to trigger any deadlock. They should be useful for keeping it that way.

Since the JMH output is set to SILENT, there is no verbose output when the tests run successfully.

Downside: These tests add about 7 seconds to the test execution.
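
As a rough illustration of the kind of check described above, here is a minimal sketch that runs an initialization action from several threads at once and reports whether all of them finish within a timeout. It is not the test code from this PR: `ConcurrentInitCheck` and `completesWithin` are hypothetical names, and the sketch uses only the JDK rather than JMH. Note that static initialization happens only once per class loader, so a real test would need a fresh JVM (or class loader) for each attempt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration; not part of Jena or of this PR. */
final class ConcurrentInitCheck {

    /**
     * Runs initAction from the given number of threads at roughly the same
     * moment. Returns true if every thread finishes before the timeout,
     * false if at least one is still running (a possible deadlock).
     */
    static boolean completesWithin(Runnable initAction, int threads, long timeoutMillis)
            throws Exception {
        // Daemon threads: a thread stuck in a real deadlock must not keep the JVM alive.
        ExecutorService pool = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        CountDownLatch startGate = new CountDownLatch(1);
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(() -> {
                    startGate.await();   // hold all workers until the gate opens
                    initAction.run();
                    return null;
                }));
            }
            startGate.countDown();       // release all workers together

            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            for (Future<?> f : futures) {
                long remaining = Math.max(deadline - System.nanoTime(), 0L);
                try {
                    f.get(remaining, TimeUnit.NANOSECONDS);
                } catch (TimeoutException e) {
                    return false;        // at least one worker did not finish in time
                }
            }
            return true;
        } finally {
            pool.shutdownNow();          // best effort; blocked workers may not respond
        }
    }
}
```

With Jena on the classpath, one could pass `JenaSystem::init` as the action, for example `ConcurrentInitCheck.completesWithin(JenaSystem::init, 8, 10_000)`; the thread count and timeout here are arbitrary values chosen for illustration.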


By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the Contributor's Agreement.


See the Apache Jena "Contributing" guide.

@arne-bdt arne-bdt marked this pull request as draft May 19, 2025 05:29
@arne-bdt
Contributor Author

The Maven build failed with:

Error:  Tests run: 58, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: 0.193 s <<< FAILURE! -- in org.apache.jena.test.rdfconnection.TestRDFConnectionFuseki
Error:  org.apache.jena.test.rdfconnection.TestRDFConnectionFuseki.named_graph_load_remote_4 -- Time elapsed: 0.004 s <<< ERROR!

Unfortunately, I cannot reproduce the error on my machine.

I can't see how or why this might have something to do with my PR...

Does anyone have an idea?

@afs
Member

afs commented May 21, 2025

> Downside: These tests add about 7 seconds to the test execution.

7 seconds isn't long.

We can at least add these to the test suite - we may then decide not to run them every time.

@afs
Member

afs commented May 21, 2025

> Unfortunately, I cannot reproduce the error on my machine.
>
> I can't see how or why this might have something to do with my PR...
>
> Does anyone have an idea?

This is a GitHub-only error. The GitHub build servers can be assumed to be heavily loaded VMs, and time of day is a factor.
We have seen pause times of 60 seconds and longer in code that isn't even making any OS calls.

While we made some changes so that commonly recurring GitHub-specific errors went away, we never reached a deep understanding of the root cause. It is impossible to reproduce, and I only have so many GH credits per month to run 20+ jobs in parallel repeatedly.

The failing tests involve networking, but they are deterministic: start-server, test, stop-server. So how a "no bytes" message can come back when the test hasn't yet run the code to shut down the server is bizarre. It's not impossible that the issue is in the JDK/OS (running out of some resource due to delayed cleanup; some TCP cleanup is async and can build up under load) and/or in the way JUnit fires the post-test clean-up. Many servers starting and stopping is not a realistic user scenario.

Ignore the failure - rerun the job. (PS I did - it passed.)

There has been a recent (in the last few days) recurrence of this after a quiet spell.

PS - there are other failure modes about at the moment, like GitHub Actions failing to start and just saying "error" with no other output.
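
For readers unfamiliar with the start-server, test, stop-server pattern mentioned above, the following is a minimal sketch of that lifecycle. It assumes the JDK's built-in `com.sun.net.httpserver.HttpServer` as a stand-in for a Fuseki server, and the `/ping` endpoint and the `ServerLifecycleSketch` class are made up for illustration; it is not the code of the failing Jena tests.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of a start-server, test, stop-server cycle. */
final class ServerLifecycleSketch {

    static String roundTrip() throws Exception {
        // 1. Start: port 0 asks the OS for any free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/ping", exchange -> {
            byte[] body = "pong".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (var out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        try {
            // 2. Test: one request against the running server.
            int port = server.getAddress().getPort();
            HttpRequest request = HttpRequest
                    .newBuilder(URI.create("http://127.0.0.1:" + port + "/ping"))
                    .build();
            HttpClient client = HttpClient.newHttpClient();
            return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
        } finally {
            // 3. Stop: runs only after the request has completed or failed.
            server.stop(0);
        }
    }
}
```

In this shape, the stop step cannot run before the request returns, which is why an empty response from a server that has not yet been stopped is surprising.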

@arne-bdt arne-bdt marked this pull request as ready for review May 23, 2025 10:17
Successfully merging this pull request may close these issues.

Integration tests for JenaSystem#init