Browser tests: reduce timeout for instantiation to reasonable value #1706

HeikoKlare · 2025-01-06T13:49:30Z

The timeout for browser instantiation in the browser tests has been increased to avoid flaky tests (via #1677). However, browser instantiation taking more than 10 seconds (the previous timeout value) is an indicator for something being wrong with the instantiation. There is no reason why instantiation may take that long during test execution but the same issue should never arise in productive use. Thus, such a case should be captured by tests and, if necessary, be fixed productively.

In order to do so, this change reduces the test timeout value again to a more reasonable value.

Related to #1676

Also see #1677 (comment)

github-actions · 2025-01-06T13:57:55Z

Test Results

501 files - 1 501 suites - 1 10m 43s ⏱️ - 5m 9s
4 297 tests - 37 4 285 ✅ - 35 11 💤 - 3 1 ❌ +1
16 538 runs - 37 16 432 ✅ - 34 105 💤 - 4 1 ❌ +1

For more details on these failures, see this check.

Results for commit 9313572. ± Comparison against base commit cc07c36.

This pull request removes 37 tests.

org.eclipse.swt.graphics.ImageWin32Tests ‑ testImageDataForDifferentFractionalZoomsShouldBeDifferent
org.eclipse.swt.graphics.ImageWin32Tests ‑ testImageShouldHaveDimesionAsPerZoomLevel
org.eclipse.swt.tests.win32.Test_org_eclipse_swt_dnd_DND ‑ testByteArrayTransfer
org.eclipse.swt.tests.win32.Test_org_eclipse_swt_dnd_DND ‑ testFileTransfer
org.eclipse.swt.tests.win32.Test_org_eclipse_swt_dnd_DND ‑ testHtmlTransfer
org.eclipse.swt.tests.win32.Test_org_eclipse_swt_dnd_DND ‑ testImageTransfer_fromCopiedImage
org.eclipse.swt.tests.win32.Test_org_eclipse_swt_dnd_DND ‑ testImageTransfer_fromImage
org.eclipse.swt.tests.win32.Test_org_eclipse_swt_dnd_DND ‑ testImageTransfer_fromImageData
org.eclipse.swt.tests.win32.Test_org_eclipse_swt_dnd_DND ‑ testImageTransfer_fromImageDataFromImage
org.eclipse.swt.tests.win32.Test_org_eclipse_swt_dnd_DND ‑ testRtfTransfer
…

♻️ This comment has been updated with latest results.

HeikoKlare · 2025-01-19T15:27:24Z

Builds sporadically show timeouts, e.g.:

Error:  Tests run: 380, Failures: 1, Errors: 0, Skipped: 12, Time elapsed: 193.3 s <<< FAILURE! -- in org.eclipse.swt.tests.junit.Test_org_eclipse_swt_browser_Browser
Error:  org.eclipse.swt.tests.junit.Test_org_eclipse_swt_browser_Browser.test_BrowserFunction_callback[browser flags: 0] -- Time elapsed: 66.79 s <<< FAILURE!
java.lang.AssertionError: creating browser took too long: 65267ms
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.assertTrue(Assert.java:42)
	at org.eclipse.swt.tests.junit.Test_org_eclipse_swt_browser_Browser.createBrowser(Test_org_eclipse_swt_browser_Browser.java:312)
	at org.eclipse.swt.tests.junit.Test_org_eclipse_swt_browser_Browser.setUp(Test_org_eclipse_swt_browser_Browser.java:185)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)

Might be related to eclipse-platform/eclipse.platform.ui#2734

fedejeanne · 2025-01-30T14:18:39Z

I rebased on master and solved the conflicts. Let's see if there are still (too many) test failures.

@HeikoKlare according to your comment (#1752 (comment)) this PR might need to be postponed until #1752 is merged, correct?

HeikoKlare · 2025-01-30T15:01:55Z

This PR should actually not need any other PR at all. The failing tests indicate that there is a bug in either the browser implementation or the test setup that are not aware of yet, so something we need to find out independent from any other PR (as I already mentioned in #1677 (comment)).

fedejeanne · 2025-01-31T07:51:12Z

What I mean is that "if we merge this PR after the ones that are already open then it is to be expected that there are no/less test failures due to timeouts" . But I see that you want to use this PR as a basis to check if the other PRs actually help (reduce/get rid of the test failures because of timeouts and freezes).

So, shall we merge this one then?

HeikoKlare · 2025-01-31T07:56:43Z

"if we merge this PR after the ones that are already open then it is to be expected that there are no/less test failures due to timeouts"

As we know, that's not the case. We still need to find the issue for the browser instantiation in tests taking so long. All the fixes we currently make regarding timeouts are only supposed to avoid that the actual errors (which we may not know yet) do not lead to a blocking UI. They do not properly solve the underlying issue.

So, shall we merge this one then?

I am not sure if people will be happy about that. Most Windows builds will fail with test failures then. So I wouldn't do that but rather find out the reasons for the failure first.

HeikoKlare · 2025-02-01T19:20:46Z

Note: the thread dump written after 60 seconds of test execution time show that the usually the main thread is stuck in display,sleep() / OS.WaitMessage() while waiting for browser instantiation:

[2025-02-01 17:15:07 +0000] org.eclipse.swt.tests.junit.Test_org_eclipse_swt_browser_Browser.test_BrowserFunction_multiprocess[browser flags: 0]() ran for more than 60 seconds
totalMemory:             146800640
freeMemory (before GC):   96621696
freeMemory (after GC):    58666000
"main" prio=5 Id=1 RUNNABLE (in native)
	at app//org.eclipse.swt.internal.win32.OS.WaitMessage(Native Method)
	at app//org.eclipse.swt.widgets.Display.sleep(Display.java:4803)
	at app//org.eclipse.swt.browser.Edge.processOSMessagesUntil(Edge.java:471)
	at app//org.eclipse.swt.browser.Edge$WebViewProvider.getWebViewWrapper(Edge.java:370)
	at app//org.eclipse.swt.browser.Edge$WebViewProvider.getWebView(Edge.java:381)
	at app//org.eclipse.swt.browser.Edge.getUrl(Edge.java:894)
	at app//org.eclipse.swt.browser.Browser.getUrl(Browser.java:771)
	at app//org.eclipse.swt.tests.junit.Test_org_eclipse_swt_browser_Browser.createBrowser(Test_org_eclipse_swt_browser_Browser.java:308)

One might think that there is some wake message, but the actual issue is that the OS message triggering the WebView instantiation callback did actually not arrive. When adding an explicit timed, repeated awakening, the display still goes into sleep repeatedly since no message arrived from the OS so far. In addition, adding debug output in the loop waiting for the OS message shows that the loop is spinned quite often.

Further finding: Once might also think that there is some message scheduled asynchronously on the display, which is not processed by the processOSMessagesUntil() (as it only processed OS messages) but may need to be processed (for whatever reason) to proceed with the processing the queue. In that case, whoever, display.sleep() would always returning immediately, as the display's synchronizer queue is not empty. Debug output also shows that this is usually not the case: the display properly goes to sleep in the scenarios where browser instantiation takes long, so there is actually nothing in the event queue (neither from the OS nor scheduled at the display by the application).

So the underlying issue is that the OS does not call the WebView instantiation callback for a long time (> 60 seconds). It is also interesting that usually after 65 seconds the callback is finally executed.

fedejeanne · 2025-02-04T08:57:10Z

I've also been looking at it and all I came up with is:

It's true that the problem is the callback is simply being called after ~65s (the delay I've seen is pretty consistent, always between 65.2s and 65.7s).
I've seen the callback take the ~65s delay "only" 1-2 times per CI-run (and if it does it twice, the 2nd time is usually in the next 30 seconds after the 1st one).

Regarding point nr. 1, I used this commit to log the request and the callback and I saw this output:

1st delay: 65.237.957.400 ns

1356349440500 - ICoreWebView2Environment.CreateCoreWebView2Controller()
...
1421587397900 - Edge.createControllerInitializationCallback()

**2nd delay: 65.262.087.700 ns **

1423716947100 - ICoreWebView2Environment.CreateCoreWebView2Controller()
...
1488979034800 - Edge.createControllerInitializationCallback()

A quick internet search revealed that Instantiating Edge can be slow, but what I still don't get is:

Why is it sometimes slow and not all the time ?
Why is the delay consistently something ~65s ?

@HeikoKlare let's discuss how to proceed with this PR in today's daily.

fedejeanne · 2025-02-05T16:02:51Z

FYI there is a 2nd commit in this PR, one is from #1797

HeikoKlare · 2025-02-05T16:04:50Z

Yes, that was intended for testing purposes. We still see issues with WebView intialization sometimes taking long, so this is not ready for being merged anyway. I thus changed it back to "draft".

The timeout for browser instantiation in the browser tests has been increased to avoid flaky tests. However, browser instantiation taking that long is an indicator for something being wrong with the instantiation. There is no reason why instantiation may take that long during test execution but the same issue should never arise in productive use, thus such a case should be captured by tests and, if necessary, be fixed productively. In order to do so, this change reduces the test timeout value again to a more reasonable value. Since the first Edge browser instantiation takes rather long at least in the GitHub actions test environment, that instantiation is isolated in an execution before the test class. Related to eclipse-platform#1676

HeikoKlare marked this pull request as ready for review January 6, 2025 13:58

HeikoKlare force-pushed the browser-test-initialization-timeout branch 5 times, most recently from a5eacf0 to 228c11e Compare January 14, 2025 14:15

HeikoKlare force-pushed the browser-test-initialization-timeout branch from 228c11e to 8cf3573 Compare January 14, 2025 18:58

HeikoKlare mentioned this pull request Jan 19, 2025

experimental rescaling option leads to application blocked eclipse-platform/eclipse.platform.ui#2734

Closed

HeikoKlare force-pushed the browser-test-initialization-timeout branch from 8cf3573 to 194a8a0 Compare January 19, 2025 15:28

HeikoKlare mentioned this pull request Jan 21, 2025

Application freezes due to using Edge vi-eclipse/Eclipse-Platform#187

Closed

HeikoKlare force-pushed the browser-test-initialization-timeout branch 12 times, most recently from 8afbe01 to b74ebd7 Compare January 30, 2025 12:16

fedejeanne force-pushed the browser-test-initialization-timeout branch from b74ebd7 to eda5bd4 Compare January 30, 2025 14:15

HeikoKlare force-pushed the browser-test-initialization-timeout branch from eda5bd4 to 7a661ab Compare January 30, 2025 15:52

HeikoKlare force-pushed the browser-test-initialization-timeout branch 5 times, most recently from fd49490 to 47582ad Compare February 1, 2025 10:56

HeikoKlare mentioned this pull request Feb 1, 2025

[Linux] Fix random test failures of browser tests #1523 #1788

Merged

HeikoKlare force-pushed the browser-test-initialization-timeout branch 5 times, most recently from 0222db7 to 57b8f65 Compare February 1, 2025 19:20

HeikoKlare force-pushed the browser-test-initialization-timeout branch 6 times, most recently from f6bb7be to c785da1 Compare February 3, 2025 20:35

HeikoKlare mentioned this pull request Feb 4, 2025

"Unspecified error Waiting for Edge operation to terminate timed out" #1801

Open

HeikoKlare force-pushed the browser-test-initialization-timeout branch from c785da1 to 8c91436 Compare February 5, 2025 14:01

HeikoKlare marked this pull request as draft February 5, 2025 16:04

HeikoKlare force-pushed the browser-test-initialization-timeout branch 3 times, most recently from 9bd1044 to 6843b3e Compare February 11, 2025 16:17

HeikoKlare added 2 commits February 13, 2025 16:25

Open shell in browser tests

9313572

HeikoKlare force-pushed the browser-test-initialization-timeout branch from 6843b3e to 9313572 Compare February 13, 2025 15:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Browser tests: reduce timeout for instantiation to reasonable value #1706

Browser tests: reduce timeout for instantiation to reasonable value #1706

HeikoKlare commented Jan 6, 2025

github-actions bot commented Jan 6, 2025 •

edited

Loading

HeikoKlare commented Jan 19, 2025

fedejeanne commented Jan 30, 2025

HeikoKlare commented Jan 30, 2025

fedejeanne commented Jan 31, 2025

HeikoKlare commented Jan 31, 2025

HeikoKlare commented Feb 1, 2025

fedejeanne commented Feb 4, 2025 •

edited

Loading

fedejeanne commented Feb 5, 2025

HeikoKlare commented Feb 5, 2025

Browser tests: reduce timeout for instantiation to reasonable value #1706

Are you sure you want to change the base?

Browser tests: reduce timeout for instantiation to reasonable value #1706

Conversation

HeikoKlare commented Jan 6, 2025

github-actions bot commented Jan 6, 2025 • edited Loading

Test Results

HeikoKlare commented Jan 19, 2025

fedejeanne commented Jan 30, 2025

HeikoKlare commented Jan 30, 2025

fedejeanne commented Jan 31, 2025

HeikoKlare commented Jan 31, 2025

HeikoKlare commented Feb 1, 2025

fedejeanne commented Feb 4, 2025 • edited Loading

fedejeanne commented Feb 5, 2025

HeikoKlare commented Feb 5, 2025

github-actions bot commented Jan 6, 2025 •

edited

Loading

fedejeanne commented Feb 4, 2025 •

edited

Loading