Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Edge::getWebViewWrapper deadlock threat #1466 #1966

Conversation

amartya4256
Copy link
Contributor

@amartya4256 amartya4256 commented Mar 31, 2025

This PR makes sure while calling Edge::getWebViewWrapper with waitForPendingWebviewTasksToFinish set to true that the method also checks for the completion of the webViewWrapperFuture before calling a join() on it, which can some time lead to deadlock as we allow forced exceptional completion of lastWebViewTask on timeout.

contributes to #1466

Copy link
Contributor

github-actions bot commented Mar 31, 2025

Test Results

   545 files  ±0     545 suites  ±0   30m 40s ⏱️ + 2m 11s
 4 367 tests ±0   4 355 ✅ ±0   12 💤 ±0  0 ❌ ±0 
16 616 runs  ±0  16 512 ✅ ±0  104 💤 ±0  0 ❌ ±0 

Results for commit 5eb3ba2. ± Comparison against base commit 72695f8.

♻️ This comment has been updated with latest results.

@amartya4256
Copy link
Contributor Author

Ran the failing tests locally - it runs fine with the proposed changes.

@amartya4256 amartya4256 force-pushed the amartay4256/fix_edge_getWebViewWrapper_deadlock branch from 0657aba to 5c1e180 Compare March 31, 2025 09:17
@HeikoKlare HeikoKlare force-pushed the amartay4256/fix_edge_getWebViewWrapper_deadlock branch from 5c1e180 to 3302644 Compare March 31, 2025 09:26
Copy link
Contributor

@HeikoKlare HeikoKlare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please sketch a flow that led to a deadlock before and is prevented by this change? I have to admit that though we quickly talked about it, I cannot recap the scenario.

There are also some mistaked in the commit message (in particular waitForLastPEndingLast is no used qualifier`). Could you please fix them?

@amartya4256
Copy link
Contributor Author

amartya4256 commented Mar 31, 2025

[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.581 D2279630 #2149048 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" [TestRunner @ 2025-03-14T22:36:05.5804878] almost reached timeout of 40M
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.588 D2279637 #000007 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" vi.test.runtime.TimeoutTimer$ThreadDump: [TestRunner] thread dump for thread: main
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.588 D2279637 #000000 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at java.base@21.0.3/jdk.internal.misc.Unsafe.park(Native Method)
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.589 D2279638 #000001 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at java.base@21.0.3/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.589 D2279638 #000000 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at java.base@21.0.3/java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1864)
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.589 D2279638 #000000 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at java.base@21.0.3/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.589 D2279638 #000000 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at java.base@21.0.3/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.589 D2279638 #000000 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at java.base@21.0.3/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1898)
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.590 D2279639 #000001 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at java.base@21.0.3/java.util.concurrent.CompletableFuture.join(CompletableFuture.java:2117)
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.590 D2279639 #000000 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at org.eclipse.swt.browser.Edge$WebViewProvider.getWebViewWrapper(Edge.java:415)
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.590 D2279639 #000000 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at org.eclipse.swt.browser.Edge$WebViewProvider.getWebView(Edge.java:427)
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.590 D2279639 #000000 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at org.eclipse.swt.browser.Edge.getUrl(Edge.java:954)
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.590 D2279639 #000000 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at org.eclipse.swt.browser.Browser.getUrl(Browser.java:771)
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.591 D2279640 #000001 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at org.eclipse.ui.internal.intro.impl.presentations.BrowserIntroPartImplementation.saveCurrentPage(BrowserIntroPartImplementation.java:645)
[2025-03-14T21:36:08.996Z]      [java] [2025/03/14 22:36:05.591 D2279640 #000000 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at org.eclipse.ui.internal.intro.impl.model.AbstractIntroPartImplementation.saveState(AbstractIntroPartImplementation.java:302)
[2025-03-14T21:36:08.997Z]      [java] [2025/03/14 22:36:05.591 D2279640 #000000 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at org.eclipse.ui.internal.intro.impl.model.IntroPartPresentation.saveState(IntroPartPresentation.java:427)
[2025-03-14T21:36:08.997Z]      [java] [2025/03/14 22:36:05.591 D2279640 #000000 Heap:10240 2856 2005 Meta:733] "TestRunner timer,5,main" 	at org.eclipse.ui.intro.config.CustomizableIntroPart.saveState(CustomizableIntroPart.java:405)

This was the stacktrace of the issue and as you can see, it gets stuck at org.eclipse.swt.browser.Edge$WebViewProvider.getWebViewWrapper(Edge.java:415), which is:

        private WebViewWrapper getWebViewWrapper(boolean waitForPendingWebviewTasksToFinish) {
		if(waitForPendingWebviewTasksToFinish) {
			processOSMessagesUntil(lastWebViewTask::isDone, exception -> {
				lastWebViewTask.completeExceptionally(exception);
				throw exception;
			}, browser.getDisplay());
		}
		return webViewWrapperFuture.join();
	}

at the line return webViewWrapperFuture.join();

while we know the future chain is something like webViewWrapperFuture ->some previously scheduled future -> lastWebViewTask, in this method we only check for lastWebViewTask to complete. Which should complete only after webViewWrapperFuture completes. However, in the block:

                       processOSMessagesUntil(lastWebViewTask::isDone, exception -> {
				lastWebViewTask.completeExceptionally(exception);
				throw exception;
			}, browser.getDisplay());

We see it can be completed exceptionally on timeout irrespectve of if webViewWrapperFuture has completed. In this case, we are prone to a deadlock or atleast with my investigation that was the only reasoning I could do for this stacktrace. Also if waitForPendingWebviewTasksToFinish is set to false then it directly tries to access the webViewWrapperFuture which can lead to a deadlock as well.

Conclusion: We should definitely wait for webViewWrapperFuture to complete before performing anything.

This commit makes sure while calling Edge::getWebViewWrapper with
waitForPendingWebviewTasksToFinish set to true that the method also checks for the
completion of the webViewWrapperFuture before calling a join() on it,
which can some time lead to deadlock as we allow forced execptional
completion of lastWebViewTask on timeout.

contributes to eclipse-platform#1466
@amartya4256 amartya4256 force-pushed the amartay4256/fix_edge_getWebViewWrapper_deadlock branch from 3302644 to 5eb3ba2 Compare March 31, 2025 12:07
Copy link
Contributor

@fedejeanne fedejeanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I see no risk with merging this PR.

Follow-up work

We should find a way to make it known that calling webViewWrapperFuture.join() is a bad idea and that one should use getWebViewWrapper() instead. But that's for a follow-up PR since it's an internal class and not part of the API (only us, SWT developers, are affected by this).

@fedejeanne
Copy link
Contributor

Can you please sketch a flow that led to a deadlock before and is prevented by this change? I have to admit that though we quickly talked about it, I cannot recap the scenario.

@HeikoKlare has this been addressed by #1966 (comment) ?

@fedejeanne
Copy link
Contributor

Can you please sketch a flow that led to a deadlock before and is prevented by this change? I have to admit that though we quickly talked about it, I cannot recap the scenario.

@HeikoKlare has this been addressed by #1966 (comment) ?

I just say your answer from yesterday in Slack. Merging.

@fedejeanne fedejeanne merged commit 74c39bf into eclipse-platform:master Apr 2, 2025
15 checks passed
@fedejeanne fedejeanne deleted the amartay4256/fix_edge_getWebViewWrapper_deadlock branch April 2, 2025 08:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Edge browser sporadically blocks UI thread
3 participants