-
-
Notifications
You must be signed in to change notification settings - Fork 366
ref: Use RunLoop observer for watchdog detection #6237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
db41d1f
to
c7cfc40
Compare
❌ 2 Tests Failed:
View the top 3 failed test(s) by shortest run time
View the full list of 1 ❄️ flaky test(s)
To view more test analytics, go to the Test Analytics Dashboard |
29cd58b
to
418cf15
Compare
Performance metrics 🚀
|
Revision | Plain | With Sentry | Diff |
---|---|---|---|
1fecbb8 | 1242.78 ms | 1265.40 ms | 22.62 ms |
d05d866 | 1211.78 ms | 1230.96 ms | 19.18 ms |
32e7197 | 1226.91 ms | 1245.48 ms | 18.56 ms |
0e9c5ae | 1226.10 ms | 1254.14 ms | 28.04 ms |
397b9c9 | 1230.23 ms | 1249.29 ms | 19.06 ms |
718c372 | 1220.09 ms | 1235.15 ms | 15.06 ms |
45482a6 | 1225.88 ms | 1254.27 ms | 28.39 ms |
f97a070 | 1218.88 ms | 1253.12 ms | 34.24 ms |
5846fc5 | 1223.25 ms | 1248.79 ms | 25.54 ms |
43597ba | 1214.88 ms | 1243.52 ms | 28.65 ms |
App size
Revision | Plain | With Sentry | Diff |
---|---|---|---|
1fecbb8 | 23.75 KiB | 969.28 KiB | 945.53 KiB |
d05d866 | 23.75 KiB | 878.60 KiB | 854.85 KiB |
32e7197 | 23.75 KiB | 866.69 KiB | 842.94 KiB |
0e9c5ae | 23.75 KiB | 969.29 KiB | 945.54 KiB |
397b9c9 | 23.75 KiB | 959.44 KiB | 935.70 KiB |
718c372 | 23.75 KiB | 920.65 KiB | 896.90 KiB |
45482a6 | 23.75 KiB | 919.91 KiB | 896.16 KiB |
f97a070 | 23.75 KiB | 858.68 KiB | 834.93 KiB |
5846fc5 | 23.75 KiB | 912.77 KiB | 889.02 KiB |
43597ba | 23.75 KiB | 880.32 KiB | 856.58 KiB |
Previous results on branch: hangTrackerAbstraction
Startup times
Revision | Plain | With Sentry | Diff |
---|---|---|---|
ac854e2 | 1220.81 ms | 1250.20 ms | 29.39 ms |
1e7ad41 | 1245.79 ms | 1280.67 ms | 34.88 ms |
547ce2a | 1226.90 ms | 1257.82 ms | 30.92 ms |
fee2eea | 1225.57 ms | 1243.82 ms | 18.25 ms |
App size
Revision | Plain | With Sentry | Diff |
---|---|---|---|
ac854e2 | 23.75 KiB | 995.72 KiB | 971.97 KiB |
1e7ad41 | 23.75 KiB | 994.33 KiB | 970.58 KiB |
547ce2a | 23.75 KiB | 995.83 KiB | 972.08 KiB |
fee2eea | 23.75 KiB | 996.53 KiB | 972.78 KiB |
d2065b3
to
ec5403e
Compare
ec5403e
to
915a740
Compare
func removeLateRunLoopObserver(id: UUID) { | ||
queue.async { [weak self] in | ||
guard let self else { return } | ||
lateRunLoop.removeValue(forKey: id) | ||
if lateRunLoop.isEmpty { | ||
DispatchQueue.main.async { [weak self] in | ||
if self?.finishedRunLoop.isEmpty ?? false { | ||
self?.stop() | ||
} | ||
} | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential bug: Inconsistent threading in observer management creates race conditions. addLateRunLoopObserver
uses async dispatch while addFinishedRunLoopObserver
is synchronous, leading to potential lifecycle issues.
-
Description: The
HangTracker
class exhibits inconsistent threading patterns for managing observers. TheaddLateRunLoopObserver
method uses a background queue and then dispatches back to the main queue, whileaddFinishedRunLoopObserver
modifies state directly on the main thread. This inconsistency, coupled with complex removal logic involving cross-queue state checks (e.g.,lateRunLoop.isEmpty
,finishedRunLoop.isEmpty
), creates race conditions. These races can lead to the runloop observer not being stopped when all observers are removed, causing resource leaks, or being stopped prematurely, which would disable hang detection. -
Suggested fix: Ensure all observer management operations are synchronized. Either perform all modifications to
lateRunLoop
andfinishedRunLoop
on the main thread directly, or consistently dispatch all add/remove operations to the privatequeue
. This will eliminate race conditions caused by cross-queue state checks and inconsistent execution contexts.
severity: 0.7, confidence: 0.9
Did we get this right? 👍 / 👎 to inform future reviews.
} | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Race Conditions in HangTracker's Thread Safety
Race conditions exist in HangTracker
due to inconsistent thread safety. add/removeFinishedRunLoopObserver
access finishedRunLoop
directly, which is unsafe as the dictionary is also accessed on the main thread. Additionally, the remove
methods' logic for stopping the tracker involves multiple asynchronous dispatches to check finishedRunLoop
and lateRunLoop
emptiness, creating a race that can lead to incorrect start/stop decisions.
This introduces a new
HangTracker
class that observers the run loop and notifies anything registered with it when the runloop is late. The idea is to eventually replace all uses of CADisplayLink with this. They are not exactly the same, display link works on vsync events and a runloop observer hooks directly into the runloop lifecycle.In this PR I use the new HangTracker in the watchdog integration. This seemed like a pretty small use of display links and was a good opportunity to have a first crack at removing them. I also don't think this change requires any kind of new opt-in options, since it's all internal to how the integration works.
The SDK default behavior before this change is to enable the watchdog integration, which means by default the SDK spins the CPU in the background constantly waking up to process these vsync events. Since we are in the final stretch of quality quarter I wanted to get rid of this behavior and replace it with a passive observer, like this.
A few interesting implementation details:
I also manually tested in the swift sample app by making sure when the debugger wasn't attached watchdog events still get sent when I force-quite the app with Xcode. Also verified that if I simulate a hang with the sample app and then force quit while the hang is in progress, watchdog events are not sent.
Closes #6252