-
Notifications
You must be signed in to change notification settings - Fork 6
Fix ARM64 Virtual Thread currentCarrierThread Intrinsic Bug #25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: macarte/rebased-win-arrch64-fixes
Are you sure you want to change the base?
Changes from 11 commits
d54a0cf
b8d5fca
9ac65c9
73d39aa
e898b3e
378cdf1
01c7ef5
368d041
4acd3a3
dba3bac
80ff85a
2baed3c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3560,6 +3560,14 @@ const char* GraphBuilder::check_can_parse(ciMethod* callee) const { | |
| const char* GraphBuilder::should_not_inline(ciMethod* callee) const { | ||
| if ( compilation()->directive()->should_not_inline(callee)) return "disallowed by CompileCommand"; | ||
| if ( callee->dont_inline()) return "don't inline by annotation"; | ||
|
|
||
| // Don't inline a method that changes Thread.currentThread() except | ||
| // into another method that is annotated @ChangesCurrentThread. | ||
| if (callee->changes_current_thread() | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given that we're fixing atomics handling on Windows AArch64, won't we need additional justification for why we're changing code that other platforms are using given that they haven't run into any of these concurrency issues?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [CoPilot] This should not be guarded with an architecture check. Here's why: The problem this solves is a compiler correctness issue, not a hardware barrier issue. If C1 inlines a @ChangesCurrentThread method (like Continuation.yield() or carrier-thread switching code) into a caller that isn't annotated @ChangesCurrentThread, the compiler can cache the result of Thread.currentCarrierThread() across the inlined call — observing the old carrier thread after the virtual thread has migrated to a new one. This is broken on all architectures, not just ARM64:
The check is also very low cost — it only runs during C1 inlining decisions, not at runtime. And @ChangesCurrentThread is used on a tiny number of methods, so this almost never triggers. Bottom line: This is a correct-by-construction inlining guard that prevents a real compiler bug on all platforms. No architecture guard needed. The existing code is right as-is. |
||
| && !compilation()->method()->changes_current_thread()) { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For this specific change, for example, wouldn't it be a bug for an inlined method that changes threads to not read the correct state given that all its registers are correct and the load/acquire semantics have been fixed?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [CoPilot] Good question. The concern is specifically about compiler value caching, not memory ordering. When C1 inlines a @ChangesCurrentThread method into a non-@ChangesCurrentThread caller, the compiler can CSE (common subexpression elimination) or hoist the currentCarrierThread intrinsic across the carrier-change boundary. The result: the caller reuses a register-held value from before the yield/mount/unmount and never re-reads it. The MO_ACQUIRE fix addresses a different problem — it ensures that when you do load currentCarrierThread, subsequent dependent loads (e.g., Thread.cont) see consistent data. But it can't force a re-load that the compiler optimized away entirely. Consider this concrete scenario after inlining: The MO_ACQUIRE on the LDAR is irrelevant here because there is no second LDAR — C1 reused R0. And the "correct registers" after thaw are correct in the sense that they're faithfully restored from freeze — but they contain the pre-yield values, which is exactly the stale data. The inlining guard ensures @ChangesCurrentThread methods stay as real calls, which forces C1 to treat currentCarrierThread as potentially invalidated across the call boundary and re-read it. So: MO_ACQUIRE = correct ordering on loads that happen. Inlining guard = ensuring loads actually happen. They're complementary. |
||
| return "method changes current thread"; | ||
| } | ||
|
|
||
| return nullptr; | ||
| } | ||
|
|
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using the same style for these flags as the rest of the make files
-volatile:ms