Add macOS ARM64 deferred stack tracing support for crossplay #827

Open
jimhuds wants to merge 1 commit into rwmt:dev from jimhuds:arm64-stack-tracing

@jimhuds jimhuds commented Mar 8, 2026

Summary

  • Add macOS ARM64 (Apple Silicon) deferred stack tracing support for crossplay.
  • Implement an ARM64 FP-chain walker (X29) alongside the existing x64 RBP+LMF path.
  • Make trace comparison architecture-independent to prevent false desync positives between ARM64 and x64 clients.
  • Keep existing x64 behavior unchanged.

Approach

On macOS ARM64, stack walking follows the X29 frame-pointer chain rather than the x64 RBP+LMF path. The main cross-platform issue was the trace comparison hash: trace-derived values such as stack depth and method hashes vary between architectures, so the comparison hash now uses thingId + tick + rngState instead, which are architecture-independent.
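A minimal sketch of the idea, not the PR's actual code: the comparison hash is built only from fields that both architectures compute identically, while a fuller hash that also folds in trace-derived values is kept for diagnostics. The `TraceItem` type, its field names, and the `Combine` helper are assumptions made for illustration; the PR itself uses `Gen.HashCombineInt`.

```csharp
using System;

// Hypothetical stand-in for one recorded trace entry; field names mirror the PR text.
public struct TraceItem
{
    public int ThingId;
    public int Tick;
    public ulong RngState;
    public int Depth;      // architecture-dependent: FP-chain and RBP+LMF walks see different depths
    public int TraceHash;  // architecture-dependent: derived from the walked frames
}

public static class TraceHashing
{
    // Illustrative combiner only; the PR itself calls Gen.HashCombineInt.
    private static int Combine(int seed, int value) => unchecked(seed * 31 + value);

    // Uses only architecture-independent inputs, so ARM64 and x64 clients agree.
    public static int ComparisonHash(TraceItem item)
    {
        unchecked
        {
            int h = Combine(17, item.ThingId);
            h = Combine(h, item.Tick);
            h = Combine(h, (int)(item.RngState >> 32)); // high 32 bits
            return Combine(h, (int)item.RngState);      // low 32 bits
        }
    }

    // Also folds in trace-derived values; intended for display, not cross-client comparison.
    public static int DiagnosticHash(TraceItem item) =>
        Combine(Combine(ComparisonHash(item), item.Depth), item.TraceHash);
}
```

With this split, two clients that agree on thingId, tick, and RNG state produce the same comparison hash even if their stack walks differ in depth, while the diagnostic hash still reflects the trace each client saw.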

Limitations

  • macOS ARM64 only (matching RimWorld's current ARM64 support matrix).
  • ARM64 tracing currently stops at unmanaged frames rather than following x64's LMF chain, so traces can be shorter.
  • The architecture-independent comparison hash is slightly less sensitive than the old architecture-specific trace hash to divergence within the same window that does not affect RNG state. Primary desync detection through map/world/command RNG state checks is unchanged.

Testing

Tested over ~2 weeks of regular gameplay on Apple Silicon macOS (M4 Pro, macOS 26.3.1) in crossplay with a Windows x64 client. This eliminated the cross-architecture trace-hash false positives we were hitting and sessions were stable throughout.

@jimhuds jimhuds marked this pull request as ready for review March 8, 2026 16:43
Add ARM64 frame-pointer-based stack walking for desync tracing, enabling
Mac ARM64 users to play with x64 Windows/Linux players.

ARM64 implementation:
- GetFp() for frame pointer detection (equivalent to x64 GetRbp)
- TraceImplArm64() using FP chain walking (simpler than x64 since ARM64
  ABI always uses frame pointers)
- InitArm64Offset() with ARM64 instruction pattern detection (STR/STUR
  with X29 base register)
- LmfPtr = -1 signals ARM64 FP-only mode; all ARM64 code paths are gated
  behind NativeArch.ARM64 checks (inert on x64)

Cross-platform desync detection fix:
- The trace comparison hash previously included architecture-dependent
  values (hashIn, depth) which differ between ARM64 FP-chain walking
  and x64 RBP+LMF walking, causing false desync detection
- Comparison hash now uses only thingId, tick, and rng state (architecture-independent)
- Full hash including trace info is still stored for diagnostic display

Tested with ~2 weeks of ARM64 Mac <-> x64 Windows crossplay.
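
As an illustration of the frame-pointer walk described above, here is a minimal sketch, not the PR's TraceImplArm64. It assumes the standard AArch64 frame-record layout, where the slot at FP holds the caller's FP and the slot at FP + 8 holds the return address. Memory is simulated with a dictionary so the sketch runs anywhere; the `FpChainSketch` class, the `Walk` method, and its parameters are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class FpChainSketch
{
    // Walks a chain of frame records: memory[fp] = caller's fp, memory[fp + 8] = return address.
    // `memory` stands in for process memory; real code would dereference raw pointers instead.
    public static List<long> Walk(IReadOnlyDictionary<long, long> memory, long fp,
                                  int skipFrames, int maxFrames)
    {
        var trace = new List<long>();
        int depth = 0;

        while (fp != 0 && trace.Count < maxFrames)
        {
            // Stop if the frame record cannot be read.
            if (!memory.TryGetValue(fp, out long callerFp) ||
                !memory.TryGetValue(fp + 8, out long returnAddress))
                break;

            // Skip the innermost frames (for example, the tracing helpers themselves).
            if (depth >= skipFrames)
                trace.Add(returnAddress);
            depth++;

            // The stack grows downward, so a caller's frame sits at a higher address.
            // Anything else suggests a corrupted or cyclic chain.
            if (callerFp != 0 && callerFp <= fp)
                break;

            fp = callerFp;
        }

        return trace;
    }
}
```

The explicit `maxFrames` bound and the monotonic-address check are defensive choices for this sketch so that a damaged chain cannot loop forever; the PR's own loop bounds the walk with a depth limit instead.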
@jimhuds jimhuds force-pushed the arm64-stack-tracing branch from e1dce17 to 47cc7c2 Compare March 8, 2026 17:11
@notfood notfood added the enhancement New feature or request. label Mar 9, 2026
@notfood notfood moved this to In review in 1.6 and Odyssey Mar 9, 2026

@mibac138 mibac138 left a comment


I'm happy to see that you are working on this; having native support is quite important. While the general direction looks good, I have a few questions about some details.

return;

// ARM64 macOS doesn't use LMF - signal FP-only mode to DeferredStackTracingImpl
if (CurrentArch == NativeArch.ARM64 && os == NativeOS.OSX)

Is there any particular limitation of the current implementation that makes it only support macOS? Or is it just that it's untested on other OSes?

// implementations, causing false desync detection in cross-platform play.
// thingId and tick recover some non-RNG divergence sensitivity; rngState covers
// the common case.
var comparisonHash = Gen.HashCombineInt(item.thingId, item.tick, (int)(item.rngState >> 32), (int)item.rngState);

I'm not convinced that this is sufficient. Admittedly, I'm not sure how many desyncs are caused by trace mismatches, but I'm wary that this change could delay or mask desyncs and make issues harder to diagnose. Why would the traces differ? Is it because native frames are not followed, or is there something else?

int depth = 0;
int index = 0;

while (fp != 0 && depth < traceIn.Length + skipFrames + 100)

What's the reasoning behind this condition depth < traceIn.Length + skipFrames + 100?

ref var info = ref hashTable.GetOrCreateAddrInfo(ret);
if (info.addr == 0) UpdateNewElementArm64(ref info, ret);

// Stop at unmanaged frames - no LMF chain walking on ARM64

Why is there no LMF chain walking? Does Unity not support it?

