Skip to content

Implement @prefetch() #10295

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Dec 11, 2021
Merged

Implement @prefetch() #10295

merged 4 commits into from
Dec 11, 2021

Conversation

ifreund
Copy link
Member

@ifreund ifreund commented Dec 7, 2021

Implements #3600

I've made two small style changes to the proposed interface:

  1. I went with std.builtin.PrefetchOptions instead of std.builtin.Prefetch to match CallOptions, ExportOptions, etc.
  2. I called the third field cache instead of type as I think it is more clear. However we may actually want to remove this option outright for the time being...

It seems that passing 0 as the third argument to llvm's prefetch builtin to indicate that the prefetch should be performed on the instruction cache causes LLVM to error out with something like this:

LLVM ERROR: Cannot select: 0x562197ba83f8: ch = Prefetch<(load (s8) from %ir.9)> 0x56219b37a0b8, 0x562197ba8598, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, foo.zig:28:9
  0x562197ba8598: i64 = add 0x562197ba87a0, 0x562197ba8600, foo.zig:28:26
    0x562197ba87a0: i64,ch = CopyFromReg 0x56219b37a0b8, Register:i64 %5, foo.zig:28:26
      0x562197ba8870: i64 = Register %5
    0x562197ba8600: i64 = shl 0x562197ba8738, Constant:i8<2>, foo.zig:28:26
      0x562197ba8738: i64 = srl 0x562197ba8390, Constant:i8<2>, foo.zig:28:32
        0x562197ba8390: i64,ch = CopyFromReg 0x56219b37a0b8, Register:i64 %1, foo.zig:25:29
          0x562197ba8460: i64 = Register %1
        0x562197ba8328: i8 = Constant<2>
      0x562197ba8328: i8 = Constant<2>
  0x562197ba82c0: i32 = Constant<0>
  0x562197ba82c0: i32 = Constant<0>
  0x562197ba82c0: i32 = Constant<0>
In function: binary_search
zsh: abort      zig build-lib /tmp/foo.zig -OReleaseFast

I'm not sure what exactly this error means, but it seems to indicate lack of support for prefetch using the instruction cache. Note that gcc and clang don't expose this as an option in their __builtin_prefetch(): they only support prefetching for the data cache. I therefore propose that we remove the cache field from our prefetch options for now, we can always add it back in the future if llvm gains better support for it and someone has a use-case.

@ifreund
Copy link
Member Author

ifreund commented Dec 7, 2021

It seems that LLVM is able to compile the prefetch intrinsic targeting the instruction cache for some architectures, including aarch64 and powerpc64. The fact that it fails to do so when targeting x86 seems to be an LLVM bug which was reported in 2014: https://bugs.llvm.org/show_bug.cgi?id=21037

The plan for this PR is to keep the cache option and skip emitting the llvm.prefetch intrinsic when targeting x86 and any other targets that LLVM happens to fail on.

@ifreund ifreund force-pushed the prefetch branch 2 times, most recently from 07ee8e9 to 61e266e Compare December 10, 2021 22:06
This enables automatic formatting for a significant amount of code that
currently doesn't deviate from the standard zig fmt enforced style.
@ifreund ifreund force-pushed the prefetch branch 2 times, most recently from 1998ec1 to be02ee7 Compare December 10, 2021 22:28
@ifreund ifreund marked this pull request as ready for review December 10, 2021 22:28
@ifreund
Copy link
Member Author

ifreund commented Dec 10, 2021

This should be ready for review now. I've added a behavior test and documentation for @prefetch() and implemented a workaround for the LLVM bug described above.

@ifreund
Copy link
Member Author

ifreund commented Dec 10, 2021

Welp, our wonderful CI matrix and the exhaustive behavior test I just added have revealed another case of this LLVM bug if a write prefetch is made targeting the instruction cache for the following target and mcpu:

-target arm-linux-none -mcpu=generic+has_v7+mp

I'll add this case to our workaround...

@andrewrk andrewrk merged commit 97c0373 into ziglang:master Dec 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants