rsqrt as builtin #19302

Closed
expikr opened this issue Mar 14, 2024 · 6 comments
Comments

@expikr
Contributor

expikr commented Mar 14, 2024

for the hardware instructions where available

@nektro
Contributor

nektro commented Mar 14, 2024

do you have docs on what this function is?

@nektro
Contributor

nektro commented Mar 14, 2024

1 / @sqrt(x) should compile to what you seek

@expikr
Contributor Author

expikr commented Mar 14, 2024

does it always, for vectors as well?

Code

export fn rsqrt(num: f32) f32 {
    return 1/@sqrt(num);
}

Godbolt default

# Compilation provided by Compiler Explorer at https://godbolt.org/
.LCPI0_0:
        .long   0x3f800000
rsqrt:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 4
        vmovaps xmm1, xmm0
        vmovss  dword ptr [rbp - 4], xmm1
        vsqrtss xmm1, xmm0, xmm1
        vmovss  xmm0, dword ptr [rip + .LCPI0_0]
        vdivss  xmm0, xmm0, xmm1
        add     rsp, 4
        pop     rbp
        ret

Godbolt -O ReleaseFast

# Compilation provided by Compiler Explorer at https://godbolt.org/
.LCPI0_0:
        .long   0x3f800000
rsqrt:
        vsqrtss xmm0, xmm0, xmm0
        vmovss  xmm1, dword ptr [rip + .LCPI0_0]
        vdivss  xmm0, xmm1, xmm0
        ret
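
For the vector half of the question, a minimal sketch of the same expression over a 4-lane vector might look like the following. `rsqrtVec4` is a hypothetical name, not something from this thread, and the snippet assumes a Zig version where `@splat` infers its result type from the destination (0.11 or later). It makes no claim about which instructions are emitted; that would still need checking on Godbolt per target and optimization mode.

```zig
/// Hypothetical helper (not from the thread): element-wise 1/sqrt
/// over a 4-lane f32 vector.
pub fn rsqrtVec4(v: @Vector(4, f32)) @Vector(4, f32) {
    // @splat takes its result type from the annotated destination.
    const one: @Vector(4, f32) = @splat(1.0);
    // @sqrt and `/` both operate element-wise on float vectors.
    return one / @sqrt(v);
}
```

Semantically this is just the scalar `1/@sqrt(num)` applied per lane, so in the default float mode one would expect a full-precision square root and division rather than an approximate reciprocal square root.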

@nektro
Contributor

nektro commented Mar 14, 2024

at a glance it looks like it should be using (V)RSQRTPS instead

@castholm
Contributor

Somewhat related: #767
At a quick glance there doesn't appear to be an LLVM intrinsic for rsqrt, but I don't think that's necessarily an argument against adding @rsqrt, and I can see such a builtin being useful in certain cases.

> 1 / @sqrt(x) should compile to what you seek

Unless @setFloatMode(.optimized) or equivalent compiler flags are in effect, I would hope that 1 / @sqrt(x) is never compiled to a reciprocal square root instruction. The reasons are the same ones that make it undesirable for a * b + c to silently compile to a fused multiply-add, or for a division to compile to a multiplication by a reciprocal: each of these changes the semantics of the expression.
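
As a rough illustration of that opt-in, here is a sketch of the scalar function from earlier in the thread with the relaxed float mode enabled for its scope. `rsqrt_fast` is a hypothetical name, and the `.optimized` spelling follows the comment above (older Zig releases spelled the enum tag differently). Whether the backend then actually picks an approximate reciprocal square root instruction is an assumption that depends on LLVM and the target; it has not been verified here.

```zig
/// Hypothetical variant of the thread's `rsqrt`: same expression, but the
/// enclosing scope opts into relaxed floating-point semantics.
export fn rsqrt_fast(num: f32) f32 {
    // Allows the optimizer to use transformations that are not strictly
    // IEEE-conformant within this scope. The result may differ in the
    // last bits from the strict version.
    @setFloatMode(.optimized);
    return 1 / @sqrt(num);
}
```

The point of keeping this explicit is that callers who need the exact `sqrt` followed by `div` result keep it by default, and only code that asks for the relaxed mode can see a different answer.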

@expikr
Copy link
Contributor Author

expikr commented Mar 29, 2024

Continue tracking in #767 (comment)

expikr closed this as not planned on Mar 29, 2024