[aarch64] Is a bug for selecting register? #121241

Closed
Zhenhang1213 opened this issue Dec 28, 2024 · 4 comments

Labels
backend:AArch64 question A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead!

Comments

@Zhenhang1213
Contributor

demo:
https://godbolt.org/z/E5zqec1xj

I think this is wrong: x8 should be selected instead of w8, as GCC does. The IR itself looks correct to me.

@llvmbot
Member

llvmbot commented Dec 28, 2024

@llvm/issue-subscribers-backend-aarch64

Author: Austin (Zhenhang1213)

@pinskia

pinskia commented Dec 28, 2024

The difference is:
GCC produces:

        mov     x0, 8     // tmp101,
        movk    x0, 0xff00, lsl 16      // tmp101,,

While LLVM produces:

        mov     w8, #8
        movk    w8, #65280, lsl #16

The end result is exactly the same, just with slightly different instructions.
Both produce 0xFF000000ull+8 in the register; Clang just uses the w view of the register while GCC uses the x view. Both movk forms execute the same on all cores I know of (the "extra" zero-extend in the LLVM version is free on all cores I know of, including the older in-order cores). The cores that have mov/movk optimizations handle both cases too. So nothing is wrong with either implementation of the constant generation here, as far as I can tell.
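
To make the equivalence concrete, here is a minimal Python sketch that models the two instruction sequences on a 64-bit register. It is only an illustration under simplifying assumptions: the helper names `mov_imm` and `movk` are hypothetical, and the model covers just the immediate forms used in this thread, not the full A64 encoding rules.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def mov_imm(imm16):
    """Model `mov Wd/Xd, #imm16`: the full 64-bit register becomes imm16."""
    assert 0 <= imm16 <= 0xFFFF
    return imm16


def movk(reg, imm16, shift, wide):
    """Model `movk`: insert imm16 at bit `shift` and keep the other bits.

    wide=True is the x (64-bit) view, wide=False is the w (32-bit) view.
    A write through the w view leaves bits 63:32 of the register zero.
    """
    assert 0 <= imm16 <= 0xFFFF
    assert shift in ((0, 16, 32, 48) if wide else (0, 16))
    value = (reg & ~(0xFFFF << shift)) | (imm16 << shift)
    return value & (MASK64 if wide else MASK32)


# LLVM: mov w8, #8 ; movk w8, #65280, lsl #16
llvm_x8 = movk(mov_imm(8), 65280, 16, wide=False)

# GCC: mov x0, 8 ; movk x0, 0xff00, lsl 16
gcc_x0 = movk(mov_imm(8), 0xFF00, 16, wide=True)

# Both registers hold 0xFF000008 as a full 64-bit value.
print(f"LLVM x8 = {llvm_x8:#x}")
print(f"GCC  x0 = {gcc_x0:#x}")
```

In this model the only difference between the two views is the final mask, and that mask has no effect here because the preceding mov already left the upper half zero.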

@Zhenhang1213
Contributor Author

OK. Do you mean that movk on the w register also sets the upper 32 bits of the corresponding x register to zero? I made a new demo: https://godbolt.org/z/n1ceqccfj. If it does zero them, I think there is no problem.

testfunc:
        adrp    x8, g_SpiFlashCtr+80
        add     x8, x8, :lo12:g_SpiFlashCtr+80
        ldr     x9, [x8]
        cbz     x9, .LBB1_3
        ldr     w8, [x8, #8]
        cbnz    w8, .LBB1_3
        ret
.LBB1_3:
        mov     w8, #8
        mov     w10, #8
        movk    w8, #65280, lsl #16
        mov     w9, #4627
        movk    w10, #255, lsl #16
        str     w9, [x8]
        str     w9, [x10]

@pinskia

pinskia commented Dec 28, 2024

        mov     w8, #8 // x8 = 0x8
        mov     w10, #8 // x10 = 0x8
// 0xff00 == 65280
        movk    w8, #65280, lsl #16 // x8 = ((x8 & ~(0xFFFFul<<16)) | (65280ul<<16)) & 0xFFFFFFFF (the & 0xFFFFFFFF is due to using the w view)
// so x8 = (0xff00ul<<16) | 0x8, aka 0xFF000000+8
        mov     w9, #4627
        movk    w10, #255, lsl #16

GCC:

        mov     x1, 8 // x1 = 0x8
        mov     w0, 4627
        movk    x1, 0xff00, lsl 16 // x1 = (x1 & ~(0xFFFFul<<16)) | (0xff00ul<<16)
// so x1 = (0xff00ul<<16) | 0x8, aka 0xFF000000ul+8

Exactly the same result, just slightly different instructions.
The movk is an insert in both cases and does not sign-extend. The one using the w view always zeroes the upper 32 bits, but in this case they were already zero, so it does not change anything.
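
The point about the upper 32 bits can be sketched in Python as well. This is a simplified model, not an emulator: `movk_w` and `movk_x` are hypothetical helpers, and the `stale` starting value is made up purely to show what would happen if the upper half were not already zero.

```python
def movk_w(x_reg, imm16, shift):
    """Model `movk Wd, #imm16, lsl #shift` acting on the 64-bit register x_reg.

    Only bits 31:0 are read; the 16-bit field is inserted (no sign extension)
    and the write through the w view leaves bits 63:32 zero.
    """
    assert 0 <= imm16 <= 0xFFFF and shift in (0, 16)
    w = x_reg & 0xFFFFFFFF
    w = (w & ~(0xFFFF << shift)) | (imm16 << shift)
    return w & 0xFFFFFFFF


def movk_x(x_reg, imm16, shift):
    """Model `movk Xd, #imm16, lsl #shift`: insert the field, keep all other bits."""
    assert 0 <= imm16 <= 0xFFFF and shift in (0, 16, 32, 48)
    return ((x_reg & ~(0xFFFF << shift)) | (imm16 << shift)) & 0xFFFFFFFFFFFFFFFF


# Case from this thread: the preceding mov left the register equal to 8.
clean = 8
print(hex(movk_w(clean, 0xFF00, 16)), hex(movk_x(clean, 0xFF00, 16)))

# Hypothetical case: upper half is non-zero before the movk.
stale = 0xDEADBEEF_00000008
after_w = movk_w(stale, 0xFF00, 16)  # upper 32 bits cleared
after_x = movk_x(stale, 0xFF00, 16)  # upper 32 bits preserved
print(hex(after_w), hex(after_x))

# Second constant from the demo: mov w10, #8 ; movk w10, #255, lsl #16
x10 = movk_w(8, 255, 16)
print(hex(x10))
```

The two views only diverge in the hypothetical `stale` case. With the register freshly set by mov, as in both compilers' output, the w and x forms give identical 64-bit values, and bit 31 being set in 0xFF000008 does not propagate into the upper half.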

@hstk30-hw hstk30-hw added the question A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead! label Dec 28, 2024