-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Winch: Implement rem for aarch64 #9781
Winch: Implement rem for aarch64 #9781
Conversation
winch/codegen/src/isa/aarch64/asm.rs
Outdated
// we therefore sign-extend the operand. | ||
// see: https://github.com/bytecodealliance/wasmtime/issues/9766 | ||
// | ||
// FIXME: maybe this is only necessary for signed remainder? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@saulecabrera I'm not sure about that, but if we are performing an unsigned remainder operation, then do we really care about sign extending? Maybe it's not necessary for division either? Is there a scenario where we can't just blindly consider the w
register as a x
register?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same reasonning applies to div, I think. We could shave a few instructions in the case of unsigned division too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In practice, nothing is stopping us from passing a w
register here, however given that we're dealing with the 64-bit instruction variant, it will take into account the full 64-bits not just the lower 32. In case that the high 32 bits are not zeroed out, this could lead to unexpected behavior. So we need to extend in both cases.
Subscribe to Label Action
This issue or pull request has been labeled: "winch"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. Left some comments inline.
As a general comment, given that the same invariant as division apply here, could we modify the existing div_rrr
so that it can be re-used from rem_rrr
?
winch/codegen/src/isa/aarch64/asm.rs
Outdated
// we therefore sign-extend the operand. | ||
// see: https://github.com/bytecodealliance/wasmtime/issues/9766 | ||
// | ||
// FIXME: maybe this is only necessary for signed remainder? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In practice, nothing is stopping us from passing a w
register here, however given that we're dealing with the 64-bit instruction variant, it will take into account the full 64-bits not just the lower 32. In case that the high 32 bits are not zeroed out, this could lead to unexpected behavior. So we need to extend in both cases.
winch/codegen/src/isa/aarch64/asm.rs
Outdated
self.emit(Inst::Extend { | ||
rd: writable!(divisor.into()), | ||
rn: divisor.into(), | ||
signed: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
winch/codegen/src/isa/aarch64/asm.rs
Outdated
size: OperandSize, | ||
) { | ||
// Check for division by 0 | ||
self.emit(Inst::TrapIf { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use self.trapz
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the same invariants as signed division apply here, right? i.e., we must still check for overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I was expecting it to be the case, but as it turn out, it doesn't seem necessary; the following snippet
(module
(func (param i32 i32) (result i32)
local.get 0
local.get 1
i32.rem_u
)
)
is compiled by cranelift to:
0000000000000000 <wasm[0]::function[0]>:
0: d503235f pacibz
4: a9bf7bfd stp x29, x30, [sp, #-16]!
8: 910003fd mov x29, sp
c: 2a0403e3 mov w3, w4
10: 2a0503e5 mov w5, w5
14: b40000c5 cbz x5, 0x2c <wasm[0]::function[0]+0x2c>
18: 9ac50868 udiv x8, x3, x5
1c: 9b058d02 msub x2, x8, x5, x3
20: a8c17bfd ldp x29, x30, [sp], #16
24: d50323df autibz
28: d65f03c0 ret
2c: 0000c11f udf #49439
(with rem_s
:
0: d503235f pacibz
4: a9bf7bfd stp x29, x30, [sp, #-16]!
8: 910003fd mov x29, sp
c: 52b00004 mov w4, #-2147483648
10: 93407c81 sxtw x1, w4
14: 92800003 mov x3, #-1
18: 9ac30c25 sdiv x5, x1, x3
1c: 9b0384a2 msub x2, x5, x3, x1
20: a8c17bfd ldp x29, x30, [sp], #16
24: d50323df autibz
28: d65f03c0 ret
)
At first I thought this was a bug, but it seems to be consistent across platform:
(module
(func (export "_start") (result i32)
(i32.const 0x80000000)
(i32.const -1)
i32.rem_s
)
)
this yields 0
on both aarch64 and x86.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, going over the Wasm spec, there's no explicit mention of overflow checks in this case.
I was my initial intention to factorize the check, but as it turns out, we don't need to check for overflow for |
9b4f318
to
cba0b31
Compare
Implement remainder operation for
aarch64
in winch.one more step in the #8321 series