Add initial `math::u128` functions for core/math runtime. #2438

otrho · 2025-12-08T04:31:34Z

Re: #2383. This is a start at adding u128 maths support with add, sub and mul in both overflowing and wrapping versions.

The cycle counts are estimates - is there a way to formally get the cycle cost for a proc, or alternatively are there docs somewhere explaining when a dup or movup will cost 1, 2 or 3 cycles? I assume it's related to the distance from the top somehow.

The overflowing_mul implementation is pretty conservative and could probably be pared down a little.

In particular any time there is an overflow into the 129th bit or higher the value is coverted to a flag/bool and they are all OR'd together at the end (except add3 can create a '2' overflow, so they're summed and then converted again to a bool).

There are quite likely cases were we know definitely there can either be no overflow, or the overflow is already a bool and doesn't need conversion. But for now it's passing the tests and works, we can optimise as we go.

- `u128::overflowing_add` and `u128::wrapping_add`. - `u128::overflowing_sub` and `u128::wrapping_sub`. - `u128::overflowing_mul` and `u128::wrapping_mul`.

huitseeker · 2026-01-05T09:33:39Z

crates/lib/core/asm/math/u128.masm

+#! The input values are assumed to be represented using 32 bit limbs, but this is not checked.
+#! Stack transition is as follows:
+#! [bhh, bmh, bml, bll, ahh, amh, aml, all, ...] -> [chh, cmh, cml, cll, ...], where c = (a - b) % 2^128
+pub proc wrapping_sub(b: u128, a: u128) -> (u32, u128)


The return type here says (u32, u128) but the procedure only produces 4 stack elements (a u128). Looks like a copy-paste from overflowing_sub. Should this be -> u128 to match wrapping_add?

huitseeker · 2026-01-05T09:33:46Z

crates/lib/core/asm/math/u128.masm

+# -------------------------------------------
+#           f      rhh    rmh     rml     rll
+
+pub proc overflowing_mul(b: u128, a: u128) -> (u64, u128)


The return type says u64 for the overflow flag, but the u64 module uses i1 for overflow flags (see u64::overflowing_mul). The actual value here is just 0 or 1 from the final push.0 neq. Should this be (u32, u128) or (i1, u128) for consistency?

I'm actually not sure the signature in the u64::overflowing_mul is correct - or at least it conflicts with the doc comments for the same procedure. Specifically, I think it returns (u64, u64) rather than (i1, u64).

If that's correct, could we also implement u128::overflowing_mul to look like so?:

pub proc overflowing_mul(b: u128, a: u128) -> (u128, u128)

pub proc overflowing_mul(b: u128, a: u128) -> (u128, u128) would imply 8 felts being left on the stack IMO, which is obviously not quite right.

I think maybe the naming here may be confusing. What I meant is that semantically, our u64::overflowing_mul functions like a widening multiplication (e..g, multiplying two u64 values gives us a 128-bit result). Similarly, would be great if this also behaved like a widening multiplication - i.e., multiplying two u128 values gives us a u256 result (which requires 8 felts to represent).

@bobbinth Those are unusual semantics IMO - overflowing_mul in most contexts I've seen (including Rust), treat it as an operation that returns the result as the same type as the operands, but with an additional flag that indicates whether overflow occurred, in which case the result value itself has been wrapped (giving you the overflow from T::MAX).

To get the resulting "widened" value, you'd simply cast the T::MAX value and the overflow to the widened type, and add them together. That addition is guaranteed to fit in the widened type, so doesn't need to be checked.

If you want a single function that does the widening multiplication, I think it should be added in addition to overflowing_mul, e.g. carrying_mul from Rust. And build it on top of the overflowing_mul implementation (or perhaps, build overflowing_mul in terms of carrying_mul, whichever is most efficient).

IMO, people rarely want the widened value when using overflowing_mul, instead they want to catch overflow and treat it as an non-panicking error. It is certainly useful on its own.

The compiler currently uses overflowing_mul as I described above, and I'd assume most people would assume it works the way I outlined, so if the actual implementation is different, we definitely should fix that.

Yes, agreed that overflowing_mul is the right name for the semantics of this procedure. But the issue is that we use overflowing_mul to mean widening_mul (not carrying_mul) in other places (Rust also has widening_mul though, currently as nightly-only).

So, in a way, this PR introduces an inconsistency because we'll get:

For u32 and u64 overflowing_mul will mean widening multiplication.

For u128 it'll mean overflowing multiplication.

I think maybe a simple solution here is to rename u32 and u64 procedure/operations to widening_mul - but that would need to be done in a different PR.

Created #2558 for this.

Moved to 2558

How are we actually using the current semantics throughout existing MASM code? I know that there must be a lot of bugs as a result of that, because the compiler for sure treats that overflow value as a boolean, not a u32 value (and I'm just about 100% sure my interpretation was based on the actual Miden docs at some point, though perhaps the docs were fixed and I didn't catch that). Our compiler intrinsics that use u32overlowing_mul also appear to all be written with this assumption. We probably need to audit how many uses of u32overflowing_mul are making the same incorrect assumption I did, because I assume I'm not alone there.

All uses I've seen assume the "widening" semantics, but of course, I haven't seen them all. In many cases, using the upper bits as a boolean value could be fine - i.e., 0 still means there is no overflow. But yes, if we are checking it to be 1 to indicate that there was overflow, that'll be a bug.

The behavior of u32overflowing_mul is not consistent with u32overflowing_add (which has the correct semantics).

For addition, there is no difference between "overflow" and "widening" because even in case of widening you still get either 0 or 1 in the upper bits of the widened value. So, I think I was just trying to keep the naming consistent across addition and multiplication.

Since the u64 library is not being currently used outside the compiler (unless that has changed very recently), I think we can just fix the semantics of u64::overflowing_mul to be correct, and add a new u64::widening_mul function.

Agreed - and that's what I suggested in #2558.

For addition, there is no difference between "overflow" and "widening" because even in case of widening you still get either 0 or 1 in the upper bits of the widened value. So, I think I was just trying to keep the naming consistent across addition and multiplication.

The way the documentation treats it, the output on top of the stack is a boolean (which is effectively true, but misleading), while u32overflowing_mul shows that the output on top of the stack, i.e. [d, c] is $c←(a⋅b) \mod 2^{32}$ and $d←(a⋅b)/2^{32}$, which is unambiguous.

A quick immediate fix is to update the docs of u32overflowing_add to match u32overflowing_mul, so that one doesn't read the docs for u32overflowing_add and then assume the semantics of u32overflowing_mul are the same, just for multiplication. That's likely the mistake I made.

bitwalker

Aside from @huitseeker's comments, I think this looks good. We'll want to implement division in a subsequent PR, and we may need to make layout changes to both u64 and u128 in a future change, but neither of those need to happen here, so 👍 from me.

Overflowing ops now return `(i1, u128)` and fix a typo for `wrapping_sub`.

Dominik1999 · 2026-01-12T09:06:58Z

Can we still merge this?

bitwalker · 2026-01-12T19:06:04Z

@Dominik1999 Yes, though I should note that it will need to be reworked as part of the effort to make all the integer types that have multi-word layouts (e.g. u64 and u128) use a little-endian limb order (both currently use a big-endian layout). See #2564 for conversation around that.

For now though, I think we can go ahead and merge this, what do you think @bobbinth?

huitseeker · 2026-01-12T20:38:11Z

crates/lib/core/asm/math/u128.masm

+    dup.7 dup.4
+    u32overflowing_mul   # [o0 rll bhh bmh bml bll ahh amh aml all ...]
+
+    # aml x bml + o0 -> rml0


Tiny typo: this comment says aml x bml but should be aml x bll to match the equivalent comment in overflowing_mul (line 125). The code itself looks correct.

otrho requested a review from bobbinth December 8, 2025 04:31

otrho self-assigned this Dec 8, 2025

otrho added the stdlib Related to Miden standard library label Dec 8, 2025

otrho force-pushed the otrho/stdlib-u128 branch 2 times, most recently from 48a943c to 3999a75 Compare December 10, 2025 22:33

otrho requested a review from bitwalker December 10, 2025 22:33

otrho marked this pull request as ready for review December 10, 2025 22:59

otrho force-pushed the otrho/stdlib-u128 branch 3 times, most recently from ded8dfb to a5ca6c3 Compare December 17, 2025 02:59

otrho force-pushed the otrho/stdlib-u128 branch from a5ca6c3 to 8f91f76 Compare December 23, 2025 22:52

Add initial math::u128 functions for lib/core/math runtime.

fa7d600

- `u128::overflowing_add` and `u128::wrapping_add`. - `u128::overflowing_sub` and `u128::wrapping_sub`. - `u128::overflowing_mul` and `u128::wrapping_mul`.

otrho force-pushed the otrho/stdlib-u128 branch from 8f91f76 to fa7d600 Compare December 23, 2025 23:01

Merge branch 'next' into otrho/stdlib-u128

29ba2b5

huitseeker reviewed Jan 5, 2026

View reviewed changes

bitwalker approved these changes Jan 5, 2026

View reviewed changes

Fix function signatures for u128 procs.

d284eef

Overflowing ops now return `(i1, u128)` and fix a typo for `wrapping_sub`.

otrho force-pushed the otrho/stdlib-u128 branch from 8a3d93a to d284eef Compare January 6, 2026 01:07

bobbinth mentioned this pull request Jan 8, 2026

Align semantics of overflowing/widening multiplication with naming #2558

Open

huitseeker reviewed Jan 12, 2026

View reviewed changes

Add initial math::u128 functions for core/math runtime. #2438

Are you sure you want to change the base?

Add initial math::u128 functions for core/math runtime. #2438

Uh oh!

Conversation

otrho commented Dec 8, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bitwalker Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bitwalker left a comment

Choose a reason for hiding this comment

Uh oh!

Dominik1999 commented Jan 12, 2026

Uh oh!

bitwalker commented Jan 12, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Add initial `math::u128` functions for core/math runtime. #2438

Add initial `math::u128` functions for core/math runtime. #2438

bitwalker Jan 8, 2026 •

edited

Loading