-
Notifications
You must be signed in to change notification settings - Fork 1.6k
RFC: const ergonomics for NonZero<T> #3786
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
6ba5f7e
75f32cf
b001b8e
2f3ff05
19dacd1
6e48c32
ab5a452
c335cb6
bb973f4
3e72787
bec3ca4
7da7c13
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,390 @@ | ||
- Feature Name: `const-nonzero-ergonomics` | ||
- Start Date: 2025-03-07 | ||
- RFC PR: [rust-lang/rfcs#3786](https://github.com/rust-lang/rfcs/pull/3786) | ||
- Rust Issue: none | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
The `std::num::NonZero<T>` type allows non-zero integer semantics to be clearly expressed. Yet this | ||
type is only seamlessly usable if all APIs with non-zero semantics use this type, due to required | ||
to/from conversion at any API boundary that differs in its use of `NonZero`. | ||
|
||
The burden of these conversions is especially heavy in tests and examples. This RFC proposes new | ||
coercions to facilitate implicit conversion to `NonZero` from integer literals, simplifying usage | ||
in tests and examples, where succinctness and readability are paramount. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
Using `NonZero` to express non-zero semantics is valuable because it leads to clarity of APIs, | ||
removes the need for some runtime zero-checks and creates potential for niche optimizations. | ||
|
||
In typical logic code, `NonZero<T>` values are validated early in the process, such as when parsing | ||
user input. We start with a value of integer type `T` which may or may not be zero, and we parse | ||
this into a `NonZero<T>` - if this succeeds, we know it is not zero and can skip zero-checks in any | ||
further calls, as long as the API surface uses `NonZero<T>`. | ||
|
||
**However, there is one area where `NonZero` has particularly poor ergonomics: tests and examples!** | ||
In test code, numeric constants are common. By switching a function parameter from `u32` to | ||
`NonZero<u32>` we add needless complexity to the test code. | ||
|
||
Without `NonZero`: | ||
|
||
```rust | ||
fn item_fits_exactly_in_packaging(height: u32) -> bool { | ||
assert_ne!(0, height, "cannot package a product with a height of zero"); | ||
1000 % height == 0 | ||
} | ||
|
||
#[test] | ||
fn item_fits_exactly_in_packaging_if_divides_1000() { | ||
// The packaging has a height of 1000, so any integer that divides it evenly will fit. | ||
assert!(item_fits_exactly_in_packaging(1)); | ||
assert!(!item_fits_exactly_in_packaging(3)); | ||
assert!(item_fits_exactly_in_packaging(25)); | ||
assert!(!item_fits_exactly_in_packaging(999)); | ||
assert!(item_fits_exactly_in_packaging(1000)); | ||
} | ||
``` | ||
|
||
With `NonZero`: | ||
|
||
```rust | ||
use std::num::NonZero; | ||
|
||
fn item_fits_exactly_in_packaging(height: NonZero<u32>) -> bool { | ||
// No need to worry about division by zero because we accept NonZero input. | ||
// This means we can avoid checking the denominator in every call to this function. | ||
1000 % height.get() == 0 | ||
} | ||
|
||
#[test] | ||
fn item_fits_exactly_in_packaging_if_divides_1000() { | ||
// The packaging has a height of 1000, so any integer that divides it evenly will fit. | ||
assert!(item_fits_exactly_in_packaging(NonZero::new(1).unwrap())); | ||
assert!(!item_fits_exactly_in_packaging(NonZero::new(3).unwrap())); | ||
assert!(item_fits_exactly_in_packaging(NonZero::new(25).unwrap())); | ||
assert!(!item_fits_exactly_in_packaging(NonZero::new(999).unwrap())); | ||
assert!(item_fits_exactly_in_packaging(NonZero::new(1000).unwrap())); | ||
} | ||
``` | ||
|
||
Having to manually construct the `NonZero` wrapper in test code can become very noisy and is | ||
especially problematic in example code and doctests, where crate authors want to put their best | ||
foot forward, to show off how easy it is to use the crate. This is because the nature of `NonZero` | ||
creates a difference between the two categories of usage: | ||
|
||
* In real use, the `NonZero` is typically constructed when parsing the input, not when calling some | ||
API that expresses its parameters via `NonZero`. | ||
* In test and example use, there often is no parsing stage, so hardcoded test input must be manually wrapped at | ||
call sites. | ||
|
||
**This means that tests and examples are much more noisy than real-world usage for an API that uses | ||
`NonZero`, giving a false impression of API complexity and discouraging API authors from using | ||
`NonZero` despite its advantages.** | ||
|
||
In test and example code this commonly occurs with integer literals. The values are | ||
known and can be validated at compile time to be non-zero and within expected bounds. | ||
|
||
This RFC proposes that we omit the ceremony with literals, allowing implicit coercion of non-zero | ||
integer literals to `NonZero<T>`, thereby encouraging Rust crate authors to | ||
make more extensive use of `NonZero`, which they may today choose to avoid due to the extra cruft | ||
it adds to tests and examples. | ||
|
||
With this RFC implemented, this would be valid code: | ||
|
||
```rust | ||
use std::num::NonZero; | ||
|
||
fn item_fits_exactly_in_packaging(height: NonZero<u32>) -> bool { | ||
// No need to worry about division by zero because we accept NonZero input. | ||
// This means we can avoid checking the denominator in every call to this function. | ||
1000 % height.get() == 0 | ||
} | ||
|
||
#[test] | ||
fn item_fits_exactly_in_packaging_if_divides_1000() { | ||
// The packaging has a height of 1000, so any integer that divides it evenly will fit. | ||
assert!(item_fits_exactly_in_packaging(1)); | ||
assert!(!item_fits_exactly_in_packaging(3)); | ||
assert!(item_fits_exactly_in_packaging(25)); | ||
assert!(!item_fits_exactly_in_packaging(999)); | ||
assert!(item_fits_exactly_in_packaging(1000)); | ||
} | ||
``` | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
The `std::num::NonZero<T>` type can be used to express that an integer of type `T` has a non-zero | ||
value. This helps clearly express the intent of the API and encourages "thinking with types", | ||
whereby function calls are maximally validated at compile time, reaching a state where if the code | ||
compiles, it has a high probability of being correct. | ||
|
||
In typical usage, you convert from a `T` such as `u32` into a `NonZero<T>` when parsing the input | ||
and thereafter keep the value in the `NonZero` wrapper: | ||
|
||
```rust ignore | ||
use std::{env, num::NonZero}; | ||
|
||
fn main() { | ||
let Some(product_height) = env::args().nth(2) else { | ||
eprintln!("provide an integer as the first argument ('product_height') to this sample app"); | ||
return; | ||
}; | ||
|
||
let product_height = product_height | ||
.parse::<u32>() | ||
.expect("first argument ('product_height') must be an integer"); | ||
|
||
// We validate that integers are non-zero as soon as possible and keep them in the | ||
// NonZero<T> wrapper type after that, to avoid further unnecessary zero-checks. | ||
let Some(product_height) = NonZero::new(product_height) else { | ||
eprintln!("first argument ('product_height') must be non-zero"); | ||
return; | ||
}; | ||
|
||
// .. | ||
} | ||
``` | ||
|
||
You can pass this value as-is to functions that take `NonZero<u32>`: | ||
|
||
```rust ignore | ||
fn main() { | ||
// .. | ||
|
||
if !item_fits_exactly_in_packaging(product_height) { | ||
eprintln!("product does not fit in packaging"); | ||
return; | ||
} | ||
|
||
println!("product fits in packaging"); | ||
} | ||
|
||
fn item_fits_exactly_in_packaging(height: NonZero<u32>) -> bool { | ||
// No need to worry about division by zero because we accept NonZero input. | ||
// This means we can avoid checking the denominator in every call to this function. | ||
1000 % height.get() == 0 | ||
} | ||
``` | ||
|
||
When writing test or example code and using hardcoded constants, you can omit the conversion into | ||
`NonZero<T>` - it is done implicitly at compile time. This only works with integer literals | ||
like `1234`. | ||
|
||
```rust ignore | ||
#[test] | ||
fn item_fits_exactly_in_packaging_if_divides_1000() { | ||
// The packaging has a height of 1000, so any integer that divides it evenly will fit. | ||
assert!(item_fits_exactly_in_packaging(1)); | ||
assert!(!item_fits_exactly_in_packaging(3)); | ||
assert!(item_fits_exactly_in_packaging(25)); | ||
assert!(!item_fits_exactly_in_packaging(999)); | ||
assert!(item_fits_exactly_in_packaging(1000)); | ||
} | ||
``` | ||
|
||
This is similar to how in the statement `let foo: u8 = 123;` | ||
the literal `123` is inferred to be `u8`. | ||
|
||
Being able to skip the `NonZero::new` helps avoid unnecessary complexity in tests, encouraging high | ||
test coverage and making test code easier to read. It also helps focus examples and doctests on the | ||
usage of the `NonZero` capable API that you are publishing, without unnecessary noise from parsing | ||
of constant values that would typically happen elsewhere in real code. | ||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
Untyped integer literals (e.g., `123`, `0xFF` but not `123_u32`) are implicitly coerced | ||
to `std::num::NonZero<T>` (`T` being `u8`, `u32`, etc.) if all of the following are true: | ||
|
||
* The value of the literal is not 0. | ||
* The value of the literal fits within `T`’s range (e.g., `300` fails for `NonZero<u8>`). | ||
* The target type is explicitly `NonZero<T>` or inferred as such. | ||
* The type `T` is unambiguously resolved from the target `NonZero<T>` type. | ||
|
||
The coercion happens at compile time, with the emitted code being the equivalent of | ||
`const { NonZero::new(literal).unwrap() }` for valid cases, with no runtime checks. | ||
|
||
The coercion is allowed when the source is an integer literal (`123`) or an integer literal | ||
negation expression (`-123`). The coercion does not apply to other expressions besides unary | ||
negation, even if the expressions combine literals (e.g. `123 + 456` does not qualify). | ||
|
||
```rust ignore | ||
fn foo(count: NonZero<i8>) { } | ||
|
||
foo(123); // OK | ||
foo(-123); // OK | ||
foo(0x11); // OK | ||
|
||
foo(0); // Error - value cannot be zero. | ||
foo(300); // Error - out of bounds of i8. | ||
foo(123_i8); // Error - only untyped literals accepted. | ||
foo(123_usize); // Error - literal has non-matching type. | ||
foo(123 - 1); // Error - coercion not applied for expressions. | ||
|
||
const MAGIC_VALUE: NonZero<i8> = 123; // OK - coercion logic is same as when calling a fn. | ||
|
||
let i = 123; | ||
foo(i); // Error - the coercion only applies to literals and `i` is not a literal. | ||
``` | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
Any implicit behavior in the language risks becoming "magic" that is hard to understand and reason | ||
about. For this reason such behaviors are very uncommon in Rust. The greatest drawback is that we | ||
may open a box that also contains some demons in addition to this simple enhancement - what other | ||
implicit behaviors might get proposed using this as an example? The fact that this RFC is scoped | ||
to constants and not variables is the mitigating factor here. After all in `let foo: u8 = 123` the | ||
literals becomes a `u8` and is not too different in nature from `let foo: NonZero<u8> = 123`, though | ||
the mechanics are of course different. | ||
|
||
# Rationale and alternatives | ||
[rationale-and-alternatives]: #rationale-and-alternatives` | ||
|
||
An alternative with less magic would be to require an explicit "nonzero" suffix for constant | ||
literals to activate this coercion: | ||
|
||
```rust | ||
fn foo(value: NonZero<u32>) {} | ||
|
||
foo(1234); // Error | ||
foo(1234_nz); // OK | ||
foo(1234_u32_nz); // OK | ||
``` | ||
|
||
This would still eliminate most of the cruft from tests and examples while still requiring an | ||
explicit action by the user to invoke the coercion. | ||
|
||
Given that the potential for "confusion by too much magic" seems minimal as the concept of | ||
"non-zero" is trivial to understand and the proposal does not involve any size-coercion, | ||
the value of avoiding fully implicit coercion here seems low - there is not an obvious benefit | ||
from only taking half a step toward an improvement. | ||
|
||
If we consider a library approach rather than a language approach, one viable alternative is to | ||
create a macro to easily construct non-zero values and validate that they really are non-zero. | ||
For example: | ||
|
||
```rust | ||
use std::num::nonzero; | ||
|
||
fn foo(value: NonZero<u32>) {} | ||
|
||
foo(1234); // Error | ||
foo(nonzero!(1234)); // OK | ||
|
||
macro_rules! nonzero { | ||
($x:literal) => { | ||
const { ::std::num::NonZero::new($x).expect("literal must have non-zero value") } | ||
}; | ||
} | ||
``` | ||
|
||
This macro will attempt to construct the value in a const context and emit a meaningful error if | ||
that fails. This may offer a satisfactory user experience if such a macro were included in `std`, | ||
though still comes at a cost of an unusual level of cruft for merely passing integer constants to | ||
code, and remains something the user of the code has to think and know about. | ||
|
||
Given that passing integer constants to code is an everyday task for a programming language, that | ||
the `NonZero` wrapper type exists already in `std` and that there are so far no known corner cases | ||
that could not be easily verified correct at compile time, having it "just work" seems to offer the | ||
best tradeoff of factors to encourage wider usage of non-zero semantics, yielding more correct and | ||
potentially more efficient code in APIs where `NonZero` is applicable. | ||
|
||
That said, all these alternatives are better than what we have today - manually having to construct | ||
an instance of `NonZero<T>`. | ||
|
||
[Previous discussions](https://internals.rust-lang.org/t/elevate-nonzero-types-to-true-primitive-integer-types/18040/20) | ||
have suggested that other ergonomic challenges exist with `NonZero` (e.g. because it lacks many of | ||
the methods that you would expect to find on primitive integer types it wraps) and that improving | ||
its ergonomics across the board would likely be challenging, if not due to the mechanics then at | ||
least due to the scope of functionality associated with numeric types. Nevertheless, this does not | ||
appear to be a meaningful argument against making `NonZero` ergonomics better for the scenarios | ||
where it is useful to Rust users today, even if those cases are more limited than raw numerics. | ||
In practice, `NonZero` types do appear to be gaining new members, so | ||
there does not appear to be a consensus against improving them (e.g. 1.84 stabilized | ||
[`NonZero::isqrt`](https://doc.rust-lang.org/std/num/struct.NonZero.html#method.isqrt)) | ||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
Exploration of other languages suggests that while refinement types like `NonZero` are common, they | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. in terms of user-defined literals there is quite a big prior art not mentioned here which is C++11 (of course C++ also has implicit constructors and exceptions so they could have back to Rust, we could for instance introduce a const trait pub trait /*const*/ FromIntLiteral: Sized {
const fn from_int_literal(input: u128) -> Self;
}
impl<T: FromIntLiteral + ZeroablePrimitive> FromIntLiteral for NonZero<T> {
const fn from_int_literal(input: u128) -> Self {
Self::new(T::from_int_literal(input)).expect("input must be non-zero")
}
} and then using non-suffixed integer literal would be equivalent to using assert!(item_fits_exactly_in_packaging(1000));
// equivalent to
assert!(item_fits_exactly_in_packaging(const { FromIntLiteral::from_int_literal(1000_u128) }));
const NEGATIVE_ONE: NonZero<i32> = -1;
// equivalent to (assuming `impl const Neg for NonZero<i32>`)
const NEGATIVE_ONE: NonZero<i32> = -const { FromIntLiteral::from_int_literal(1_u128) }; There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That trait could potentially be made more general by passing the constant as a |
||
generally require explicit conversion as they are not specific to integers as a general language | ||
feature. In contrast, this RFC deals with integer refinement in particular, as the `NonZero` types | ||
are focused specifically on this mode of refinement. | ||
|
||
Similar functionality appears to be present in Ada if we use its subtyping feature to | ||
define a `NonZero_Int` type: | ||
|
||
```ada | ||
with Ada.Text_IO; use Ada.Text_IO; | ||
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO; | ||
|
||
procedure Item_Fits_Test is | ||
subtype NonZero_Int is Integer range 1 .. Integer'Last; | ||
|
||
function Item_Fits_Exactly_In_Packaging(Height : NonZero_Int) return Boolean is | ||
begin | ||
-- No need to check for zero; the subtype guarantees it's non-zero | ||
return 1000 mod Height = 0; | ||
end Item_Fits_Exactly_In_Packaging; | ||
|
||
procedure Test_Item_Fits_Exactly_In_Packaging_If_Divides_1000 is | ||
Result : Boolean; | ||
begin | ||
Result := Item_Fits_Exactly_In_Packaging(1); | ||
pragma Assert(Result, "1 should divide 1000 evenly"); | ||
Put_Line("Test 1 passed"); | ||
|
||
Result := Item_Fits_Exactly_In_Packaging(3); | ||
pragma Assert(not Result, "3 should not divide 1000 evenly"); | ||
Put_Line("Test 3 passed"); | ||
|
||
Result := Item_Fits_Exactly_In_Packaging(25); | ||
pragma Assert(Result, "25 should divide 1000 evenly"); | ||
Put_Line("Test 25 passed"); | ||
|
||
Result := Item_Fits_Exactly_In_Packaging(999); | ||
pragma Assert(not Result, "999 should not divide 1000 evenly"); | ||
Put_Line("Test 999 passed"); | ||
|
||
Result := Item_Fits_Exactly_In_Packaging(1000); | ||
pragma Assert(Result, "1000 should divide 1000 evenly"); | ||
Put_Line("Test 1000 passed"); | ||
|
||
-- This would cause a compile-time error if uncommented: | ||
-- Result := Item_Fits_Exactly_In_Packaging(0); | ||
-- Error: constraint error, 0 not in range 1 .. Integer'Last | ||
end Test_Item_Fits_Exactly_In_Packaging_If_Divides_1000; | ||
|
||
begin | ||
Test_Item_Fits_Exactly_In_Packaging_If_Divides_1000; | ||
Put_Line("All tests completed successfully!"); | ||
end Item_Fits_Test; | ||
``` | ||
|
||
Pattern types [rust#123646](https://github.com/rust-lang/rust/issues/123646) are a generalized form | ||
of the concept underpinning `NonZero` and may offer a generalized solution to the problems described | ||
in this RFC, if they are adopted into Rust. | ||
|
||
# Unresolved questions | ||
[unresolved-questions]: #unresolved-questions | ||
|
||
The alternatives presented above appear inferior but only slightly so - we should carefully consider | ||
which strategy to apply here, especially if there appear corner cases not yet explored, where the | ||
different approaches may show off their respective strengths. | ||
|
||
# Future possibilities | ||
[future-possibilities]: #future-possibilities | ||
|
||
If we accept that `NonZero<T>` deserves to be implicitly coerced from non-zero values of `T`, the | ||
experience from implementing and stabilizing this may offer valuable insights for how deeply and | ||
how explicitly/implicitly to integrate other bounded/restricted types such as | ||
`Bounded<T, Min, Max>` with custom minimum and maximum values or other types of refinement. | ||
This is out of scope of this RFC. Related discussions: | ||
* [Is there a type like `std::num::NonZero*` but for other values than zero?](https://www.reddit.com/r/rust/comments/x0lwxt/is_there_a_type_like_stdnumnonzero_but_for_other/) | ||
* [Range types for integers (or refinement types?)](https://github.com/rust-lang/rfcs/issues/671) | ||
* [Pattern types MVP](https://github.com/rust-lang/rust/pull/107606) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this feature work if the user has written this?
i guess no because the
u32
suffix forces the literal to have typeu32
rather than any possible integers.and what about literals with const-folding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent questions! Applying such a coercion in these cases does seem to raise more problems than the benefit it may bring (if any). I will amend the RFC to clarify that the proposed behavior here is quite conservative (unless further discussion in here brings up reasons to make some bolder moves).
For the
1000u32
case, the proposed behavior would be that the coercion is applied only if a single integer type can be identified for the coercion:123
), theT
must be unambiguously resolved from the target type.123u32
), theT
in the target type must either match or be inferred to the literal's type.For the const-folding case, I believe we can have the behavior be conservative by having the coercion apply only to literals - if the source type is from an expression like
123 | 456
then we simply do not considerNonZero
coercion as a valid candidate.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a current-editions behaviour I totally agree, but one Future Possibility that might be interesting to consider would be allowing rust to combine
123 | 456
to a579
literal, and then apply the coercion to that.That way you could write out whatever with literals and the compiler would just evaluate it like python does (using infinite precision if needed) then only converting back to a "real" rust type once it's given a specific type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we ever have literals convert to user-defined types, I would argue that Rust should not do operations to combine literals before converting to the user type, since that user type may not use standard arithmetic, e.g. if it is for carry-less arithmetic (or equivalently operations on$GF(2)[X]$ ), where
0b101 + 0b111 == 0b010
and0b101 * 0b111 == 0b11011
. So, if Rust tried to do arithmetic on literals before conversion, it would produce the wrong result.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do support
I'd say it would be surprising to some portion of users whether
x == Gf2x(0b10)
orx == Gf2x(0b1100)
.The former is essentially how Rust currently infers the integer literal type, as evidenced in:
Since we can't guarantee an operator being homomorphic for a custom type, indeed the conservative stance is the best, in which we only allow conversion on
$:literal
(with arbitrary number of parentheses).The question is whether
-$:literal
should be allowed, because using the same argument theimpl Neg for G
may be non-standard such thatG(-x) != -G(x)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of why the API I proposed has negation separated out is that Rust does treat negation specially:
so, if
-MyType(x)
isn't the same asMyType(-x)
the conversion from literal toMyType
can see that the literal is negated (because that's passed into the literal conversion rather than callingNeg
) and do the appropriate value transformation.