Insert checks for enum discriminants when debug assertions are enabled #141759

Open: 1c3t3a wants to merge 1 commit into master from discriminants-query


1c3t3a (Member) commented May 30, 2025

Similar to the existing null-pointer and alignment checks, this checks for valid enum discriminants on creation of enums through unsafe transmutes. Essentially this sanitizes patterns like the following:

let val: MyEnum = unsafe { std::mem::transmute::<u32, MyEnum>(42) };

An extension of this check will be done in a follow-up that explicitly sanitizes for extern enum values that come into Rust from e.g. C/C++.

This check is similar to Miri's capabilities of checking for valid construction of enum values.
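
To make the pattern concrete, here is a minimal sketch. It assumes a hypothetical `MyEnum` with `#[repr(u32)]` and three variants, since the PR description does not show the enum's definition. It contrasts a transmute of a valid discriminant with a checked conversion that rejects invalid values up front.

```rust
use std::convert::TryFrom;

/// Hypothetical enum for illustration; not taken from the PR.
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MyEnum {
    A = 0,
    B = 1,
    C = 2,
}

impl TryFrom<u32> for MyEnum {
    type Error = u32;

    /// Checked conversion: returns the offending value if no variant matches.
    fn try_from(raw: u32) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(MyEnum::A),
            1 => Ok(MyEnum::B),
            2 => Ok(MyEnum::C),
            other => Err(other),
        }
    }
}

/// Transmuting a value that is a valid discriminant is sound.
fn transmute_valid() -> MyEnum {
    // SAFETY: 1 is the discriminant of `MyEnum::B` and the enum is `repr(u32)`.
    unsafe { std::mem::transmute::<u32, MyEnum>(1) }
}
```

With these definitions, `MyEnum::try_from(42)` returns `Err(42)`, whereas transmuting 42 into `MyEnum` would be undefined behavior. That second case is the one the debug check described above is meant to catch.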

This PR is inspired by saethlin@'s PR #104862. Thank you so much for keeping this code up and the detailed comments!

I also pair-programmed large parts of this together with vabr-g@.

r? @saethlin

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels May 30, 2025
rustbot (Collaborator) commented May 30, 2025

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

This PR changes MIR

cc @oli-obk, @RalfJung, @JakobDegen, @davidtwco, @vakaras

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred to the CTFE machinery

cc @RalfJung, @oli-obk, @lcnr

rust-analyzer is developed in its own repository. If possible, consider making this change to rust-lang/rust-analyzer instead.

cc @rust-lang/rust-analyzer

@1c3t3a 1c3t3a force-pushed the discriminants-query branch from 6d3fe75 to a7dd718 Compare May 30, 2025 09:46
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from a7dd718 to 4f3342e Compare May 30, 2025 09:59
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from 54b6e74 to b03960e Compare May 30, 2025 13:33
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from b03960e to 228b656 Compare May 30, 2025 13:59
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from 228b656 to d1d8f88 Compare June 2, 2025 14:34
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from d1d8f88 to 93b24d7 Compare June 2, 2025 20:23
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from 93b24d7 to c2a8415 Compare June 3, 2025 12:31
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from c2a8415 to d769d6b Compare June 4, 2025 01:51
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from d769d6b to 68665ad Compare June 4, 2025 02:32
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from 68665ad to 1225079 Compare June 6, 2025 15:35
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from 1225079 to c52f534 Compare June 6, 2025 19:54
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from 06f752d to db4a7f9 Compare June 10, 2025 09:32
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from db4a7f9 to 33e8914 Compare June 10, 2025 09:47
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from 33e8914 to 8fd6814 Compare June 10, 2025 11:41
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from 8fd6814 to d37a37e Compare June 10, 2025 14:35
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from d37a37e to 33890a1 Compare June 10, 2025 15:18
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from 33890a1 to c871c4c Compare June 10, 2025 17:06
@1c3t3a 1c3t3a force-pushed the discriminants-query branch from c871c4c to 5587fd7 Compare June 10, 2025 17:57
Kobzol (Contributor) commented Jun 10, 2025

@bors2 try @rust-timer queue

rust-bors bot commented Jun 10, 2025

⌛ Trying commit 5587fd7 with merge 7488b2b

To cancel the try build, run the command @bors2 try cancel.

rust-bors bot added a commit that referenced this pull request Jun 10, 2025
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 10, 2025
1c3t3a (Member Author) commented Jun 10, 2025

This patch is finally ready for review! Let's see what the perf impact is, but I don't expect it to be large, as this only emits checks for transmutes to enums.

rust-bors bot commented Jun 10, 2025

☀️ Try build successful (CI)
Build commit: 7488b2b (7488b2b1de7ccc9272a1b2f6015d82794d20c065)

rust-timer (Collaborator)

Finished benchmarking commit (7488b2b): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (secondary 3.9%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|:--|:--:|:--:|:--:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.9% | [1.2%, 8.6%] | 3 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

Results (secondary -0.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|:--|:--:|:--:|:--:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.5% | [2.5%, 2.5%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -3.3% | [-3.3%, -3.3%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 754.321s -> 757.215s (0.38%)
Artifact size: 372.15 MiB -> 372.16 MiB (0.00%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 10, 2025
/// In some cases the enum discriminant is stored in a tag that is represented by
/// primitive. This method returns the actual discriminant type and size for that
/// tag.
fn tag_type_and_size_for_primitive(&self, primitive: Primitive) -> (Ty<'tcx>, Size) {
Member

nit: this feels like something that we've got to have somewhere already.

In a quick search I only found the codegen-side versions, though, like:

    fn type_from_integer(&self, i: Integer) -> Self::Type {
        use Integer::*;
        match i {
            I8 => self.type_i8(),
            I16 => self.type_i16(),
            I32 => self.type_i32(),
            I64 => self.type_i64(),
            I128 => self.type_i128(),
        }
    }

Member Author

Yeah I felt the same when I was writing this code but also only found this in codegen. I can look again, but would it make any sense to have this as a member of Primitive?

Comment on lines +548 to +579
let lower_boundary_ok: Place<'_> =
    local_decls.push(LocalDecl::with_source_info(tcx.types.bool, source_info)).into();
block_data.statements.push(Statement {
    source_info,
    kind: StatementKind::Assign(Box::new((
        lower_boundary_ok,
        Rvalue::BinaryOp(BinOp::Le, Box::new((start_const, Operand::Copy(discr)))),
    ))),
});
let upper_boundary_ok: Place<'_> =
    local_decls.push(LocalDecl::with_source_info(tcx.types.bool, source_info)).into();
block_data.statements.push(Statement {
    source_info,
    kind: StatementKind::Assign(Box::new((
        upper_boundary_ok,
        Rvalue::BinaryOp(BinOp::Le, Box::new((Operand::Copy(discr), end_const))),
    ))),
});

let is_ok: Place<'_> =
    local_decls.push(LocalDecl::with_source_info(tcx.types.bool, source_info)).into();
block_data.statements.push(Statement {
    source_info,
    kind: StatementKind::Assign(Box::new((
        is_ok,
        Rvalue::BinaryOp(
            // This is a `WrappingRange`, so make sure to get the wrapping right.
            if valid_range.start <= valid_range.end { BinOp::BitAnd } else { BinOp::BitOr },
            Box::new((Operand::Copy(lower_boundary_ok), Operand::Copy(upper_boundary_ok))),
        ),
    ))),
});
Member

suggestion: if you're checking a wrapping range, use (x - START) ULE (END - START) -- then you don't need to worry about combining things because in wrapping subtraction it always works. (See #135674 for somewhere I did something similar.)
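
The equivalence behind this suggestion can be checked in plain Rust. The sketch below is illustrative only: it uses `u8` and hypothetical helper functions (none of these names come from the PR) to compare the two-comparison formulation from the diff against the single comparison after wrapping subtraction, for both ordinary and wrap-around ranges.

```rust
/// Two-comparison check, mirroring the shape of the MIR in the diff:
/// AND the bounds for an ordinary range, OR them for a wrapping one.
fn in_range_two_compares(x: u8, start: u8, end: u8) -> bool {
    let lower_ok = start <= x;
    let upper_ok = x <= end;
    if start <= end { lower_ok & upper_ok } else { lower_ok | upper_ok }
}

/// Single unsigned comparison after wrapping subtraction:
/// `(x - START) ULE (END - START)`.
fn in_range_wrapping_sub(x: u8, start: u8, end: u8) -> bool {
    x.wrapping_sub(start) <= end.wrapping_sub(start)
}

/// Exhaustively compares both formulations over every `u8` triple.
fn formulations_agree() -> bool {
    (0..=u8::MAX).all(|start| {
        (0..=u8::MAX).all(|end| {
            (0..=u8::MAX).all(|x| {
                in_range_two_compares(x, start, end)
                    == in_range_wrapping_sub(x, start, end)
            })
        })
    })
}
```

Because the subtraction form needs no branch on whether the range wraps, it replaces the three temporaries and the `BitAnd`/`BitOr` selection with two subtractions and one comparison.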

source_op: Operand<'tcx>,
discr_ty: Ty<'tcx>,
discr_size: Size,
op_size: Size,
Member

suggestion: maybe you can pass the Ty or Primitive or Integer or something instead of the Size? The Primitive at least is definitely available from the layout info.

(That might help avoid some matches.)

});
}

// Branch based on the computed equality.
Member

suggestion: If you want to branch based on a set of values, how about inserting a https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.TerminatorKind.html#variant.SwitchInt instead? That can branch to the "check failed" block from the otherwise, and continue successfully from all the valid values.
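
As a rough analogy in surface Rust, a `match` on an integer has the same control-flow shape as the suggested `SwitchInt`: one target per listed value plus an `otherwise` target. The sketch below assumes a hypothetical enum whose valid discriminants are 0, 3, and 7, and a hypothetical function name; it does not construct MIR and is not code from the PR.

```rust
/// Hypothetical set of valid discriminants: 0, 3 and 7.
/// Each valid value continues on the success path; everything else
/// falls through to the catch-all arm, which plays the role of the
/// "check failed" block reached via `otherwise`.
fn check_discriminant(discr: u32) -> Result<u32, String> {
    match discr {
        0 | 3 | 7 => Ok(discr),
        other => Err(format!("invalid enum discriminant: {}", other)),
    }
}
```

Compared with a chain of per-value equality checks that are combined and then branched on, this shape keeps the set of valid values in a single terminator.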

))),
});

// Loop over the list of the discriminants and insert checks for equality.
Member

Hmm, this can be a very large amount of additional MIR. I worry about things like the transmute from usize to ptr::Alignment, for example -- adding another, what, at least 128 MIR statements every time that happens?

/// This pass inserts checks at places where enums are constructed and checks
/// the operand passed to the enum creation for validity. This prevents
/// creating enums backed by invalid discriminants.
pub(super) struct CheckEnums;
Member

fix: you need some tests/mir-opt tests for the pass, at least unit test ones showing that it generates the expected MIR.

Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
8 participants