Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

do not reference LLVM in our definition of UB #1750

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Mar 5, 2025

We should define what is and is not UB ourselves. In particular for the aliasing model, it is far from clear that we will exactly copy what LLVM does. So let's avoid pointing to LLVM here.

  • For the aliasing model, add a one-sentence summary.
  • For "uninitialized memory", I don't think we define that anywhere yet -- but linking to LLVM's undef is not really helpful, either. For now, I think the term itself is sufficiently clear for a first start, and making it more precise requires spelling our the operational semantics in more detail -- that's not the job of this summary page.

@rustbot rustbot added the S-waiting-on-review Status: The marked PR is awaiting review from a maintainer label Mar 5, 2025
`&T` must point to memory that is not mutated while they are live (except for data inside an [`UnsafeCell<U>`]),
and `&mut T` must point to memory that is not read or written by any pointer not derived from the reference and that no other reference points to while they are live.
`Box<T>` is treated similar to `&'static mut T` for the purpose of these rules.
The exact liveness duration is not specified, but some bounds exist:
* For references, the liveness duration is upper-bounded by the syntactic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a lower-bound too, right? e.g., if you have a use of some reference x at time T1 and then a subsequent use of x (or some reference derived from x) at time T2, then it is live from T1 ..= T2

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Note that I didn't change this part in this PR.)

Yes, of course each time a reference is dereferenced, it must be live. That is a lower bound.

Copy link
Contributor

@traviscross traviscross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK broadly; I know @ehuss is planning to revise editorially.

@ehuss ehuss self-assigned this Mar 13, 2025
LLVM’s scoped [noalias] model, except if the `&T` contains an
[`UnsafeCell<U>`]. References and boxes must not be [dangling] while they are
live. The exact liveness duration is not specified, but some bounds exist:
* Breaking the pointer aliasing rules. The exact aliasing rules are not determined yet, but here is a rough sketch of what the requirements look like:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were somewhat uncomfortable with the "rough sketch" nature of this statement. Are the statements here accurate on their own? "rough sketch" implies to me that this isn't just incomplete, but possibly wrong in some way, so I wanted to clarify this point.

If this is just incomplete, can we delete this whole sentence? Like this:

Suggested change
* Breaking the pointer aliasing rules. The exact aliasing rules are not determined yet, but here is a rough sketch of what the requirements look like:
* Breaking the pointer aliasing rules:

There is a giant warning just above this that this is not complete, so I was wondering if that covers your concern about the status here?

Or if you wanted to really point to this, maybe a footnote like [^alias-incomplete]: The exact aliasing rules are not determined yet.

Or maybe another way to phrase that if the rules written here are accurate, but just incomplete: The exact aliasing rules are incomplete.

Copy link
Member Author

@RalfJung RalfJung Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The statements given here are not on their own sufficient to confidently write sound code. That is already the case before this PR, so nothing changes, we're just being more honest. The aliasing model is the least stable part of our unsafe code semantics so I feel they deserve a dedicated warning.

I don't know what you mean by "accurate and incomplete". I assume you do not mean "complete" in the formal sense of a logic (soundness/completeness), but I also see no way in which an "incomplete" set of rules could be described as "accurate".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what I was getting at is viewing it as a bound. The current text says X. There might be additional points that relax X in the future, but we haven't determined them, yet. I would say X is still correct, just incomplete. But perhaps you are saying there are too many unknowns to even say that? Or that X is too permissive, and we will be adding constraints in the future, in which case X is incorrect in the face of those constraints?

Regardless, having a specific warning here seems worthwhile, though we (on the team call) felt a little uncomfortable with the current wording. How about doing something like pulling the sentence out into a dedicated warning block with something like:

Suggested change
* Breaking the pointer aliasing rules. The exact aliasing rules are not determined yet, but here is a rough sketch of what the requirements look like:
* Breaking the pointer aliasing rules.
> [!WARNING]
> These aliasing rules are incomplete and are still under development.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But perhaps you are saying there are too many unknowns to even say that?

Yeah, I am not very comfortable making any hard promises of the form "if you follow these rules, you're definitely good" -- not for the case where references and raw pointers are interleaved in any way. We could conceivably carve out a fragment that must work, where you only ever use raw pointers or references but never both "at the same time", but it'd be a bunch of work to spell this out in sufficient detail, and doing so was not my intent with this PR.

How about doing something like pulling the sentence out into a dedicated warning block with something like:

That still basically say that what follows are "rules", that might just be altered later. I wouldn't call them rules. They are a sketch or a guiding principle that rules will be developed from, but they are nowhere near precise enough to be described as a set of "rules".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have replaced "rough sketch" by "outline", is that better?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: The marked PR is awaiting review from a maintainer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants