Do the bytes of a pointer have to stay in the same order? #558

Open
RalfJung opened this issue Feb 20, 2025 · 2 comments
Labels
A-provenance Topic: Related to when which values have which provenance (but not which alias restrictions follow)

Comments

@RalfJung
Member

There's another aspect of provenance that we haven't officially decided yet and that is implicitly answered by the current wording in rust-lang/reference#1664: do the individual bytes in a pointer "remember" where in the pointer they are, and do they have to be put back in the same order? Some formal models require this, and if we ever allow "taking apart" the bytes of a pointer in const-eval (rust-lang/const-eval#72), we'll also have to require this, but for runtime semantics we could decide either way.
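To make "taking apart" concrete, here is a minimal sketch of the uncontroversial case: a pointer is copied one byte at a time, and every byte lands at the same offset it came from. The helper name `copy_ptr_bytewise` is made up for illustration. The assumption here is that untyped per-byte copies carry provenance along with the byte; the open question in this issue is only about what happens when the bytes are *not* put back in their original positions.

```rust
use std::mem::{size_of, MaybeUninit};
use std::ptr;

/// Hypothetical helper: copies a pointer one byte at a time,
/// keeping every byte at its original offset.
fn copy_ptr_bytewise(src: &*const i32) -> *const i32 {
    let mut dst = MaybeUninit::<*const i32>::uninit();
    // View both pointer-sized slots as sequences of opaque bytes.
    let src_bytes = (src as *const *const i32).cast::<MaybeUninit<u8>>();
    let dst_bytes = dst.as_mut_ptr().cast::<MaybeUninit<u8>>();
    for i in 0..size_of::<*const i32>() {
        // SAFETY: both regions are valid for `size_of::<*const i32>()` bytes
        // and do not overlap. The copy is untyped, so byte `i` is moved as-is.
        unsafe { ptr::copy_nonoverlapping(src_bytes.add(i), dst_bytes.add(i), 1) };
    }
    // SAFETY: every byte of `dst` was written in the loop above.
    unsafe { dst.assume_init() }
}

fn main() {
    let x = 42;
    let p: *const i32 = &x;
    let q = copy_ptr_bytewise(&p);
    assert_eq!(q, p);
    // The reassembled pointer is used like the original one.
    assert_eq!(unsafe { *q }, 42);
}
```

A variant that wrote byte `i` to a different offset and later moved it back, or that mixed bytes from several pointers, is where the two candidate semantics could start to differ.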

The one example of code that I am aware of that breaks this requirement is XOR linked lists, which can be implemented in the semantics sketched in MiniRust right now but can't be implemented if bytes with provenance remember their position in the pointer. That's not exactly realistic code, but it is somewhat satisfying that (on architectures where pointers have at least 2 bytes), XOR linked lists can be implemented.

The main upside of requiring the same bytes in the same order is that it rules out pointer crimes like XOR linked lists, so if there are some unexpected interactions there, we'd not be affected. It would also make the runtime semantics more consistent with the const-eval semantics. I am not aware of an optimization that would benefit from this UB; it's mostly a case of "ruling out some rather cursed programs to avoid locking ourselves into an unexpected corner". In some sense the model becomes a bit simpler, since we can just say that pointer bytes must be put back together in the same order they started out in before they can be treated as a pointer again, but the actual op.sem would become more complicated because of the extra bookkeeping required to enforce this.

RalfJung added the A-provenance label on Feb 20, 2025
@chorman0773
Contributor

I'd certainly agree that ruling out pointer crimes is an upside 😛.

As a more serious argument, I think it shouldn't be difficult to track this along with the Pointer. The only thing would be that Pointer would need to be something more like PointerFragment and be different between bytes (but this is really only a u8 in the range 0..size_of::<usize>() of extra state). We could then say that if a pointer is computed from provenance where the provenance isn't coherent (different provenances, wrong place, or some bytes having no provenance), you get a pointer without provenance.
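The bookkeeping described above could be modeled roughly as follows. This is only a sketch, not MiniRust's actual definitions: `PointerFragment` is the name used in the comment, while `Provenance`, `AbstractByte`, the field names, `encode_ptr` and `decode_ptr` are assumptions introduced for illustration, and the little-endian encoding is an arbitrary choice.

```rust
use std::mem::size_of;

const PTR_SIZE: usize = size_of::<usize>();

/// Hypothetical stand-in for provenance: an allocation ID.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Provenance(u32);

/// Per-byte provenance that also remembers the byte's position in the pointer.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PointerFragment {
    provenance: Provenance,
    index: u8, // always in 0..PTR_SIZE
}

/// One byte of abstract memory: a value plus an optional pointer fragment.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct AbstractByte {
    value: u8,
    fragment: Option<PointerFragment>,
}

/// Encodes a pointer: byte `i` records that it is fragment `i`.
fn encode_ptr(addr: usize, provenance: Provenance) -> [AbstractByte; PTR_SIZE] {
    let raw = addr.to_le_bytes();
    std::array::from_fn(|i| AbstractByte {
        value: raw[i],
        fragment: Some(PointerFragment { provenance, index: i as u8 }),
    })
}

/// Decodes a pointer. The address always comes from the byte values; the
/// provenance survives only if it is coherent: every byte carries the same
/// provenance and sits at the position it was originally encoded at.
fn decode_ptr(bytes: &[AbstractByte; PTR_SIZE]) -> (usize, Option<Provenance>) {
    let addr = usize::from_le_bytes((*bytes).map(|b| b.value));
    let first = match bytes[0].fragment {
        Some(f) => f.provenance,
        None => return (addr, None),
    };
    let coherent = bytes.iter().enumerate().all(|(i, b)| {
        b.fragment == Some(PointerFragment { provenance: first, index: i as u8 })
    });
    (addr, coherent.then_some(first))
}

fn main() {
    let bytes = encode_ptr(0x1000, Provenance(1));
    // Same bytes, same order: provenance is kept.
    assert_eq!(decode_ptr(&bytes), (0x1000, Some(Provenance(1))));

    // Wrong place: swapping two bytes loses the provenance.
    let mut swapped = bytes;
    swapped.swap(0, 1);
    assert_eq!(decode_ptr(&swapped).1, None);

    // Different provenances: one byte comes from another pointer.
    let mut mixed = bytes;
    mixed[0].fragment = Some(PointerFragment { provenance: Provenance(2), index: 0 });
    assert_eq!(decode_ptr(&mixed).1, None);

    // Some byte has no provenance at all.
    let mut stripped = bytes;
    stripped[PTR_SIZE - 1].fragment = None;
    assert_eq!(decode_ptr(&stripped).1, None);
}
```

Dropping the `index` comparison from `decode_ptr` would give the position-agnostic variant, where only "same provenance on every byte" is required; the `index` field is the extra state the comment refers to.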

@RalfJung
Member Author

No, it's not difficult, but it is extra complexity.
