Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miri build preventing landing changes which require both lib and compiler work #138065

Closed
adetaylor opened this issue Mar 5, 2025 · 17 comments
Closed
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues. E-needs-investigation Call for partcipation: This issues needs some investigation to determine current status T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)

Comments

@adetaylor
Copy link
Contributor

Summary

Some changes to Rust require synchronized changes to the library and compiler together. #135881 is an example.

This used to be possible by judicious use of #[cfg(bootstrap)], but I rebased that PR on top of master and there no longer seems to be a way to get the miri build to work.

This seems to be because the miri build is now using a newer compiler, but an older version of the library (possibly from stage0).

Reproduction

Check out 7f74340, the older version of #135881, which was based on master from Feb 6th. (If you need to fetch that from github, it can be found here).

./x.py build miri works.

Now check out 632fd28, which is exactly the same PR based on master from yesterday. The same command does not work, emitting lots of errors about methods not being found.

Diagnosis

This seems to be because the miri build can no longer find the blanket implementation of Receiver for T: Deref. I think this is because it's using stage0 stdlib.

Bisecting this has been hard for various fiddly reasons, but a recent change has been from x.py reporting:

Building tool miri (stage1 -> stage2, aarch64-apple-darwin)

to instead reporting

Building tool miri (stage0 -> stage1, aarch64-apple-darwin)

I'm reasonably confident (but not certain) that this is the change that has altered behavior here. The commits which made these changes were 8e011e5 or 72e67e8. Before that, we see the stage1 -> stage2 behavior, and afterwards, we see stage0 -> stage1.

That said, even after those two commits, it's still possible to get the PR to build, so there's some other factor involved in the 1000+ subsequent commits. As noted, bisecting this has been enormously painful and I haven't nailed it down yet. Nevertheless I feel this is highly likely to be related so I'm hoping @onur-ozkan can help investigate.

@adetaylor adetaylor added C-bug Category: This is a bug. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) labels Mar 5, 2025
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Mar 5, 2025
@onur-ozkan
Copy link
Member

Could you check if it works with '--stage 2'?

@onur-ozkan onur-ozkan removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Mar 5, 2025
@adetaylor
Copy link
Contributor Author

Could you check if it works with '--stage 2'?

Yes, it does. It needs a small change to the feature flags in src/tools/miri/src/lib.rs but then it builds fine.

@onur-ozkan
Copy link
Member

I'm reasonably confident (but not certain) that this is the change that has altered behavior here. The commits which made these changes were 8e011e5 or 72e67e8. Before that, we see the stage1 -> stage2 behavior, and afterwards, we see stage0 -> stage1.

That is intentional. By default x build uses stage 1. Previously, building rustc tools would trigger a stage N+1 compiler and tools build but that behavior was fixed in #137215. I guess your case requires a stage 2 miri? If so, since x build defaults to stage 1, you now need to explicitly pass --stage 2 to the build command. Previously, stage 1 was being incremented automatically for miri (and other tools like rustfmt, clippy, cargo, etc.), which made it work "magically".

@adetaylor
Copy link
Contributor Author

OK, so should we make a change to Rust's github CI to do that? It presumably builds miri without such arguments? (Example failing job).

@onur-ozkan
Copy link
Member

That job failed on x check, which uses stage 0 (as it says Checking stage0 miri artifacts (x86_64-unknown-linux-gnu -> i686-pc-windows-gnu) in the build log).

Do you have any opinion on this? @RalfJung

IIRC we did not allow using stage 0 on Miri somewhere in bootstrap, I guess it was in the tests module.

@RalfJung
Copy link
Member

RalfJung commented Mar 5, 2025

I don't know what changed about staging or why -- it seems that the staging scheme I originally set up worked fine, and then something recently regressed?

Miri should work in stage 0 (or whatever you now call the first stage), that's the default development mode. It would be quite bad if working on Miri required building rustc twice.

@RalfJung
Copy link
Member

RalfJung commented Mar 5, 2025

Previously, stage 1 was being incremented automatically for miri (and other tools like rustfmt, clippy, cargo, etc.), which made it work "magically".

That should be largely cosmetic change though. The relevant part is: before 8e011e5 and 72e67e8, a "build rustc only once" build of Miri worked fine with this coordinated library/compiler change. After those commits, that does not work any more.

Maybe it's just a --cfg bootstrap too much or too little somewhere?

@onur-ozkan
Copy link
Member

onur-ozkan commented Mar 6, 2025

Yes, that could also be the reason as well.

#138081 seems like a good candidate to check for this. As far as I can see x check worked fine without any problem there (see the job).

I also did a manual test by rebasing over it and x build miri worked fine too.

@adetaylor
Copy link
Contributor Author

@onur-ozkan Can you suggest any change I can make to #135881 to get it to pass github CI? (Otherwise, this issue is one of only two things preventing stabilization of arbitrary self types). I don't think #138081 is a similar example, because the compiler changes do not require corresponding stdlib changes. In the case of #135881, they do.

@RalfJung
Copy link
Member

RalfJung commented Mar 7, 2025

Is there any way to stage this such that it does not have to be a lock-step change?

@adetaylor
Copy link
Contributor Author

I think so, if that's the best way forward. The libs changed should be able to land first.

@RalfJung
Copy link
Member

RalfJung commented Mar 7, 2025 via email

@onur-ozkan
Copy link
Member

onur-ozkan commented Mar 7, 2025

@onur-ozkan Can you suggest any change I can make to #135881 to get it to pass github CI? (Otherwise, this issue is one of only two things preventing stabilization of arbitrary self types). I don't think #138081 is a similar example, because the compiler changes do not require corresponding stdlib changes. In the case of #135881, they do.

Just to confirm could you share a small reproduction case for me to test? If it's a bootstrap issue, it should be easy to reproduce with minimal changes to the compiler and library. I would like to fix the problem over the weekend if it exists (although according to Ralf it may not be a bootstrap issue at all).

It's hard to review large diffs for cfg bootstrap bugs so I prefer working with smaller diffs for reproduction.

@RalfJung
Copy link
Member

RalfJung commented Mar 7, 2025

The PR I mentioned landed shortly after #137215 so should already be affected by any problems that causes.

@adetaylor
Copy link
Contributor Author

Thanks for the comments folks. I was hoping you'd immediately be able to say "aha, this is caused by XYZ, do ABC".

If not: I think the best way forward is to spend more time trying to bisect this, to find out what other recent Rust changes have altered things. As I said in my initial report there do seem to be multiple factors involved, not just this stage transition (i.e. I agree with @RalfJung ). Bisecting this proved to be pretty fiddly due to the need to apply various different versions of the patch to different revisions, etc. But I think another day or so of fiddling around might get a reasonable bisection script working.

Unfortunately I can't promise that I'll be the one to do that further investigation/bisection/minimization because I'm just changing jobs, and I don't want to make any assumptions that my new employer is supportive of this sort of work. I had really, really hoped to get #135881 landed before my personal circumstances changed, but this issue and a couple of other things conspired against it.

Feel free to close this github issue meanwhile, it may not be useful.

@onur-ozkan onur-ozkan added C-discussion Category: Discussion or questions that doesn't represent real issues. E-needs-investigation Call for partcipation: This issues needs some investigation to determine current status and removed C-bug Category: This is a bug. labels Mar 7, 2025
@adetaylor
Copy link
Contributor Author

Perhaps this might help.

@adetaylor
Copy link
Contributor Author

Thanks to @compiler-errors the right permutation of bootstraps has been achieved over on #135881 and I'm going to close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues. E-needs-investigation Call for partcipation: This issues needs some investigation to determine current status T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)
Projects
None yet
Development

No branches or pull requests

4 participants