Commit 083cbb4

Kobzol authored and tshepang committed
Add section about building an optimized version of rustc
1 parent 2ffa3f5 commit 083cbb4

2 files changed

+132
-0
lines changed

src/SUMMARY.md

+1
@@ -14,6 +14,7 @@
 - [Building Documentation](./building/compiler-documenting.md)
 - [Rustdoc overview](./rustdoc.md)
 - [Adding a new target](./building/new-target.md)
+- [Optimized build](./building/optimized-build.md)
 - [Testing the compiler](./tests/intro.md)
 - [Running tests](./tests/running.md)
 - [Testing with Docker](./tests/docker.md)

src/building/optimized-build.md

+131
@@ -0,0 +1,131 @@
# Optimized build of the compiler

<!-- toc -->

There are multiple additional build configuration options and techniques that can be used to compile a
build of `rustc` that is as optimized as possible (for example when building `rustc` for a Linux
distribution). The status of these configuration options for various Rust targets is tracked [here].
This page describes how you can use these approaches when building `rustc` yourself.

[here]: https://github.com/rust-lang/rust/issues/103595

## Link-time optimization

Link-time optimization is a powerful compiler technique that can increase program performance. To
enable (Thin-)LTO when building `rustc`, set the `rust.lto` config option to `"thin"`
in `config.toml`:

```toml
[rust]
lto = "thin"
```

> Note that LTO for `rustc` is currently supported and tested only for
> the `x86_64-unknown-linux-gnu` target. Other targets *may* work, but no guarantees are provided.
> Notably, LTO-optimized `rustc` currently produces [miscompilations] on Windows.

[miscompilations]: https://github.com/rust-lang/rust/issues/109114

Enabling LTO on Linux has [produced] speed-ups of up to 10%.

[produced]: https://github.com/rust-lang/rust/pull/101403#issuecomment-1288190019

## Memory allocator

Using a different memory allocator for `rustc` can provide significant performance benefits. If you
want to enable the `jemalloc` allocator, you can set the `rust.jemalloc` option to `true`
in `config.toml`:

```toml
[rust]
jemalloc = true
```

> Note that this option is currently only supported for Linux and macOS targets.

## Codegen units

Reducing the number of codegen units per `rustc` crate can make the resulting compiler faster,
usually at the cost of a longer build time. You can modify the number of codegen units for `rustc`
and `libstd` in `config.toml` with the following options:

```toml
[rust]
codegen-units = 1
codegen-units-std = 1
```

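The three `[rust]` options described so far can live in one configuration file. The following is a minimal sketch rather than a recommended setup: the `CONFIG_FILE` variable and the `config.optimized.toml` file name are assumptions made for this example, chosen so that an existing `config.toml` is not overwritten. The platform caveats noted above for LTO and `jemalloc` still apply.

```bash
#!/usr/bin/env bash
# CONFIG_FILE is a hypothetical variable introduced for this sketch. It defaults
# to a scratch file so that an existing config.toml is left untouched.
CONFIG_FILE="${CONFIG_FILE:-config.optimized.toml}"

# Write the options from the LTO, allocator and codegen-units sections
# into a single [rust] table.
cat > "$CONFIG_FILE" <<'EOF'
[rust]
lto = "thin"
jemalloc = true
codegen-units = 1
codegen-units-std = 1
EOF

echo "wrote $CONFIG_FILE"
```

You can then merge the generated table into your own `config.toml`, or, if the `x.py` in your checkout supports a `--config` flag, point it at the generated file.
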
## Instruction set

By default, `rustc` is compiled for a generic (and conservative) instruction set architecture
(depending on the selected target), to make it support as many CPUs as possible. If you want to
compile `rustc` for a specific instruction set architecture, you can set the `target_cpu` compiler
option in `RUSTFLAGS`:

```bash
$ RUSTFLAGS="-C target_cpu=x86-64-v3" x.py build ...
```

If you also want to compile LLVM for a specific instruction set, you can set `llvm` flags
in `config.toml`:

```toml
[llvm]
cxxflags = "-march=x86-64-v3"
cflags = "-march=x86-64-v3"
```

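A compiler built for `x86-64-v3` may fail with illegal-instruction errors on CPUs that do not implement that feature level, so it can be worth checking the machines that will run it. The sketch below is one rough way to do that on Linux. `missing_v3_flags` is a hypothetical helper, and the flag list is an assumption based on how the Linux kernel names the x86-64-v3 additions in `/proc/cpuinfo`; it does not check the older baseline features and should not be treated as an authoritative test.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the x86-64-v3 additions that are absent from a
# space-separated CPU flag list (as found on the "flags" line of /proc/cpuinfo).
# Assumption: "abm" stands in for LZCNT and "xsave" for OSXSAVE.
missing_v3_flags() {
    local flags=" $1 " missing="" flag
    for flag in avx avx2 bmi1 bmi2 f16c fma abm movbe xsave; do
        case "$flags" in
            *" $flag "*) ;;                  # flag is present
            *) missing="$missing $flag" ;;   # flag is absent
        esac
    done
    echo "${missing# }"
}

# On Linux, inspect the flags of the first CPU listed in /proc/cpuinfo.
if [ -r /proc/cpuinfo ]; then
    cpu_flags="$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
    missing="$(missing_v3_flags "$cpu_flags")"
    if [ -z "$missing" ]; then
        echo "this CPU appears to support x86-64-v3"
    else
        echo "possibly missing for x86-64-v3: $missing"
    fi
fi
```

If any flag is reported as missing, a lower level or the default generic target is the safer choice for that machine.
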
## Profile-guided optimization

Applying profile-guided optimizations (or more generally, feedback-directed optimizations) can
substantially improve `rustc` performance, by up to 25%. However, these techniques cannot simply be
enabled by a configuration option; they require a complex build workflow that compiles `rustc`
multiple times and profiles it on selected benchmarks.

There is a tool called `opt-dist` that is used to optimize `rustc` with [PGO] (profile-guided
optimizations) and [BOLT] (a post-link binary optimizer) for builds distributed to end users. You
can examine the tool, which is located in `src/tools/opt-dist`, and build a custom PGO build
workflow based on it, or try to use it directly. Note that the tool is currently tailored closely to
the way we use it in Rust's continuous integration workflows, and it might require some custom
changes to make it work in a different environment.

[PGO]: https://doc.rust-lang.org/rustc/profile-guided-optimization.html

[BOLT]: https://github.com/llvm/llvm-project/blob/main/bolt/README.md

To use the tool, you will need to provide some external dependencies:

- A Python 3 interpreter (for executing `x.py`).
- A compiled LLVM toolchain, with the `llvm-profdata` binary. Optionally, if you want to use BOLT,
  the `llvm-bolt` and `merge-fdata` binaries have to be available in the toolchain.
- A downloaded checkout of the [Rust benchmark suite].

These dependencies are provided to `opt-dist` by an implementation of the [`Environment`] trait. You
can either implement the trait for your custom environment, by providing paths to these dependencies
in its methods, or reuse one of the existing implementations (currently, there are implementations
for Linux and Windows). If you want your environment to support BOLT, return `true` from
the `supports_bolt` method.

Here is an example of how `opt-dist` can be used with the default Linux environment (it assumes that
you execute the following commands on a Linux system):

1. Build the tool with the following command:
    ```bash
    $ python3 x.py build tools/opt-dist
    ```
2. Run the tool with the `PGO_HOST` environment variable set to the 64-bit Linux target:
    ```bash
    $ PGO_HOST=x86_64-unknown-linux-gnu ./build/host/stage0-tools-bin/opt-dist
    ```
    Note that the default Linux environment expects several hardcoded paths to exist:
    - `/checkout` should contain a checkout of the Rust compiler repository that will be compiled.
    - `/rustroot` should contain the compiled LLVM toolchain (containing BOLT).
    - A Python 3 interpreter should be available under the `python3` binary.
    - `/tmp/rustc-perf` should contain a downloaded checkout of the Rust benchmark suite.

You can modify `LinuxEnvironment` (or implement your own) to override these paths.

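Before running `opt-dist` with the default Linux environment, the hardcoded locations listed above can be verified with a small script. This is only a sketch: `check_opt_dist_dirs` and its optional prefix argument are hypothetical names introduced here and are not part of `opt-dist`.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of opt-dist itself.
# The directory names come from the list above. The optional first argument is
# an assumed prefix, useful for trying the check against a scratch directory.
check_opt_dist_dirs() {
    local prefix="${1:-}" status=0 dir
    for dir in /checkout /rustroot /tmp/rustc-perf; do
        if [ ! -d "${prefix}${dir}" ]; then
            echo "missing directory: ${prefix}${dir}" >&2
            status=1
        fi
    done
    return "$status"
}

# Check the directories and the python3 binary, and report the overall result.
if check_opt_dist_dirs && command -v python3 >/dev/null 2>&1; then
    echo "default Linux environment looks complete"
else
    echo "default Linux environment is incomplete" >&2
fi
```

The check only confirms that the locations exist; it does not validate their contents, for example whether `/rustroot` really contains `llvm-profdata`.
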
[`Environment`]: https://github.com/rust-lang/rust/blob/65e468f9c259749c210b1ae8972bfe14781f72f1/src/tools/opt-dist/src/environment/mod.rs#L8-L7

[Rust benchmark suite]: https://github.com/rust-lang/rustc-perf
