
Commit 5a4e7ba

Merge pull request #286 from ehuss/c-string-literal
Add C-string literals.
2 parents 34fca48 + d5d8010 commit 5a4e7ba

2 files changed, +73 -0 lines changed

src/SUMMARY.md

+1
@@ -33,3 +33,4 @@
 - [Reserving syntax](rust-2021/reserving-syntax.md)
 - [Warnings promoted to errors](rust-2021/warnings-promoted-to-error.md)
 - [Or patterns in macro-rules](rust-2021/or-patterns-macro-rules.md)
+- [C-string literals](rust-2021/c-string-literals.md)

src/rust-2021/c-string-literals.md

+72
@@ -0,0 +1,72 @@
# C-string literals

## Summary

- Literals of the form `c"foo"` or `cr"foo"` represent a string of type [`&core::ffi::CStr`][CStr].

[CStr]: ../../core/ffi/struct.CStr.html

## Details

Starting with Rust 1.77, C-strings can be written using C-string literal syntax with the `c` or `cr` prefix.

Previously, it was challenging to produce a valid string literal for C APIs, which expect strings terminated with a NUL byte.
The [`cstr`] crate was a popular solution, but it required compiling a proc-macro, which was quite expensive.
Now, C-strings can be written directly using literal syntax, which produces a value of type [`&core::ffi::CStr`][CStr] that is automatically terminated with a NUL byte.

```rust,edition2021
# use core::ffi::CStr;

assert_eq!(c"hello", CStr::from_bytes_with_nul(b"hello\0").unwrap());
assert_eq!(
    c"byte escapes \xff work",
    CStr::from_bytes_with_nul(b"byte escapes \xff work\0").unwrap()
);
assert_eq!(
    c"unicode escapes \u{00E6} work",
    CStr::from_bytes_with_nul(b"unicode escapes \xc3\xa6 work\0").unwrap()
);
assert_eq!(
    c"unicode characters αβγ encoded as UTF-8",
    CStr::from_bytes_with_nul(
        b"unicode characters \xce\xb1\xce\xb2\xce\xb3 encoded as UTF-8\0"
    )
    .unwrap()
);
assert_eq!(
    c"strings can continue \
    on multiple lines",
    CStr::from_bytes_with_nul(b"strings can continue on multiple lines\0").unwrap()
);
```

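As a rough illustration of the C interoperability mentioned above, the sketch below passes a C-string literal to the C library's `strlen`. It assumes Rust 1.77 or later, the 2021 edition, and a target where the platform C library is linked (as it is for typical `std` targets). The `extern` declaration is written by hand for the example, and `c_strlen` is a hypothetical helper, not part of any library.

```rust,edition2021
use core::ffi::{c_char, CStr};

extern "C" {
    // Hand-written declaration for illustration; assumes the platform
    // C library providing `strlen` is linked.
    fn strlen(s: *const c_char) -> usize;
}

/// Hypothetical helper: asks C for the length of a NUL-terminated string.
fn c_strlen(s: &CStr) -> usize {
    // SAFETY: a `CStr` always points to a valid, NUL-terminated buffer.
    unsafe { strlen(s.as_ptr()) }
}

fn main() {
    // The literal already carries the trailing NUL that `strlen` relies on.
    assert_eq!(c_strlen(c"hello"), 5);
    // The terminator is visible when the bytes are inspected directly.
    assert_eq!(c"hello".to_bytes_with_nul(), b"hello\0");
}
```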
C-strings do not allow interior NUL bytes (such as with a `\0` escape).

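The restriction on interior NUL bytes mirrors what the runtime `CStr` constructors already enforce. The sketch below shows this with byte strings; `is_valid_c_string` is a hypothetical helper introduced only for this example. With a literal, the same mistake is rejected at compile time rather than at runtime.

```rust,edition2021
use core::ffi::CStr;

/// Hypothetical helper: true if `bytes` is exactly one NUL-terminated
/// string with no interior NUL bytes.
fn is_valid_c_string(bytes: &[u8]) -> bool {
    CStr::from_bytes_with_nul(bytes).is_ok()
}

fn main() {
    assert!(is_valid_c_string(b"hello\0"));
    // An interior NUL is rejected...
    assert!(!is_valid_c_string(b"hel\0lo\0"));
    // ...and so is a missing terminator.
    assert!(!is_valid_c_string(b"hello"));

    // The literal form `c"hel\0lo"` does not compile at all.

    // For a buffer that may hold data past the first NUL,
    // `from_bytes_until_nul` stops at the first one instead.
    let truncated = CStr::from_bytes_until_nul(b"hel\0lo\0").unwrap();
    assert_eq!(truncated.to_bytes(), b"hel");
}
```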
Similar to regular strings, C-strings also support "raw" syntax with the `cr` prefix.
These raw C-strings do not process backslash escapes, which can make it easier to write strings that contain backslashes.
Double quotes can be included by surrounding the string's quotes with `#` characters.
Multiple `#` characters can be used to avoid ambiguity with internal `"#` sequences.

```rust,edition2021
assert_eq!(cr"foo", c"foo");
// Number signs can be used to embed interior double quotes.
assert_eq!(cr#""foo""#, c"\"foo\"");
// This requires two #.
assert_eq!(cr##""foo"#"##, c"\"foo\"#");
// Escapes are not processed.
assert_eq!(cr"C:\foo", c"C:\\foo");
```

See [The Reference] for more details.

[`cstr`]: https://crates.io/crates/cstr
[The Reference]: ../../reference/tokens.html#c-string-and-raw-c-string-literals

## Migration

Migration is only necessary for macros that may have been relying on a sequence of tokens that looks like `c"…"` or `cr"…"`. Before the 2021 edition, such a sequence tokenized as two separate tokens; in 2021 it is a single token.

As part of the [syntax reservation] for the 2021 edition, any macro input which may run into this issue should trigger a warning from the `rust_2021_prefixes_incompatible_syntax` migration lint.
See that chapter for more detail.

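The tokenization difference can be observed with a small token-counting macro. This is only a sketch, assuming Rust 1.77 or later and the 2021 edition; `count_tokens` is a hypothetical macro written for this example.

```rust,edition2021
/// Hypothetical macro: counts the token trees it receives.
macro_rules! count_tokens {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tokens!($($tail)*) };
}

fn main() {
    // With a space, the input is an identifier followed by a string literal.
    assert_eq!(count_tokens!(c "foo"), 2);
    // Without the space, the 2021 edition sees a single C-string literal.
    // Before the 2021 edition this input was two tokens, so a macro relying
    // on that split would behave differently.
    assert_eq!(count_tokens!(c"foo"), 1);
}
```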
[syntax reservation]: reserving-syntax.md
