From b615c2b9b65676f5a4c92f2500deef3ee9f8f808 Mon Sep 17 00:00:00 2001
From: Lukas Kalbertodt
Date: Fri, 4 Jun 2021 20:13:08 +0200
Subject: [PATCH 1/2] Clarify "string continue" for (byte) string literals

The previous version just said "whitespace at the beginning of the next
line is ignored", but that is not quite correct. Currently, exactly four
characters are ignored in that position. This is different from the
definition of `char::is_whitespace` and `char::is_ascii_whitespace`.
Additionally "at the beginning of the next line" is confusing as
additional \n are also ignored.

https://github.com/rust-lang/rust/blob/595088d602049d821bf9a217f2d79aea40715208/compiler/rustc_lexer/src/unescape.rs#L281-L287
https://github.com/rust-lang/rust/blob/595088d602049d821bf9a217f2d79aea40715208/compiler/rustc_lexer/src/unescape.rs#L300-L307
---
 src/tokens.md | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/tokens.md b/src/tokens.md
index 15d8468a0..8f2063656 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -153,16 +153,21 @@ which must be _escaped_ by a preceding `U+005C` character (`\`).
 Line-breaks are allowed in string literals. A line-break is either a newline
 (`U+000A`) or a pair of carriage return and newline (`U+000D`, `U+000A`). Both
 byte sequences are normally translated to `U+000A`, but as a special exception,
-when an unescaped `U+005C` character (`\`) occurs immediately before the
-line-break, then the `U+005C` character, the line-break, and all whitespace at the
-beginning of the next line are ignored. Thus `a` and `b` are equal:
+when an unescaped `U+005C` character (`\`) occurs immediately before a line
+break, then the line break character(s), and all immediately following
+` ` (`U+0020`), `\t` (`U+0009`), `\n` (`U+000A`) and `\r` (`U+000D`) characters
+are ignored. Thus `a`, `b` and `c` are equal:
 
 ```rust
 let a = "foobar";
 let b = "foo\
          bar";
+let c = "foo\
 
-assert_eq!(a,b);
+     bar";
+
+assert_eq!(a, b);
+assert_eq!(b, c);
 ```
 
 #### Character escapes

From efc277f7de99d232b33d279605a10eeecf928683 Mon Sep 17 00:00:00 2001
From: Lukas Kalbertodt
Date: Mon, 13 Jun 2022 08:14:31 +0200
Subject: [PATCH 2/2] Add note to line continuation section about confusing behavior

---
 src/tokens.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/tokens.md b/src/tokens.md
index 8f2063656..5e7a59157 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -170,6 +170,11 @@ assert_eq!(a, b);
 assert_eq!(b, c);
 ```
 
+> Note: Rust skipping additional newlines (like in example `c`) is potentially confusing and
+> unexpected. This behavior may be adjusted in the future. Until a decision is made, it is
+> recommended to avoid relying on this, i.e. skipping multiple newlines with line continuations.
+> See [this issue](https://github.com/rust-lang/reference/pull/1042) for more information.
+
 #### Character escapes
 
 Some additional _escapes_ are available in either character or non-raw string
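
Reviewer note: the rule described in the first patch can be sketched as a small standalone program. This is only an illustration of the wording above, not the code from `rustc_lexer`. The helper names `is_skipped_after_continuation` and `apply_line_continuations` are hypothetical, and the sketch assumes that CRLF line breaks have already been normalized to `U+000A` and that every other backslash escape is passed through untouched.

```rust
/// Hypothetical helper: the four characters the patch text lists as skipped
/// after a line continuation (space, tab, newline, carriage return).
fn is_skipped_after_continuation(c: char) -> bool {
    matches!(c, ' ' | '\t' | '\n' | '\r')
}

/// Hypothetical helper: removes `\` + line break + following skipped
/// characters from the body of a string literal. Assumes line breaks are
/// already normalized to `\n`; other escapes are copied through unchanged.
fn apply_line_continuations(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.peek().copied() {
            Some('\n') => {
                // Line continuation: drop the line break and every
                // immediately following skipped character.
                while chars.next_if(|&n| is_skipped_after_continuation(n)).is_some() {}
            }
            _ => {
                // Any other escape: keep the backslash and the escaped
                // character together, so `\\` before a line break is not
                // mistaken for a continuation.
                out.push(c);
                if let Some(escaped) = chars.next() {
                    out.push(escaped);
                }
            }
        }
    }
    out
}

fn main() {
    // A real literal with a single line continuation, as in example `b`.
    let b = "foo\
             bar";
    assert_eq!(b, "foobar");

    // The sketch agrees with examples `b` and `c` from the patch.
    assert_eq!(apply_line_continuations("foo\\\n         bar"), "foobar");
    assert_eq!(apply_line_continuations("foo\\\n\n     bar"), "foobar");

    // The skipped set is narrower than both standard whitespace predicates.
    assert!('\u{00A0}'.is_whitespace() && !is_skipped_after_continuation('\u{00A0}'));
    assert!('\u{000C}'.is_ascii_whitespace() && !is_skipped_after_continuation('\u{000C}'));
}
```

The last two assertions show the point made in the commit message: a no-break space (`U+00A0`) satisfies `char::is_whitespace` and a form feed (`U+000C`) satisfies `char::is_ascii_whitespace`, yet neither is in the set of four skipped characters.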