new lint: `char_indices_as_byte_indices` #13435

y21 · 2024-09-21T20:42:33Z

This adds a new lint that checks for uses of the `.chars().enumerate()` position in a context where a byte index is required, and suggests using `.char_indices()` instead.
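For illustration only (this is not code from the PR), here is a minimal sketch of the pattern described above and its fix. The function names `prefix_before_buggy` and `prefix_before` are hypothetical. `str::get` is used instead of direct slicing so that an offset that is not on a character boundary yields `None` rather than a panic.

```rust
/// Buggy pattern: `pos` counts characters, but `str::get` expects a byte offset.
/// With non-ASCII input this returns either the wrong prefix or `None`.
fn prefix_before_buggy(s: &str, needle: char) -> Option<&str> {
    for (pos, c) in s.chars().enumerate() {
        if c == needle {
            return s.get(..pos);
        }
    }
    None
}

/// Fixed pattern: `char_indices` yields the byte offset of each character,
/// which is always a valid character boundary.
fn prefix_before(s: &str, needle: char) -> Option<&str> {
    for (idx, c) in s.char_indices() {
        if c == needle {
            return s.get(..idx);
        }
    }
    None
}
```

For pure-ASCII input the two functions agree, which is why the bug tends to go unnoticed until a multi-byte character shows up.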

I'm planning to extend this lint to also detect uses of the position in iterator chains, e.g. `s.chars().enumerate().for_each(|(i, _)| s.split_at(i));`, but that's for another time.
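As a rough sketch of what the corrected iterator-chain form could look like (the helper name `splits_at_each_char` is hypothetical and not part of the PR), the same `char_indices` substitution applies:

```rust
/// Splits `s` at the start of every character, using the byte offsets
/// produced by `char_indices`, so `split_at` never lands inside a character.
fn splits_at_each_char(s: &str) -> Vec<(&str, &str)> {
    s.char_indices().map(|(i, _)| s.split_at(i)).collect()
}
```

With `.chars().enumerate()` in place of `.char_indices()`, an input such as `"éa"` would call `split_at(1)`, which falls inside the two-byte `é` and panics.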

changelog: new lint: chars_enumerate_for_byte_indices

rustbot · 2024-09-21T20:42:38Z

r? @Centri3

rustbot has assigned @Centri3.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

y21

Some notes for reviewers

clippy_lints/src/loops/chars_enumerate_for_byte_indices.rs

y21 · 2024-09-21T20:47:27Z

tests/ui/chars_enumerate_for_byte_indices.rs

+        // can't use #[expect] here because the .fixed file will still have the attribute and create an
+        // unfulfilled expectation, but make sure lint level attributes work on the use expression:
+        #[allow(clippy::chars_enumerate_for_byte_indices)]
+        let _ = prim[..idx];


This is a fun one. I wonder if that's something that could be fixed in uitest, e.g. by removing `#[expect]` attributes in the `.fixed` file 🤔

y21 · 2024-09-21T20:49:47Z

clippy_lints/src/loops/mod.rs

+    /// ```
+    #[clippy::version = "1.83.0"]
+    pub CHARS_ENUMERATE_FOR_BYTE_INDICES,
+    correctness,


The pattern is technically fine if you know what your strings are (like the description mentions) so it's not always 'outright wrong' like the usual correctness lints, but the fix is also really simple and always applicable so 🤷‍♂️

Centri3 · 2024-12-02T03:12:35Z

As said on Zulip, we could say that `.bytes()` should always be used as a replacement if the code expects ASCII. I don't think there are any downsides to that. I think blocking compilation on those cases is better than not doing so for where it could actually be a problem.

Boshen · 2024-09-26T08:45:55Z

Thank you for working on this.

The linter we are working on (oxlint) has encountered dozens of crashes because of this, and there was no way to forbid such usages.

Centri3

This all looks fine to me. I'll open the FCP in a moment

bors · 2024-12-01T12:13:33Z

☔ The latest upstream changes (presumably 1f966e9) made this pull request unmergeable. Please resolve the merge conflicts.

Centri3

From the FCP: We could mention `.bytes()` in the description, but no issues

y21 · 2024-12-03T03:52:59Z

Added a note for `str::bytes` and also renamed the lint to be slightly more general, in case we have other sources of char indices in the future, as Jarcho suggested on Zulip. I left them as separate commits for now so it's easier to review, but will squash once you think it's OK.
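To illustrate the `str::bytes` alternative mentioned above, here is a small sketch (the function name `split_at_first_colon` is hypothetical, and this is not the wording of the lint's note). When the code only looks for an ASCII byte, `.bytes().enumerate()` already yields byte offsets:

```rust
/// Splits `s` around the first `:`. The position from `.bytes().enumerate()`
/// is a byte offset, and an ASCII byte never occurs inside a multi-byte
/// UTF-8 sequence, so both slice bounds fall on character boundaries.
fn split_at_first_colon(s: &str) -> Option<(&str, &str)> {
    for (i, b) in s.bytes().enumerate() {
        if b == b':' {
            return Some((&s[..i], &s[i + 1..]));
        }
    }
    None
}
```

This stays correct even when the surrounding text is not ASCII, as long as the byte being searched for is.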

Centri3

Looks good to me :)

Centri3 · 2024-12-05T10:16:06Z

clippy_lints/src/loops/char_indices_as_byte_indices.rs

+            "passing a character position to a method that expects a byte index"
+        },
+        ExprKind::Index(target, ..)
+            if is_string_like(cx.typeck_results().expr_ty_adjusted(target).peel_refs())


nit: I think the reason this isn't used in both arms is a bit non-obvious at first, especially if the comment on BYTE_INDEX_METHODS isn't seen. It could perhaps just point to what's already written there. That's about all.

clippy_lints/src/loops/char_indices_as_byte_indices.rs

Centri3 · 2025-02-14T14:03:24Z

Hey @y21, can you take a look at this again? Is there anything from my review that doesn't make sense/you're stuck on?

y21 · 2025-02-14T14:14:34Z

Oops, I got busy and forgot about this. I'll look at this again over the weekend!

rustbot assigned Centri3 Sep 21, 2024

rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Sep 21, 2024

y21 commented Sep 21, 2024

Boshen mentioned this pull request Sep 26, 2024

Ban index methods on std::str::Chars oxc-project/oxc#6071

Closed

y21 force-pushed the chars_enumerate_for_byte_index branch from 97c4bfb to bf38bb1 Compare October 13, 2024 23:27

Centri3 approved these changes Nov 7, 2024

Centri3 approved these changes Dec 2, 2024

y21 force-pushed the chars_enumerate_for_byte_index branch from bf38bb1 to 4ac1ee8 Compare December 3, 2024 03:50

Centri3 approved these changes Dec 5, 2024

Centri3 mentioned this pull request Dec 16, 2024

literal_string_with_formatting_args ices on {…} #13838

Closed

y21 added 3 commits March 30, 2025 13:39

new lint: chars_enumerate_for_byte_indices

fb32aaf

add a note for str::bytes

32cf884

rename lint to char_indices_as_byte_indices

d5d2189

y21 force-pushed the chars_enumerate_for_byte_index branch from 4ac1ee8 to 56124b8 Compare March 30, 2025 11:54

add comment and .clone() test case

c739027

y21 force-pushed the chars_enumerate_for_byte_index branch from 56124b8 to c739027 Compare March 30, 2025 12:05

y21 changed the title ~~new lint: chars_enumerate_for_byte_indices~~ new lint: char_indices_as_byte_indices Mar 30, 2025

Centri3 added this pull request to the merge queue Mar 30, 2025

Merged via the queue into rust-lang:master with commit 4f58673 Mar 30, 2025
13 checks passed
