IsBlankString matcher is inconsistent with String.isBlank() #325

Open
@Thorn1089

The name of the matcher implies that it gives results consistent with String.isBlank(), but it does not. The matcher uses a regex with the pattern \\s*, which by default matches only ASCII whitespace characters. String.isBlank() eventually delegates to Character.isWhitespace(), which also accepts Unicode whitespace. See https://stackoverflow.com/a/57998178 for a detailed breakdown, summarized below:

But further, there are different definitions of white-space

Character.isWhitespace:

Determines if the specified character (Unicode code point) is white space according to Java. A character is a Java whitespace character if and only if it satisfies one of the following criteria:

It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F').
It is '\t', U+0009 HORIZONTAL TABULATION.
It is '\n', U+000A LINE FEED.
It is '\u000B', U+000B VERTICAL TABULATION.
It is '\f', U+000C FORM FEED.
It is '\r', U+000D CARRIAGE RETURN.
It is '\u001C', U+001C FILE SEPARATOR.
It is '\u001D', U+001D GROUP SEPARATOR.
It is '\u001E', U+001E RECORD SEPARATOR.
It is '\u001F', U+001F UNIT SEPARATOR.

the \s pattern (by default):

\s A whitespace character: [ \t\n\x0B\f\r]
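
To make the two definitions concrete, here is a small sketch that compares them code point by code point. The class and method names (WhitespaceDefinitions, javaWhitespace, regexWhitespace, unicodeRegexWhitespace) are made up for illustration. The third method uses Pattern.UNICODE_CHARACTER_CLASS, which widens \s but, as far as I can tell, still does not line up exactly with Character.isWhitespace (it accepts the non-breaking space and rejects the U+001C..U+001F separators).

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class WhitespaceDefinitions {
    // Default \s: [ \t\n\x0B\f\r]
    private static final Pattern DEFAULT_WS = Pattern.compile("\\s");
    // \s with the Unicode White_Space property enabled
    private static final Pattern UNICODE_WS =
            Pattern.compile("\\s", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean javaWhitespace(int codePoint) {
        return Character.isWhitespace(codePoint);
    }

    static boolean regexWhitespace(int codePoint) {
        return DEFAULT_WS.matcher(new String(Character.toChars(codePoint))).matches();
    }

    static boolean unicodeRegexWhitespace(int codePoint) {
        return UNICODE_WS.matcher(new String(Character.toChars(codePoint))).matches();
    }

    public static void main(String[] args) {
        // SPACE, EM SPACE, NO-BREAK SPACE, FILE SEPARATOR
        int[] samples = {0x0020, 0x2003, 0x00A0, 0x001C};
        for (int cp : samples) {
            System.out.printf("U+%04X isWhitespace=%b \\s=%b \\s(unicode)=%b%n",
                    cp, javaWhitespace(cp), regexWhitespace(cp), unicodeRegexWhitespace(cp));
        }
    }
}
```

U+2003 (EM SPACE) is the kind of character that triggers the mismatch: Character.isWhitespace accepts it, the default \s does not.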

I discovered this when using JUnit Quickcheck with an assumption of the following form, intended to filter out whitespace-only strings:
Assume.assumeThat(string, not(blankString()));
The assumption passed for a generated string, but my code then checked the same string via String.isBlank(), which treated it as blank, and the test failed.
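
The mismatch can be reproduced without Hamcrest or Quickcheck. The sketch below assumes the matcher behaves like the \\s* regex described above. BlankCheckDemo, regexBlank, and javaBlank are hypothetical names, and javaBlank is only one possible way to get a check that agrees with String.isBlank(), not the library's implementation. It needs Java 11 or later for String.isBlank().

```java
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
final class BlankCheckDemo {
    // Stand-in for the matcher's check, assuming the \s* regex from this issue.
    private static final Pattern REGEX_BLANK = Pattern.compile("\\s*");

    static boolean regexBlank(String s) {
        return REGEX_BLANK.matcher(s).matches();
    }

    // One possible alternative: blank if every code point is Java whitespace.
    static boolean javaBlank(String s) {
        return s.codePoints().allMatch(Character::isWhitespace);
    }

    public static void main(String[] args) {
        String emSpace = "\u2003"; // EM SPACE, a Unicode SPACE_SEPARATOR

        System.out.println(regexBlank(emSpace)); // false: not(blankString()) would pass
        System.out.println(emSpace.isBlank());   // true: the code under test sees a blank string
        System.out.println(javaBlank(emSpace));  // true: agrees with String.isBlank()
    }
}
```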
