[API Proposal]: New overrides for string.Replace() #110912
Comments
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
Tagging subscribers to this area: @dotnet/area-system-runtime
If the idea is to restrict certain words, wouldn't |
As I specifically addressed, it's a way that I would not recommend and would not personally use; it's just one way they could use it if that's what they wanted. There are lots of other use cases for the overloads.
I realize that you can create extension methods; the provided code contains only extension methods. Those examples show how it could be accomplished, and I've tried to thoroughly test each of them. I will be implementing
Using BenchmarkDotNet:
It does look to add a little more overhead, mostly for the regex part, which tbh was expected.
Creating a Regex on the fly in this API will always perform worse than leveraging the built-in regex source generator.
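To illustrate the difference this comment points at, here is a minimal sketch assuming .NET 7 or later. The class name RestrictedTerms and the pattern are hypothetical and chosen only for the example. Strip uses the regex source generator through the GeneratedRegex attribute, while StripOnTheFly constructs a new Regex on every call.

```csharp
using System;
using System.Text.RegularExpressions;

public static partial class RestrictedTerms
{
    // Hypothetical pattern. The source generator emits the matching code at build time.
    [GeneratedRegex("badword|otherword", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant)]
    private static partial Regex Pattern();

    // Reuses the generated, cached Regex instance.
    public static string Strip(string input) =>
        Pattern().Replace(input, string.Empty);

    // Parses and constructs the Regex on every call, which adds overhead.
    public static string StripOnTheFly(string input, string pattern) =>
        new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.CultureInvariant)
            .Replace(input, string.Empty);
}
```

Both methods should return the same string for the same pattern; the difference is where the cost of building the matcher is paid.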
Background and motivation
Currently, when we need to perform replacements in strings, we rely on .Replace(), which requires either a specific string value (string.Replace()) or a regular expression (Regex.Replace()). While this works, it feels like there would be a benefit to having additional overloads for string.Remove() to handle these scenarios, rather than just the current overloads, which require a start index and, optionally, a count. While the existing approach has its use cases, removing parts of strings, especially by pattern or substring, feels like a more common need and would provide more flexibility.
API Proposal
Below is an extension method class to serve as a working example, but of course it's up to the team's interpretation as to how best to implement it.
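The class itself is not reproduced in this text. As a rough, hypothetical sketch of what such extension methods might look like: the class name StringRemoveExtensions, the parameter names, and the longest-first ordering in the collection overload are all assumptions made for illustration, not the author's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical sketch; not the code from the original proposal.
public static class StringRemoveExtensions
{
    // Removes every occurrence of a single substring.
    public static string Remove(this string source, string value,
        StringComparison comparison = StringComparison.Ordinal)
    {
        ArgumentNullException.ThrowIfNull(source);
        // string.Replace throws on an empty search value, so treat it as a no-op.
        if (string.IsNullOrEmpty(value)) return source;
        return source.Replace(value, string.Empty, comparison);
    }

    // Removes every match of a single regular expression.
    public static string Remove(this string source, Regex pattern)
    {
        ArgumentNullException.ThrowIfNull(source);
        ArgumentNullException.ThrowIfNull(pattern);
        return pattern.Replace(source, string.Empty);
    }

    // Removes each substring in turn, longest first, so that a short term
    // does not break up a longer term that contains it.
    public static string Remove(this string source, IEnumerable<string> values,
        StringComparison comparison = StringComparison.Ordinal)
    {
        ArgumentNullException.ThrowIfNull(source);
        ArgumentNullException.ThrowIfNull(values);
        foreach (var value in values.Where(v => !string.IsNullOrEmpty(v))
                                    .OrderByDescending(v => v.Length))
        {
            source = source.Replace(value, string.Empty, comparison);
        }
        return source;
    }

    // Applies each regular expression in the order given.
    public static string Remove(this string source, IEnumerable<Regex> patterns)
    {
        ArgumentNullException.ThrowIfNull(source);
        ArgumentNullException.ThrowIfNull(patterns);
        foreach (var pattern in patterns)
        {
            source = pattern.Replace(source, string.Empty);
        }
        return source;
    }
}
```

Because the built-in string.Remove overloads take only integer arguments, calls with a string or Regex argument fall through to these extension methods. Note that sequential removal can create new matches from the text left behind, so a real implementation would need to define that behavior explicitly.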
API Usage
There are many situations where we need to sanitize a string. Instead of writing and maintaining minimal but still duplicated sanitization code, it would be far more efficient to simply "Remove" whatever values we don't want. The ability to provide either a single Regex or string, or a collection of Regex or string values to remove from the source string, would simplify this process. This feature would be particularly useful for quick and easy sanitization of usernames, passphrases, or any other user input.
For example, Ally Financial provides an API endpoint (https://secure.ally.com/assets/json/invalid-strings.json) that returns a list of words or phrases they do not allow in usernames or passphrases. Note: This list includes terms that are explicitly filtered for user-provided content and may contain offensive or vulgar language. This API is called after a user successfully signs in and is taken to their dashboard. The returned list is then used to validate or sanitize user-provided input (for very good reason). Instead of manually implementing and maintaining duplicate logic to filter out these words, with the inclusion of these additional overloads, they could simply pass in their list of restricted terms and proceed with their process. This approach also makes it easier to remove characters that could potentially expose the system to injection vulnerabilities.
For instance, if Ally didn’t want users to manually clean up their passphrases, they could sanitize a phrase like:
"ThisBadWordIsOtherBadWordMyLastBadWordPassword!"
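They could then call the proposed Remove overload with their list of restricted terms, either with the default comparison or with case-insensitive matching.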
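The original calls are not reproduced in this text. As a hypothetical sketch of what they might look like: the restricted terms, the RemoveSketch class, and the StringComparison parameter are assumptions chosen so that the sample phrase above reduces to the result described next. Terms are removed longest first so that "BadWord" does not split "OtherBadWord".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the proposed overload.
public static class RemoveSketch
{
    public static string Remove(this string source, IEnumerable<string> values,
        StringComparison comparison = StringComparison.Ordinal)
    {
        // Longest terms first, skipping empty entries that string.Replace rejects.
        foreach (var value in values.Where(v => !string.IsNullOrEmpty(v))
                                    .OrderByDescending(v => v.Length))
        {
            source = source.Replace(value, string.Empty, comparison);
        }
        return source;
    }
}

public static class Program
{
    public static void Main()
    {
        const string phrase = "ThisBadWordIsOtherBadWordMyLastBadWordPassword!";

        // Assumed restricted terms, matched case-sensitively.
        var restricted = new[] { "BadWord", "OtherBadWord", "LastBadWord" };
        Console.WriteLine(phrase.Remove(restricted)); // ThisIsMyPassword!

        // Same terms in lowercase, matched case-insensitively.
        var lowered = new[] { "badword", "otherbadword", "lastbadword" };
        Console.WriteLine(phrase.Remove(lowered, StringComparison.OrdinalIgnoreCase)); // ThisIsMyPassword!
    }
}
```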
The result would be the sanitized password: ThisIsMyPassword!, which they could store securely. Then, during login, they could call the same function to sanitize the entered password (which may contain vulgar or restricted terms), and compare it to the sanitized stored password. This would allow the user to log in with their original password, while the system ensures it matches the sanitized version stored securely, keeping the process transparent to the customer. This approach simplifies string sanitization without duplicating logic.
Note: I recognize that this example wouldn’t be a best practice, and I wouldn't recommend or use it myself. It simply serves to demonstrate one way they could utilize these new overloads.
Overall, this would be a valuable quality-of-life improvement and would better align the semantic meanings of "Replace" vs "Remove". Currently, "Replace" is used in cases where the intention is not to replace, but rather to remove content. Clarifying this distinction would make the API more intuitive and consistent with its intended behavior.
Alternative Designs
No response
Risks
Since these would be new overloads, there shouldn't be any risk, as the existing methods would still exist and be untouched.