enh(php): Tidy up regex rules #25

Merged: 2 commits, Apr 5, 2024

@davidhcefx (Collaborator) commented Mar 7, 2024

Why

I once found rules like

[^a-z0-9_-]{1}(true|false)[^a-z0-9_-]{1}

Notice that (true|false) is enclosed by a pair of [^a-z0-9_-]{1}, whose purpose is to detect word boundaries: true should not be part of another word. However, detecting boundaries this way is clumsy and makes the rules more complicated than they need to be. The rules should use \< and \> instead.
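To make the difference concrete, here is a minimal sketch in Python. It is only an analogy: nano uses POSIX regexes with the GNU \< and \> anchors, while Python's `re` has `\b` as the closest equivalent, so `\b` stands in for both here. One further assumption to keep in mind: `\b` treats `-` as a non-word character, whereas the old character class treated it as part of a word.

```python
import re

# Old style: a real character must sit on each side of the keyword.
CONSUMING = re.compile(r"[^a-z0-9_-](true|false)[^a-z0-9_-]")
# New style (Python analogue of \< and \>): zero-width word boundaries.
BOUNDARY = re.compile(r"\b(true|false)\b")


def matched_text(pattern, text):
    """Return the full text of every match, i.e. what would get coloured."""
    return [m.group(0) for m in pattern.finditer(text)]


line = "true && false || untrue"
print(matched_text(CONSUMING, line))  # [' false ']
print(matched_text(BOUNDARY, line))   # ['true', 'false']
```

The old style misses `true` at the start of the line, because there is no character before it, and it colours the surrounding spaces along with `false`. The boundary version matches only the keywords and still skips `untrue`.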

How

  • Use \< and \> to match word boundaries, instead of [^a-z0-9_-]{1}
  • Fix the OR operator || being highlighted twice.
  • Fix quoted strings so that they can contain escapes.
  • Fix broken number highlighting, which prevented variable names containing digits from being highlighted correctly.
    • e.g. $num123 in the screenshot
  • Make the inline comments regex more compact.
  • Variables: Remove parts that do not seem relevant.
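The quoted-string fix from the list above can be illustrated with a small Python sketch. The two patterns are transcriptions of the old and new double-quoted string rules in this PR's diff, adapted to Python syntax: inside a POSIX bracket expression a backslash is literal, so `[^"\]` becomes `[^"\\]` here. The sample line is made up for illustration.

```python
import re

# Old rule: a quote, anything except a quote, then a quote.
OLD_STRING = re.compile(r'"[^"]*"')
# New rule: each item is either an ordinary character or a backslash escape.
NEW_STRING = re.compile(r'"(?:[^"\\]|\\.)*"')

line = r'echo "say \"hi\" now";'

old_matches = [m.group(0) for m in OLD_STRING.finditer(line)]
new_matches = [m.group(0) for m in NEW_STRING.finditer(line)]

print(old_matches)  # two fragments, split at the escaped quotes
print(new_matches)  # the whole string literal as one match
```

The old rule stops at the first escaped quote and then starts a second, bogus string further along the line. The new rule treats `\"` as part of the string and matches the literal as a whole.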

Before

After

- Use `\<` and `\>` to match word boundaries, instead of `[^a-z0-9_-]{1}`
- Fix the OR operator `||` being highlighted twice.
- Fix quoted strings so that they can contain escapes.
- Fix redundant highlighting of HEREDOC.
- Make the inline comments regex more compact.
- Variables: Remove parts that do not seem relevant
color brightblue "[a-zA-Z0-9_]+:"
# Variables
color green "\$[a-zA-Z_0-9$]*|[=!<>]"
color green "\->[a-zA-Z_0-9$]*|[=!<>]"
@davidhcefx (Collaborator, Author) commented Mar 7, 2024

  1. Notice that I removed $ from [a-zA-Z_0-9$], because I'm not sure why the author thought a variable name could contain $. To my understanding, PHP doesn't allow a variable like $nam$e.

  2. I also removed [=!<>], because I'm not sure how symbols like =, !, < and > could be part of a variable in PHP. Perhaps the author wanted to highlight things like $array[0]?
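A quick way to see what the two removals change is to run both versions in Python. The old pattern is copied from the diff line above. The new pattern is an assumption reconstructed from the two points in this comment (drop `$` from the class, drop the `[=!<>]` alternative); it may not be exactly what was committed.

```python
import re

# Old rule from the diff: "$" allowed inside the name, plus a stray
# alternative that matches the single characters =, !, < and >.
OLD_VARIABLE = re.compile(r"\$[a-zA-Z_0-9$]*|[=!<>]")
# Assumed new rule: just "$" followed by name characters.
NEW_VARIABLE = re.compile(r"\$[a-zA-Z_0-9]*")

assignment = "$a = $b;"
print(OLD_VARIABLE.findall(assignment))  # ['$a', '=', '$b']
print(NEW_VARIABLE.findall(assignment))  # ['$a', '$b']

odd_name = "$nam$e"
print(OLD_VARIABLE.findall(odd_name))    # ['$nam$e']
print(NEW_VARIABLE.findall(odd_name))    # ['$nam', '$e']
```

With the old rule, every `=` in the file is coloured as if it were a variable, and `$nam$e` is treated as one name. With the assumed new rule, only the variables are matched, and `$nam$e` is split into two.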

color red "(\"[^\"]*\")"
color red "\"([^"\]|\\.)*\""
# Single quoted string
color red "'([^'\]|\\.)*'"
@davidhcefx (Collaborator, Author) commented Mar 7, 2024

- The number constant regex also seems broken
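The actual number rule is not shown in this conversation, so the following is only a hypothetical sketch of how a number rule can interfere with variable names such as $num123. Both patterns are invented for illustration, and Python's `\b` stands in for nano's \< and \>.

```python
import re

# Hypothetical number rule with no boundaries: matches any run of digits.
UNBOUNDED_NUMBER = re.compile(r"[0-9]+")
# Hypothetical bounded rule: digits must form a whole word on their own.
BOUNDED_NUMBER = re.compile(r"\b[0-9]+\b")

line = "$num123 = 42;"
print(UNBOUNDED_NUMBER.findall(line))  # ['123', '42']
print(BOUNDED_NUMBER.findall(line))    # ['42']
```

If a rule like the unbounded one is applied after the variable rule, the digits in `$num123` would be recoloured as a number, which matches the symptom in the screenshot. The bounded version leaves the variable name alone.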
# Heredoc (typically ends with a semicolon).
color red start="<<<['\"]?[A-Z][A-Z0-9_]*['\"]?" end="^[A-Z][A-Z0-9_]*;"
color red start="<<<[\"]?[A-Z][A-Z0-9_]*[\"]?" end="^[A-Z][A-Z0-9_]*;"
@davidhcefx (Collaborator, Author) commented Mar 7, 2024

I removed single quotes from Heredoc, because <<<'EOF' will already be highlighted by line 64.

@galenguyer (Owner) commented

Love this! Thank you for all your work here!

@galenguyer galenguyer merged commit c31b782 into galenguyer:master Apr 5, 2024
@davidhcefx davidhcefx deleted the php-tidy branch April 5, 2024 15:57