Expose more of the matching interface in Ruby #119
It's also worth noting that Google's own documentation for RE2 only focusses on |
mudge added a commit that referenced this issue on Nov 30, 2023:

GitHub: #119

Add new options to `RE2::Regexp#match` that expose the underlying capabilities of RE2's Match function:

* anchor: specifying whether a match should be unanchored (the default), anchored to the start of the text or anchored to both ends
* startpos: the offset at which to start matching (defaults to the start of the text)
* submatches: the number of submatches to extract (defaults to the number of capturing groups in the pattern)

We keep compatibility with the previous API by still accepting a number of submatches as the second argument to match.

With these new options in place, we can now offer a higher-level `RE2::Regexp#full_match` and `RE2::Regexp#partial_match` API to match RE2's own. Note we don't actually use the underlying `FullMatchN` or `PartialMatchN` functions as we need to use `Match`'s behaviour of returning the overall match first before any extracted submatches.

The plan is to then heavily promote these two methods over the lower-level `match`.
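
To make the semantics of these options concrete, here is a rough emulation using Ruby's built-in `Regexp` rather than RE2 itself. Everything in it is an assumption made for illustration: `sketch_match`, `sketch_full_match` and `sketch_partial_match` are hypothetical helpers (not the gem's methods), the anchor symbol names are placeholders, and offsets are character offsets into an ASCII string. It only aims to show how anchoring, a start offset and a submatch count could combine, with the overall match returned first.

```ruby
# Illustrative emulation only: built on Ruby's Regexp, not on RE2.
# `sketch_match` is a hypothetical helper, not part of the re2 gem.
def sketch_match(pattern, text, anchor: :unanchored, startpos: 0, submatches: nil)
  source = pattern.source
  regexp =
    case anchor
    when :unanchored   then Regexp.new(source, pattern.options)
    when :anchor_start then Regexp.new("\\A(?:#{source})", pattern.options)
    when :anchor_both  then Regexp.new("\\A(?:#{source})\\z", pattern.options)
    else raise ArgumentError, "unknown anchor: #{anchor.inspect}"
    end

  # Only consider the text from startpos onwards.
  match = regexp.match(text[startpos..] || "")
  return nil unless match

  # Default to every capturing group in the pattern.
  submatches ||= match.size - 1

  # The overall match comes first, followed by the requested submatches.
  match.to_a.first(submatches + 1)
end

# Higher-level helpers expressed in terms of the lower-level one.
def sketch_full_match(pattern, text, **options)
  sketch_match(pattern, text, **options, anchor: :anchor_both)
end

def sketch_partial_match(pattern, text, **options)
  sketch_match(pattern, text, **options, anchor: :unanchored)
end

pattern = /(\w+):(\d+)/
p sketch_match(pattern, "ruby:1234")                    # ["ruby:1234", "ruby", "1234"]
p sketch_match(pattern, "ruby:1234", submatches: 1)     # ["ruby:1234", "ruby"]
p sketch_match(pattern, "see ruby:1234", anchor: :anchor_start)              # nil
p sketch_match(pattern, "see ruby:1234", anchor: :anchor_start, startpos: 4) # ["ruby:1234", "ruby", "1234"]
p sketch_full_match(pattern, "ruby:1234 ")              # nil
p sketch_partial_match(pattern, "x ruby:1234 y")        # ["ruby:1234", "ruby", "1234"]
```

This emulation cannot show the performance benefit of requesting fewer submatches, which is specific to RE2; it only mirrors the shape of the results.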
mudge added a commit that referenced this issue on Dec 1, 2023:

GitHub: #119

Expose RE2::Match()'s endpos argument in Ruby so users can specify an offset at which to stop matching.

Note that old versions of RE2 don't accept an endpos argument when matching so we explicitly detect this and raise an exception when attempting to pass it to a version that doesn't support it.
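
The "detect and raise" behaviour described above can be sketched as follows. This is again an emulation on top of Ruby's built-in `Regexp`, not the gem's implementation: `sketch_bounded_match`, the `endpos_supported:` keyword and `EndposUnsupportedError` are all hypothetical names standing in for whatever check and exception class the gem actually uses, and offsets are treated as character offsets into an ASCII string.

```ruby
# Hypothetical error class for this sketch; the gem's real exception may differ.
class EndposUnsupportedError < StandardError; end

# Illustrative emulation only. `endpos_supported` stands in for detecting
# whether the linked RE2 version accepts an endpos argument.
def sketch_bounded_match(pattern, text, startpos: 0, endpos: nil, endpos_supported: true)
  if endpos && !endpos_supported
    raise EndposUnsupportedError, "endpos is not supported by this version"
  end

  # Without an explicit endpos, match up to the end of the text.
  endpos ||= text.length

  # Only consider the text between startpos (inclusive) and endpos (exclusive).
  match = pattern.match(text[startpos...endpos] || "")
  match && match.to_a
end

p sketch_bounded_match(/\d+/, "abc12345")                         # ["12345"]
p sketch_bounded_match(/\d+/, "abc12345", endpos: 5)              # ["12"]
p sketch_bounded_match(/\d+/, "abc12345", startpos: 3, endpos: 6) # ["123"]
```

Raising eagerly when `endpos` is passed to an unsupporting version, rather than silently ignoring it, avoids returning a match that extends past the offset the caller asked for.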
Version 2.5.0 now exposes the full underlying |
When I first wrote this gem about a decade(!) ago, I naïvely intended it to be a drop-in replacement for Ruby’s `Regexp` standard library. However, RE2 not only doesn’t have the same syntax as Ruby’s regular expressions but it has its own unique capabilities that we’re not taking advantage of by hiding it behind a restrictive Ruby API.

Already, `RE2::Regexp#match` has some poorly documented functionality that is unique to RE2: the ability to specify the exact number of submatches when performing a match, which has a significant effect on performance. This should not only be better explained but be a core part of the API along with the other arguments to `Match`: `startpos`, `endpos` (not available on all versions of RE2) and `anchor`.

This would also create a natural opportunity to introduce the higher-level `FullMatch` and `PartialMatch` APIs.