Expose more of RE2's matching interface
GitHub: #119

Add new options to `RE2::Regexp#match` that expose the underlying capabilities of RE2's Match function:

* anchor: specifying whether a match should be unanchored (the default), anchored to the start of the text or anchored to both ends
* startpos: the offset at which to start matching (defaults to the start of the text)
* submatches: the number of submatches to extract (defaults to the number of capturing groups in the pattern)

We keep compatibility with the previous API by still accepting a number of submatches as the second argument to match.

With these new options in place, we can now offer a higher-level `RE2::Regexp#full_match` and `RE2::Regexp#partial_match` API to match RE2's own. Note we don't actually use the underlying `FullMatchN` or `PartialMatchN` functions as we need to use `Match`'s behaviour of returning the overall match first before any extracted submatches.

The plan is to then heavily promote these two methods over the lower-level `match`.
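To make the three options concrete, here is a minimal sketch of the option handling described above. It is a hypothetical stand-in (`SketchRegexp`) built on Ruby's core `Regexp`, not the re2 gem or its C++ extension. The `:anchor_start` symbol is an assumption, since the diff only shows `:unanchored` and `:anchor_both`. It returns plain arrays instead of `RE2::MatchData`, and it treats `startpos` as a character offset purely for illustration.

```ruby
# Hypothetical stand-in built on Ruby's core Regexp; this is NOT the re2 gem.
class SketchRegexp
  # :anchor_start is an assumed name; the diff only shows the other two.
  ANCHORS = {
    unanchored:   ->(src) { /#{src}/ },
    anchor_start: ->(src) { /\A(?:#{src})/ },
    anchor_both:  ->(src) { /\A(?:#{src})\z/ }
  }.freeze

  def initialize(source)
    @source = source
  end

  def match(text, options = {})
    # Backwards compatibility: a bare Integer means "number of submatches".
    options = { submatches: options } if options.is_a?(Integer)
    options = Hash(options)

    anchor     = options.fetch(:anchor, :unanchored)
    startpos   = options.fetch(:startpos, 0)
    submatches = options[:submatches] # nil means "all capturing groups"
    if submatches && submatches < 0
      raise ArgumentError, "submatches must be non-negative"
    end

    # Matching starts at startpos, so anchors apply relative to that offset.
    regexp = ANCHORS.fetch(anchor).call(@source)
    found  = regexp.match(text[startpos..-1] || "")

    # Zero submatches: only report whether the pattern matched.
    return !found.nil? if submatches == 0
    return nil unless found

    # Overall match first, then the requested number of submatches.
    captures = submatches ? found.captures.first(submatches) : found.captures
    [found[0], *captures]
  end
end

r = SketchRegexp.new('w(o)(o)')
r.match('woot')                                      # => ["woo", "o", "o"]
r.match('woot', 1)                                   # => ["woo", "o"]
r.match('woot', anchor: :anchor_both)                # => nil
r.match('a woo', anchor: :anchor_start, startpos: 2) # => ["woo", "o", "o"]
r.match('woot', submatches: 0)                       # => true
```

Putting the overall match first in the result mirrors the reason the commit message gives for building on `Match` rather than `FullMatchN` or `PartialMatchN`.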
Showing 4 changed files with 337 additions and 39 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,5 +10,6 @@ | |
require 're2.so' | ||
end | ||
|
||
require "re2/regexp" | ||
require "re2/scanner" | ||
require "re2/version" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
module RE2 | ||
class Regexp | ||
# Match the pattern against any substring of the given +text+ and return | ||
# either a boolean (if no submatches are required) or a {RE2::MatchData} | ||
# instance with the specified number of submatches (defaults to the total | ||
# number of capturing groups). | ||
# | ||
# The number of submatches has a significant impact on performance: requesting | ||
# one submatch is much faster than requesting more than one and requesting | ||
# zero submatches is faster still. | ||
# | ||
# @param [String] text the text to search | ||
# @param [Hash] options the options with which to perform the match | ||
# @option options [Integer] :submatches how many submatches to extract (0 | ||
# is fastest), defaults to the total number of capturing groups | ||
# @return [RE2::MatchData] if extracting any submatches | ||
# @return [Boolean] if not extracting any submatches | ||
# @raise [ArgumentError] if given a negative number of submatches | ||
# @raise [NoMemoryError] if there was not enough memory to allocate the | ||
# matches | ||
# @raise [TypeError] if given non-numeric submatches or non-hash options | ||
# @example | ||
# r = RE2::Regexp.new('w(o)(o)') | ||
# r.partial_match('woot') | ||
# # => #<RE2::MatchData "woo" 1:"o" 2:"o"> | ||
# r.partial_match('woot', submatches: 1) # => #<RE2::MatchData "woo" 1:"o"> | ||
# r.partial_match('woot', submatches: 0) # => true | ||
def partial_match(text, options = {}) | ||
match(text, Hash(options).merge(anchor: :unanchored)) | ||
end | ||
|
||
# Match the pattern against the given +text+ exactly and return either a | ||
# boolean (if no submatches are required) or a {RE2::MatchData} instance | ||
# with the specified number of submatches (defaults to the total number of | ||
# capturing groups). | ||
# | ||
# The number of submatches has a significant impact on performance: requesting | ||
# one submatch is much faster than requesting more than one and requesting | ||
# zero submatches is faster still. | ||
# | ||
# @param [String] text the text to search | ||
# @param [Hash] options the options with which to perform the match | ||
# @option options [Integer] :submatches how many submatches to extract (0 | ||
# is fastest), defaults to the total number of capturing groups | ||
# @return [RE2::MatchData] if extracting any submatches | ||
# @return [Boolean] if not extracting any submatches | ||
# @raise [ArgumentError] if given a negative number of submatches | ||
# @raise [NoMemoryError] if there was not enough memory to allocate the | ||
# matches | ||
# @raise [TypeError] if given non-numeric submatches or non-hash options | ||
# @example | ||
# r = RE2::Regexp.new('w(o)(o)') | ||
# r.full_match('woo') | ||
# # => #<RE2::MatchData "woo" 1:"o" 2:"o"> | ||
# r.full_match('woo', submatches: 1) # => #<RE2::MatchData "woo" 1:"o"> | ||
# r.full_match('woo', submatches: 0) # => true | ||
# r.full_match('woot') # => nil | ||
def full_match(text, options = {}) | ||
match(text, Hash(options).merge(anchor: :anchor_both)) | ||
end | ||
end | ||
end |
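Both new methods build their options with `Hash(options).merge(anchor: ...)`. The sketch below isolates that expression using only Ruby core, so it runs without the re2 gem. `with_anchor` is a hypothetical helper name introduced for illustration. It shows that `nil` is accepted, that a caller-supplied `:anchor` is overridden, and that a non-hash argument raises the `TypeError` the documentation mentions.

```ruby
# Hypothetical helper mirroring the expression used in the diff above.
def with_anchor(options, anchor)
  # Kernel#Hash returns {} for nil or [], calls to_hash otherwise,
  # and raises TypeError when the argument cannot be converted.
  Hash(options).merge(anchor: anchor)
end

merged     = with_anchor({ submatches: 1 }, :anchor_both)
from_nil   = with_anchor(nil, :unanchored)
overridden = with_anchor({ anchor: :unanchored }, :anchor_both)

# merge gives precedence to its argument, so the method's own anchor wins.
puts overridden[:anchor] # prints "anchor_both"

error =
  begin
    with_anchor(1, :unanchored)
    nil
  rescue TypeError => e
    e
  end
puts error.class # prints "TypeError"
```

One consequence, assuming the methods behave exactly as written in the diff: an Integer passed as the second argument to `partial_match` or `full_match` would raise `TypeError` rather than be read as a submatch count, so the Integer form stays specific to `match`.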