@ivanprado
Contributor

Do not merge!

An alternative to #53

The existing rules, as defined in the original requests, still work, but custom regexes can now be used as well when required. A regex rule is defined by a tuple containing the domain and a regex over the full URL.

For example, consider the following configuration:

OVERRIDES_1 = { ... }
OVERRIDES_2 = { ... }
{
    "example.com/en_gb": OVERRIDES_1,
    ("example.com", r"http://example.com/.*?product_id=[0-9]*.*"): OVERRIDES_2,
}

All URLs with a product_id argument will go through OVERRIDES_2. Other URLs under the en_gb subpath will go through OVERRIDES_1.

This brings the best of both worlds: simple domain_or_more rules can still be used, and they will cover most cases. If something more powerful is required, you can use a regex. Because the rules are partitioned by domain, this new approach is also efficient enough.
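To illustrate why partitioning by domain keeps regex rules cheap, here is a minimal sketch, not the code in this PR. The helper names `partition_regex_rules` and `candidate_regexes` are hypothetical, and matching on the exact `netloc` is a simplifying assumption (no subdomain handling).

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit


def partition_regex_rules(rules):
    """Group (domain, pattern) keys by domain, keeping declaration order.

    Hypothetical helper: plain string keys (domain_or_more rules) are skipped.
    """
    by_domain = defaultdict(list)
    for key, overrides in rules.items():
        if isinstance(key, tuple):
            domain, pattern = key
            by_domain[domain].append((re.compile(pattern), overrides))
    return by_domain


def candidate_regexes(by_domain, url):
    """Return only the regex rules registered for the URL's exact domain."""
    domain = urlsplit(url).netloc
    # .get() avoids creating empty entries in the defaultdict.
    return by_domain.get(domain, [])
```

With this layout, a request only ever evaluates the regexes registered for its own domain, rather than every regex in the configuration.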

Note that when both kinds of rule collide (for example, for the URL http://example.com/en_gb/product?product_id=23 in the example above), the rules defined as regexes have priority. Priority among different regexes is defined by the order in which they are declared.

Priority among the domain_or_more rules is defined by the hierarchy (tld, domain, subdomain, path, etc.), so the order in which they are declared does not matter.
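The priority rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the registry implemented in this PR: `find_overrides` and `_is_prefix` are hypothetical names, the hierarchy is approximated by "longest matching domain/path prefix wins" (tld and subdomain levels are not modelled), and an empty-string key is treated as a catch-all.

```python
import re
from urllib.parse import urlsplit


def _is_prefix(rule, target):
    """True if `rule` equals `target` or is a prefix ending at a path boundary."""
    if rule == "":
        return True  # assumed catch-all for the empty string
    return target == rule or target.startswith(rule.rstrip("/") + "/")


def find_overrides(rules, url):
    parts = urlsplit(url)
    domain = parts.netloc

    # 1. Regex rules win, checked in declaration order
    #    (dicts preserve insertion order).
    for key, overrides in rules.items():
        if isinstance(key, tuple):
            rule_domain, pattern = key
            if rule_domain == domain and re.match(pattern, url):
                return overrides

    # 2. Otherwise the most specific (longest) domain_or_more rule wins,
    #    regardless of declaration order.
    target = domain + parts.path
    best_key = None
    for key in rules:
        if isinstance(key, str) and _is_prefix(key, target):
            if best_key is None or len(key) > len(best_key):
                best_key = key
    return rules[best_key] if best_key is not None else None
```

In this sketch, a URL that matches both a regex rule and a domain_or_more rule resolves to the regex rule, while reordering the string keys leaves the result unchanged.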

Another possibility is to enrich the domain_or_more rules a bit more by allowing globbing. For example, why not allow declaring rules like example.com/product_id=?. Implementation-wise, it is not very hard to do. @kmike what do you think?
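As a rough idea of what globbing could look like, the sketch below uses the standard library's `fnmatch`. The glob syntax, the pattern `example.com/*product_id=*`, and the helper name `glob_rule_matches` are all assumptions made for illustration; nothing here is part of the PR.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def glob_rule_matches(glob, url):
    """Match a glob against the URL without its scheme (hypothetical helper)."""
    parts = urlsplit(url)
    target = parts.netloc + parts.path
    if parts.query:
        target += "?" + parts.query
    # fnmatchcase: `*` matches any run of characters, `?` any single character.
    return fnmatchcase(target, glob)
```

One caveat with this choice: in `fnmatch` syntax `?` is itself a wildcard, so a literal `?` in a query string would have to be written as `[?]` in the glob.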

Note that here we allow registering default POs by using the empty string. Theoretically, you can also register regexes for the empty domain, and they will then be applied to any URL. But I don't think this is something we should promote, as it is not good practice from a performance point of view.

TODO:

  • Testing regex priorities
  • Documentation
  • Separated registries per page object type
  • globbing?
  • Removing the old HierarchicalRegistry

@ivanprado ivanprado requested a review from kmike September 13, 2021 11:31
@kmike kmike requested a review from BurnzZ October 5, 2021 09:12
@kmike kmike changed the base branch from hierarchical_override to master October 11, 2021 18:17
@kmike kmike changed the base branch from master to hierarchical_override October 11, 2021 18:18
@ivanprado
Contributor Author

Closing in favour of #56

@ivanprado ivanprado closed this Dec 13, 2021
@BurnzZ BurnzZ deleted the hierarchichal_override_regex branch March 17, 2022 07:39