-
Notifications
You must be signed in to change notification settings - Fork 2.3k
[Rule-based Auto-tagging] Add autotagging label resolving logic for multiple attributes #19424
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Ruirui Zhang <[email protected]>
ad13630 to
a8b2860
Compare
|
❌ Gradle check result for a8b2860: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still reviewing the PR, few initial comments
| enum LogicalOperator { | ||
| /** | ||
| * Logical AND | ||
| */ | ||
| AND, | ||
| /** | ||
| * Logical OR | ||
| */ | ||
| OR | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we expecting anything other than AND/OR? If not, might be better to have method return boolean value, say isConjunction()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that will not be ideal since the return value might be little ambiguous in instructions after the method call e,g;
boolean isAnd = isConjuntion();
....
if (!isAnd) // this is ambiguos as this doesn't directly imply OR here vs LogicalOperator.OR
| * This helps in tie-breaking: values appearing earlier in the list (i.e., more specific matches) | ||
| * are considered better matches when resolving the final label. | ||
| */ | ||
| private final Map<String, Integer> firstOccurrenceIndex = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am assuming this is for optimizing the lookup? Have we considered the latency impact without having this index?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We haven’t measured the latency impact/ run latency tests yet, but expect this should make lookups faster. Without it, we would need to iterate through every element in the list to determine the earliest occurrence, which would be way less efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, for evaluating principal attribute, we should use exact match for username and role instead of prefix. The admins can create role mapping for specific user patterns (which supports regex, not just prefix) instead of working with prefix/patterns as part principal attribute in WLM. Hopefully, that should make FeatureValueResolver logic simpler and easier to follow.
We can always support that prefix based principal values, if there is strong ask for it in future, but unable to see that for now.
Description
In the past, the auto tagging label resolving logic is only suitable for single attribute evaluation. Since we're adding more attributes into the feature now (username, role, index_pattern), we need a more comprehensive logic to find the best suited rule and label.
Feature documentation: https://docs.opensearch.org/latest/tuning-your-cluster/availability-and-recovery/rule-based-autotagging/autotagging/
Main classes & functions introduced:
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.