-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Handle negative search request nodes stats #19340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
❌ Gradle check result for 6560f5e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
6560f5e
to
751d5ac
Compare
❌ Gradle check result for 751d5ac: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
751d5ac
to
ed3c69e
Compare
❌ Gradle check result for ed3c69e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 65b5bc9: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we should fix the issue of negative stats itself, but node should not be going down due to serialization issue, so LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test failures look related to me. I am wondering if this can break backward compatibility is some way:
[Test Result](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/) (14 failures / +14)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/10_basic/Nodes stats}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_10_basic_Nodes_stats_/)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/20_response_filtering/Nodes Stats filtered using both includes and excludes filters}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_20_response_filtering_Nodes_Stats_filtered_using_both_includes_and_excludes_filters_/)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/20_response_filtering/Nodes Stats with response filtering}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_20_response_filtering_Nodes_Stats_with_response_filtering_/)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - _all include_segment_file_sizes}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_11_indices_metrics_Metric____all_include_segment_file_sizes_/)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - indices _all}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_11_indices_metrics_Metric___indices__all_/)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/10_basic/Nodes stats}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_10_basic_Nodes_stats__2/)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/20_response_filtering/Nodes Stats with response filtering}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_20_response_filtering_Nodes_Stats_with_response_filtering__2/)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - _all include_segment_file_sizes}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_11_indices_metrics_Metric____all_include_segment_file_sizes__2/)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/10_basic/Nodes stats}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_10_basic_Nodes_stats__3/)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/20_response_filtering/Nodes Stats with response filtering}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_20_response_filtering_Nodes_Stats_with_response_filtering__3/)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - _all include_segment_file_sizes}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_11_indices_metrics_Metric____all_include_segment_file_sizes__3/)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/10_basic/Nodes stats}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_10_basic_Nodes_stats__4/)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/20_response_filtering/Nodes Stats with response filtering}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_20_response_filtering_Nodes_Stats_with_response_filtering__4/)
[org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - _all include_segment_file_sizes}](https://build.ci.opensearch.org/job/gradle-check/64475/testReport/junit/org.opensearch.backwards/MixedClusterClientYamlTestSuiteIT/test__p0_nodes_stats_11_indices_metrics_Metric____all_include_segment_file_sizes__4/)
d8bcd50
to
092cb9d
Compare
❌ Gradle check result for 092cb9d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
@jainankitk @sgup432 I revised the solution to avoid serde issues. No longer using |
092cb9d
to
cbc5507
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #19340 +/- ##
============================================
- Coverage 72.96% 72.92% -0.04%
- Complexity 70117 70125 +8
============================================
Files 5687 5687
Lines 321659 321669 +10
Branches 46525 46527 +2
============================================
- Hits 234684 234587 -97
- Misses 68041 68171 +130
+ Partials 18934 18911 -23 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
server/src/main/java/org/opensearch/index/search/stats/SearchStats.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/index/search/stats/SearchStats.java
Outdated
Show resolved
Hide resolved
cbc5507
to
fb422fd
Compare
❌ Gradle check result for fb422fd: TIMEOUT Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: David Zane <[email protected]>
fb422fd
to
8e664f8
Compare
❕ Gradle check result for 8e664f8: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Signed-off-by: David Zane <[email protected]>
Signed-off-by: David Zane <[email protected]>
Description
We identified a bug where negative search stats are causing an exception when processed using
VLong
in the following line:https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/index/search/stats/SearchStats.java#L89
The issue occurs because the VLong is used as a counter for tracking number of active phases. However, since there can be multiple paths for exiting a phase, the counters can be decremented more than once.
Solution
Check for negative values before read/write using
VLong
. If negative, write0
instead.I explored using
ZLong
, but this can cause serde issues between different OpenSearch versions.Related Issues
Resolves #16598
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.