feat(tdigest): add the support of TDIGEST.REVRANK command #3130
Open
donghao526 wants to merge 48 commits into apache:unstable from donghao526:feature/tdigest-revrank
+328 −24
Changes from 39 commits
Commits
03b69e1 feat: impl tdigest.revrank (donghao526)
97df4a1 feat: impl tdigest.revrank (donghao526)
70a39d3 feat: impl tdigest.revrank (donghao526)
dde8410 feat: impl tdigest.revrank (donghao526)
0d3e9cc feat: impl tdigest.revrank (donghao526)
bb172a8 test: add unit test for tdigest.revrank (donghao526)
a64add4 test: add unit test for tdigest.revrank (donghao526)
3954b1f Merge branch 'unstable' into feature/tdigest-revrank (PragmaTwice)
05d1202 add golang test cases for tdigest.revrank (donghao526)
f688e14 add golang test cases for tdigest.revrank (donghao526)
8bcad0f add golang test cases for tdigest.revrank (donghao526)
46ac984 add golang test cases for tdigest.revrank (donghao526)
495e072 add golang test cases for tdigest.revrank (donghao526)
2b6785d add golang test cases for tdigest.revrank (invalid-email-address)
3af3b54 feat: impl tdigest.revrank (invalid-email-address)
f3d85d3 Merge branch 'feature/tmp' into feature/tdigest-revrank (invalid-email-address)
e68689d Merge branch 'unstable' into feature/tdigest-revrank (donghao526)
eb8674f feat: impl tdigest.revrank (invalid-email-address)
c70f410 Merge branch 'feature/tmp' into feature/tdigest-revrank (invalid-email-address)
b991d0d Merge branch 'unstable' into feature/tdigest-revrank (donghao526)
4c9a41d Merge branch 'feature/tdigest-revrank' of github.com:donghao526/kvroc… (invalid-email-address)
543fda0 fix(replication): Fix Seg Fault On Signal When Replication is Enabled… (zhixinwen)
e0d39a7 chore(.asf.yaml): make 2.13 a protected branches (#3129) (PragmaTwice)
a4ed14c feat(ts): Add support for data writing and `TS.CREATE`, `TS.ADD/MADD`… (yezhizi)
9d6c532 feat(ts): Add `TS.INFO` command (#3133) (yezhizi)
ff658f8 chore(.asf.yaml): enable auto merge and disable wiki (#3137) (PragmaTwice)
53e82f8 chore: remove unused `autoResizeBlockAndSST` method and config (#3136) (jonahgao)
6df3309 feat(scripting): support strict key-accessing mode for lua scripting … (PragmaTwice)
201afed feat(Dockerfile): add a UID for the user in the container (#3138) (SpecLad)
0851c22 feat(ts): Add data query support and `TS.RANGE` command (#3140) (yezhizi)
c7ed36f feat(ts): Add `TS.GET` command (#3142) (yezhizi)
3a898fe chore(config): enable `level_compaction_dynamic_level_bytes` by defau… (jonahgao)
3711578 perf(storage): eliminate unnecessary `rocksdb::DB::ListColumnFamilies… (jonahgao)
4b4f684 fix(scan): pattern-based SCAN iterations may skip remaining keys (#3146) (sryanyuan)
bd268b4 style: add some comments on TDigestRank (donghao526)
8e6a7f9 Merge branch 'feature/tdigest-revrank' of github.com:donghao526/kvroc… (donghao526)
6662240 refactor: remove commented code (donghao526)
e7f06a2 style: format code (donghao526)
367981c Merge branch 'unstable' into feature/tdigest-revrank (LindaSummer)
4b8cd6a Merge branch 'unstable' into feature/tdigest-revrank (donghao526)
07836fd feat: sort the input using map in revrank (donghao526)
f44bc56 Merge branch 'feature/tdigest-revrank' of github.com:donghao526/kvroc… (donghao526)
2aded75 Merge branch 'unstable' into feature/tdigest-revrank (donghao526)
f4a9c53 feat: add the support of TDIGEST.REVRANK command (donghao526)
5023de8 Merge branch 'unstable' into feature/tdigest-revrank (donghao526)
e3629d9 feat: add the support of TDIGEST.REVRANK command (donghao526)
ae05623 fix: fix format (donghao526)
0cf8c8a fix: fix clang-tidy (donghao526)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -22,6 +22,7 @@ | |
|
|
||
| #include <fmt/format.h> | ||
|
|
||
| #include <numeric> | ||
| #include <vector> | ||
|
|
||
| #include "common/status.h" | ||
|
|
@@ -150,3 +151,74 @@ inline StatusOr<double> TDigestQuantile(TD&& td, double q) { | |
| diff /= (lc.weight / 2 + rc.weight / 2); | ||
| return Lerp(lc.mean, rc.mean, diff); | ||
| } | ||
|
|
||
| template <typename TD> | ||
| inline Status TDigestRank(TD&& td, const std::vector<double>& inputs, std::vector<int>& result) { | ||
|
||
| std::vector<size_t> indices(inputs.size()); | ||
| std::iota(indices.begin(), indices.end(), 0); | ||
| std::sort(indices.begin(), indices.end(), [&inputs](size_t a, size_t b) { return inputs[a] < inputs[b]; }); | ||
|
|
||
| result.resize(inputs.size()); | ||
| size_t i = indices.size(); | ||
| double cumulative_weight = 0; | ||
|
|
||
| // handle inputs larger than maximum | ||
| while (i > 0 && inputs[indices[i - 1]] > td.Max()) { | ||
| result[indices[i - 1]] = -1; | ||
| i--; | ||
| } | ||
|
|
||
| // reverse iterate through centroids and calculate reverse rank for each input | ||
| auto iter = td.End(); | ||
| while (i > 0) { | ||
| auto centroid = GET_OR_RET(iter->GetCentroid()); | ||
|
|
||
| if (centroid.mean > inputs[indices[i - 1]]) { | ||
| // mean > input, accumulate weight and move to prev centroid | ||
| cumulative_weight += centroid.weight; | ||
| } else if (centroid.mean == inputs[indices[i - 1]]) { | ||
| // mean == input, calculate reverse rank with half weight of current centroid | ||
| auto current_mean = centroid.mean; | ||
| // take the half-weight value before adding the full weight to the running total | ||
| auto current_mean_cumulative_weight = cumulative_weight + centroid.weight / 2; | ||
| cumulative_weight += centroid.weight; | ||
|
|
||
| // handle all the previous centroids that have the same mean | ||
| while (!iter->IsAtBegin() && iter->Prev()) { | ||
| auto next_centroid = GET_OR_RET(iter->GetCentroid()); | ||
| if (current_mean != next_centroid.mean) { | ||
| // move back to the last equal centroid, because we will process it in the next loop | ||
| iter->Next(); | ||
| break; | ||
| } | ||
| current_mean_cumulative_weight += next_centroid.weight / 2; | ||
| cumulative_weight += next_centroid.weight; | ||
| } | ||
|
|
||
| // assign the reverse rank for the inputs[indices[i - 1]] | ||
| result[indices[i - 1]] = static_cast<int>(current_mean_cumulative_weight); | ||
| i--; | ||
|
|
||
| // handle the previous inputs that have the same value | ||
| while ((i > 0) && (inputs[indices[i]] == inputs[indices[i - 1]])) { | ||
|
||
| result[indices[i - 1]] = result[indices[i]]; | ||
| i--; | ||
| } | ||
| } else { | ||
| // mean < input, calculate reverse rank | ||
| result[indices[i - 1]] = static_cast<int>(cumulative_weight); | ||
| i--; | ||
| } | ||
|
|
||
| if (iter->IsAtBegin()) { | ||
| break; | ||
| } | ||
| iter->Prev(); | ||
| } | ||
|
|
||
| // handle inputs less than minimum | ||
| while (i > 0) { | ||
| result[indices[i - 1]] = static_cast<int>(td.TotalWeight()); | ||
| i--; | ||
| } | ||
| return Status::OK(); | ||
| } | ||
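
For reference, the reverse-rank rule that the comments in this hunk describe can be checked against a much simpler per-value version. The sketch below is not the kvrocks implementation: `Centroid` and `RevRankSketch` are hypothetical names, the centroids live in a plain in-memory vector sorted by ascending mean, and the digest maximum is assumed to equal the last centroid's mean. It follows the cases in the diff: -1 for a value above the maximum, the total weight for a value below the minimum, and otherwise the weight of all greater centroids plus half the weight of centroids whose mean equals the value.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a t-digest centroid; not the kvrocks type.
struct Centroid {
  double mean;
  double weight;
};

// Reverse rank of a single value against centroids sorted by ascending mean.
// Assumption: the digest maximum equals the mean of the last centroid.
inline int RevRankSketch(const std::vector<Centroid>& centroids, double value) {
  assert(std::is_sorted(centroids.begin(), centroids.end(),
                        [](const Centroid& a, const Centroid& b) { return a.mean < b.mean; }));
  if (centroids.empty()) return 0;  // placeholder choice for this sketch only
  if (value > centroids.back().mean) return -1;

  double greater = 0;  // total weight of centroids with mean > value
  double equal = 0;    // total weight of centroids with mean == value
  for (auto it = centroids.rbegin(); it != centroids.rend(); ++it) {
    if (it->mean > value) {
      greater += it->weight;
    } else if (it->mean == value) {
      equal += it->weight;
    } else {
      break;  // every remaining centroid is smaller than value
    }
  }
  // Equal centroids contribute half of their weight; the result is truncated to int.
  return static_cast<int>(greater + equal / 2);
}
```

With centroids `{1, 1}`, `{2, 2}`, `{2, 2}`, `{5, 3}` (mean, weight), the value 2 has weight 3 above it and weight 4 equal to it, so the sketch returns 3 + 4 / 2 = 5. A value below the minimum falls through the loop with every centroid counted as greater, which yields the total weight of 8. The batched version in the diff sorts the inputs so that one backward pass over the centroids serves all of them; this per-value form trades that efficiency for readability.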