-
Notifications
You must be signed in to change notification settings - Fork 0
MOD-10359 Implement shard window ratio optimization #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 6383_base
Are you sure you want to change the base?
Conversation
* RED-162822 Implement shard window ratio optimization
- Add shard window ratio parameter to reduce coordinator workload
- Implement effectiveK calculation based on ratio and shard count
- Support both FT.SEARCH and FT.AGGREGATE operations
* memory fixes
* initial imp
* replace substring
* move params before kn ctx parsing
remove shardWindowRatio from HybridIteratorParams
* fix shard_window_ratio pytests
add ranom vector
revery unnecessary
change error message
fix test_cpp_index
* c tests
add k as param to profile pytest
* remove release heap
* small copilot recommendations
* back header
* remove todo
* 1. **Simplified modifyKNNCommand API**:
2. **Unified K token tracking**:
- Replaced separate `k_literal_pos/k_literal_len` fields with unified
`k_token_pos/k_token_len` in VectorQuery
- when k is given as param we modify the query string instead of chnaing the param value
- Handles both literal K ("KNN 50") and parameter K ("KNN $k") cases
3. **Optimized string replacement**:
- Enhanced `MRCommand_ReplaceArgSubstring` with space-padding optimization
5. **Enhanced test coverage**:
- Added unit tests for MRCommand substring replacement functions
- Added edge case test for K=0 scenarios in pytest
- Tests cover both optimization and fallback code paths
* Update shard window ratio tests for refactored API
- **Simplified test helper**: Updated runModifyKNNTest to use k_token_pos/k_token_len instead of separate literal/parameter handling
- **Enhanced parameter testing**: Added test with longer parameter name ($k_costume) to verify exact string matching
- **Space-padding validation**: Tests now verify space-padding optimization in query string modifications
- **Added calculateEffectiveK tests**: Comprehensive unit tests for edge cases including k=0, ratio comparisons, and rounding behavior
- **Improved test coverage**: Tests both reallocation and optimization code paths
Tests validate the new unified token tracking approach and performance optimizations.
* fix type
* move special_case_ctx.h to coord dir
---------
Co-authored-by: meiravgri <[email protected]>
| #define DEFAULT_BG_OOM_PAUSE_TIME_BEFOR_RETRY 5 | ||
| #define DEFAULT_INDEXER_YIELD_EVERY_OPS 1000 | ||
| #define DEFAULT_SHARD_WINDOW_RATIO 1.0 | ||
| #define MIN_SHARD_WINDOW_RATIO 0.0 // Exclusive minimum (must be > 0.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Handle KNN with shard ratio optimization for both multi-shard and standalone | ||
| if (knnCtx) { | ||
| KNNVectorQuery *knn_query = &knnCtx->knn.queryNode->vn.vq->knn; | ||
| double ratio = knn_query->shardWindowRatio; | ||
|
|
||
| if (ratio < MAX_SHARD_WINDOW_RATIO) { | ||
| // Apply optimization only if ratio is valid and < 1.0 (ratio = 1.0 means no optimization) | ||
| // Calculate effective K based on deployment mode | ||
| size_t numShards = GetNumShards_UnSafe(); | ||
| size_t effectiveK = calculateEffectiveK(knn_query->k, ratio, numShards); | ||
|
|
||
| // Modify the command to replace KNN k (shards will ignore $SHARD_K_RATIO) | ||
| modifyKNNCommand(xcmd, 2 + profileArgs, effectiveK, knnCtx->knn.queryNode->vn.vq); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
This block of logic for handling the shard window ratio is very similar to the logic in src/module.c inside the prepareCommand function (lines 3298-3318). To avoid code duplication and improve maintainability, consider refactoring this into a shared helper function.
The implementation in module.c is slightly more optimized as it includes a check if (knn_query->k == effectiveK) break; to avoid unnecessary command modification. This check should be included in the shared function.
| #define VECSIM_EPSILON "EPSILON" | ||
| #define VECSIM_HYBRID_POLICY "HYBRID_POLICY" | ||
| #define VECSIM_BATCH_SIZE "BATCH_SIZE" | ||
| #define VECSIM_SHARD_WINDOW_RATIO "SHARD_WINDOW_RATIO" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implement Shard Window Ratio Optimization for KNN Queries
This pull request introduces a new optimization for K-Nearest Neighbors (KNN) vector similarity queries in sharded environments. It adds a
shard_k_ratioattribute, allowing users to specify a ratio that influences the number of results each shard processes and returns to the coordinator. This aims to reduce the coordinator's workload and improve overall query performance by limiting the data transferred from shards.The core logic involves calculating an 'effective K' per shard using the formula
max(top_k/#shards, ceil(top_k × ratio)). The PR modifies internal command handling to dynamically adjust the K value in the query string sent to individual shards. It also includes an optimized string replacement mechanism to efficiently modify query arguments, handling both literal and parameterized K values.Key Changes
• Introduced
shard_k_ratioas a new query attribute forKNNvector queries.• Implemented
calculateEffectiveKfunction to determine the optimal number of results each shard should return based on the original K, the ratio, and the number of shards.• Modified
modifyKNNCommandto dynamically rewrite the K value in the query string sent to shards, supporting both literal (e.g.,KNN`` 50`) and parameterized (e.g.,KNN$k`) K values. • EnhancedMRCommand_ReplaceArgSubstringwith an optimization for in-place string replacement (padding with spaces) when the new substring is shorter or equal in length, avoiding reallocations. • Updated query parsing (`src/query.c`) to store the position and length of the K token for efficient modification. • Integrated the optimization into distributed ```FT.SEARCH``` and ```FT.``AGGREGATE``` command processing.• Added comprehensive C and Python unit/integration tests covering parameter validation, effective K calculation, command modification, and various query scenarios (literal K, parameterized K, hybrid queries, edge cases like K=0, and insufficient documents per shard).
Affected Areas
• Query parsing and
ASTconstruction (src/query.c,src/query_node.h)• Vector index query structures (
src/vector_index.h)• Distributed command handling and modification (
src/module.c,src/coord/dist_aggregate.c,src/coord/rmr/command.c,src/coord/rmr/command.h)• Configuration definitions (
src/config.h)• Testing infrastructure (new
tests/ctests/coord_tests/test_shard_window_ratio.c,tests/pytests/test_shard_window_ratio.py, and updates totests/ctests/coord_tests/test_command.c)This summary was automatically generated by @propel-code-bot