MOD-10359 Implement shard window ratio optimization #7

tonyd3 · 2025-08-05T04:34:46Z

Implement Shard Window Ratio Optimization for KNN Queries

This pull request introduces a new optimization for K-Nearest Neighbors (KNN) vector similarity queries in sharded environments. It adds a shard_k_ratio attribute, allowing users to specify a ratio that influences the number of results each shard processes and returns to the coordinator. This aims to reduce the coordinator's workload and improve overall query performance by limiting the data transferred from shards.

The core logic involves calculating an 'effective K' per shard using the formula max(top_k/#shards, ceil(top_k × ratio)). The PR modifies internal command handling to dynamically adjust the K value in the query string sent to individual shards. It also includes an optimized string replacement mechanism to efficiently modify query arguments, handling both literal and parameterized K values.
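The effective-K formula above can be sketched in C as follows. This is a minimal illustration, not the PR's calculateEffectiveK: the name effective_k_sketch, the choice to round top_k/#shards up, and the hand-rolled ceiling helper are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest integer >= x, for non-negative x that fits in size_t.
 * Hand-rolled so the sketch does not depend on libm. */
static size_t ceil_to_size(double x) {
  size_t t = (size_t)x; /* truncates toward zero */
  return ((double)t < x) ? t + 1 : t;
}

/* Hypothetical stand-in for the PR's calculateEffectiveK:
 *   max(top_k / #shards, ceil(top_k * ratio))
 * Assumption: top_k / #shards is rounded up. */
static size_t effective_k_sketch(size_t top_k, double ratio, size_t num_shards) {
  assert(num_shards > 0);
  assert(ratio > 0.0 && ratio <= 1.0);
  size_t per_shard = (top_k + num_shards - 1) / num_shards;
  size_t by_ratio = ceil_to_size((double)top_k * ratio);
  return per_shard > by_ratio ? per_shard : by_ratio;
}
```

Under these assumptions, top_k = 100 with ratio = 0.5 on 4 shards asks each shard for 50 results instead of 100.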

Key Changes

• Introduced shard_k_ratio as a new query attribute for KNN vector queries.
• Implemented calculateEffectiveK function to determine the optimal number of results each shard should return based on the original K, the ratio, and the number of shards.
• Modified modifyKNNCommand to dynamically rewrite the K value in the query string sent to shards, supporting both literal (e.g., `KNN 50`) and parameterized (e.g., `KNN $k`) K values.
• Enhanced MRCommand_ReplaceArgSubstring with an optimization for in-place string replacement (padding with spaces) when the new substring is shorter or equal in length, avoiding reallocations.
• Updated query parsing (`src/query.c`) to store the position and length of the K token for efficient modification.
• Integrated the optimization into distributed `FT.SEARCH` and `FT.AGGREGATE` command processing.
• Added comprehensive C and Python unit/integration tests covering parameter validation, effective K calculation, command modification, and various query scenarios (literal K, parameterized K, hybrid queries, edge cases like K=0, and insufficient documents per shard).
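The space-padding replacement mentioned in the list above can be illustrated with a small standalone sketch. It is not the PR's MRCommand_ReplaceArgSubstring; the function name, signature, and the decision to simply return false when the new text is longer are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Overwrites buf[pos .. pos+old_len) with new_str, padding with spaces when
 * new_str is shorter. Returns false (buffer untouched) when new_str is
 * longer; a real implementation would fall back to a reallocating path. */
static bool replace_in_place_sketch(char *buf, size_t pos, size_t old_len,
                                    const char *new_str, size_t new_len) {
  assert(buf != NULL && new_str != NULL);
  assert(pos + old_len <= strlen(buf));
  if (new_len > old_len) return false;
  memcpy(buf + pos, new_str, new_len);                 /* write the new token */
  memset(buf + pos + new_len, ' ', old_len - new_len); /* pad the remainder */
  return true;
}
```

Rewriting the K token "100" at offset 8 of `*=>[KNN 100 @v $vec]` to "25" leaves the string length unchanged, so stored offsets for later tokens stay valid. This relies on the query parser tolerating the extra space, which the PR's tests are described as verifying.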

Affected Areas

• Query parsing and AST construction (src/query.c, src/query_node.h)
• Vector index query structures (src/vector_index.h)
• Distributed command handling and modification (src/module.c, src/coord/dist_aggregate.c, src/coord/rmr/command.c, src/coord/rmr/command.h)
• Configuration definitions (src/config.h)
• Testing infrastructure (new tests/ctests/coord_tests/test_shard_window_ratio.c, tests/pytests/test_shard_window_ratio.py, and updates to tests/ctests/coord_tests/test_command.c)

This summary was automatically generated by @propel-code-bot

* RED-162822 Implement shard window ratio optimization
  - Add shard window ratio parameter to reduce coordinator workload
  - Implement effectiveK calculation based on ratio and shard count
  - Support both FT.SEARCH and FT.AGGREGATE operations
* memory fixes
* initial imp
* replace substring
* move params before kn ctx parsing remove shardWindowRatio from HybridIteratorParams
* fix shard_window_ratio pytests add random vector revert unnecessary change error message fix test_cpp_index
* c tests add k as param to profile pytest
* remove release heap
* small copilot recommendations
* back header
* remove todo
* 1. **Simplified modifyKNNCommand API**:
  2. **Unified K token tracking**:
     - Replaced separate `k_literal_pos/k_literal_len` fields with unified `k_token_pos/k_token_len` in VectorQuery
     - When k is given as a param, we modify the query string instead of changing the param value
     - Handles both literal K ("KNN 50") and parameter K ("KNN $k") cases
  3. **Optimized string replacement**:
     - Enhanced `MRCommand_ReplaceArgSubstring` with space-padding optimization
  5. **Enhanced test coverage**:
     - Added unit tests for MRCommand substring replacement functions
     - Added edge case test for K=0 scenarios in pytest
     - Tests cover both optimization and fallback code paths
* Update shard window ratio tests for refactored API
  - **Simplified test helper**: Updated runModifyKNNTest to use k_token_pos/k_token_len instead of separate literal/parameter handling
  - **Enhanced parameter testing**: Added test with longer parameter name ($k_costume) to verify exact string matching
  - **Space-padding validation**: Tests now verify space-padding optimization in query string modifications
  - **Added calculateEffectiveK tests**: Comprehensive unit tests for edge cases including k=0, ratio comparisons, and rounding behavior
  - **Improved test coverage**: Tests both reallocation and optimization code paths
  Tests validate the new unified token tracking approach and performance optimizations.
* fix type
* move special_case_ctx.h to coord dir

---------

Co-authored-by: meiravgri <[email protected]>

propel-code-bot · 2025-08-05T04:39:00Z

src/config.h

 #define DEFAULT_BG_OOM_PAUSE_TIME_BEFOR_RETRY 5
 #define DEFAULT_INDEXER_YIELD_EVERY_OPS 1000
+#define DEFAULT_SHARD_WINDOW_RATIO 1.0
+#define MIN_SHARD_WINDOW_RATIO 0.0  // Exclusive minimum (must be > 0.0)


[Documentation]

This comment is slightly confusing. While technically correct that it's an exclusive minimum, it could be clearer. Consider rephrasing to directly state the requirement for ratio.
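One possible rephrasing, shown as a sketch: state the allowed range directly next to the bounds. The value 1.0 for MAX_SHARD_WINDOW_RATIO and the helper is_valid_shard_window_ratio_sketch are assumptions for illustration; only DEFAULT_SHARD_WINDOW_RATIO and MIN_SHARD_WINDOW_RATIO appear in this hunk.

```c
#include <stdbool.h>

/* Valid shard window ratios lie in (0.0, 1.0]. */
#define MIN_SHARD_WINDOW_RATIO 0.0 /* exclusive: ratio must be > 0.0 */
#define MAX_SHARD_WINDOW_RATIO 1.0 /* assumed inclusive: 1.0 means no optimization */

/* Hypothetical helper: true when ratio lies in (0.0, 1.0]. */
static bool is_valid_shard_window_ratio_sketch(double ratio) {
  return ratio > MIN_SHARD_WINDOW_RATIO && ratio <= MAX_SHARD_WINDOW_RATIO;
}
```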

propel-code-bot · 2025-08-05T04:39:01Z

src/coord/dist_aggregate.c

+  // Handle KNN with shard ratio optimization for both multi-shard and standalone
+  if (knnCtx) {
+    KNNVectorQuery *knn_query = &knnCtx->knn.queryNode->vn.vq->knn;
+    double ratio = knn_query->shardWindowRatio;
+
+    if (ratio < MAX_SHARD_WINDOW_RATIO) {
+      // Apply optimization only if ratio is valid and < 1.0 (ratio = 1.0 means no optimization)
+      // Calculate effective K based on deployment mode
+      size_t numShards = GetNumShards_UnSafe();
+      size_t effectiveK = calculateEffectiveK(knn_query->k, ratio, numShards);
+
+      // Modify the command to replace KNN k (shards will ignore $SHARD_K_RATIO)
+      modifyKNNCommand(xcmd, 2 + profileArgs, effectiveK, knnCtx->knn.queryNode->vn.vq);
+    }
+  }


[BestPractice]

This block of logic for handling the shard window ratio is very similar to the logic in src/module.c inside the prepareCommand function (lines 3298-3318). To avoid code duplication and improve maintainability, consider refactoring this into a shared helper function.

The implementation in module.c is slightly more optimized as it includes a check if (knn_query->k == effectiveK) break; to avoid unnecessary command modification. This check should be included in the shared function.
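A rough sketch of what such a shared helper might look like, including the early exit when the effective K equals the original K. Everything here is a hypothetical stand-in so the example compiles on its own: KnnParamsSketch replaces the real query struct, the RewriteKFn callback replaces modifyKNNCommand, and effective_k_sketch (with top_k/#shards rounded up) replaces calculateEffectiveK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the KNN query fields used here. */
typedef struct {
  size_t k;
  double shard_window_ratio;
} KnnParamsSketch;

/* Callback that rewrites K in a command; stands in for modifyKNNCommand. */
typedef void (*RewriteKFn)(void *cmd, size_t effective_k);

/* Smallest integer >= x, for non-negative x that fits in size_t. */
static size_t ceil_to_size(double x) {
  size_t t = (size_t)x;
  return ((double)t < x) ? t + 1 : t;
}

/* max(top_k / #shards, ceil(top_k * ratio)); rounding up is an assumption. */
static size_t effective_k_sketch(size_t top_k, double ratio, size_t num_shards) {
  size_t per_shard = (top_k + num_shards - 1) / num_shards;
  size_t by_ratio = ceil_to_size((double)top_k * ratio);
  return per_shard > by_ratio ? per_shard : by_ratio;
}

/* Shared entry point for both call sites. Returns true if cmd was rewritten. */
static bool apply_shard_ratio_sketch(const KnnParamsSketch *knn, size_t num_shards,
                                     void *cmd, RewriteKFn rewrite_k) {
  assert(knn != NULL && rewrite_k != NULL && num_shards > 0);
  if (knn->shard_window_ratio >= 1.0) return false; /* 1.0 means no optimization */
  size_t effective_k = effective_k_sketch(knn->k, knn->shard_window_ratio, num_shards);
  if (effective_k == knn->k) return false;          /* nothing to change */
  rewrite_k(cmd, effective_k);
  return true;
}

/* Example callback: records the K it was asked to write. */
static void record_k_sketch(void *cmd, size_t effective_k) {
  *(size_t *)cmd = effective_k;
}
```

Passing the rewrite step as a callback is only a device to keep the sketch standalone; in the actual code base both call sites could call modifyKNNCommand directly from the helper.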

propel-code-bot · 2025-08-05T04:39:02Z

src/vector_index.h

 #define VECSIM_EPSILON "EPSILON"
 #define VECSIM_HYBRID_POLICY "HYBRID_POLICY"
 #define VECSIM_BATCH_SIZE "BATCH_SIZE"
+#define VECSIM_SHARD_WINDOW_RATIO "SHARD_WINDOW_RATIO"


[BestPractice]

This macro VECSIM_SHARD_WINDOW_RATIO appears to be unused. The macro SHARD_K_RATIO_ATTR from query_node.h is used for parsing this attribute. To keep the code clean, please remove this unused definition.
