Skip to content

Conversation

@tonyd3
Copy link

@tonyd3 tonyd3 commented Aug 5, 2025

Implement Shard Window Ratio Optimization for KNN Queries

This pull request introduces a new optimization for K-Nearest Neighbors (KNN) vector similarity queries in sharded environments. It adds a shard_k_ratio attribute, allowing users to specify a ratio that influences the number of results each shard processes and returns to the coordinator. This aims to reduce the coordinator's workload and improve overall query performance by limiting the data transferred from shards.

The core logic involves calculating an 'effective K' per shard using the formula max(top_k/#shards, ceil(top_k × ratio)). The PR modifies internal command handling to dynamically adjust the K value in the query string sent to individual shards. It also includes an optimized string replacement mechanism to efficiently modify query arguments, handling both literal and parameterized K values.

Key Changes

• Introduced shard_k_ratio as a new query attribute for KNN vector queries.
• Implemented calculateEffectiveK function to determine the optimal number of results each shard should return based on the original K, the ratio, and the number of shards.
• Modified modifyKNNCommand to dynamically rewrite the K value in the query string sent to shards, supporting both literal (e.g., KNN`` 50`) and parameterized (e.g., KNN$k`) K values. • EnhancedMRCommand_ReplaceArgSubstring with an optimization for in-place string replacement (padding with spaces) when the new substring is shorter or equal in length, avoiding reallocations. • Updated query parsing (`src/query.c`) to store the position and length of the K token for efficient modification. • Integrated the optimization into distributed ```FT.SEARCH``` and ```FT.``AGGREGATE``` command processing.
• Added comprehensive C and Python unit/integration tests covering parameter validation, effective K calculation, command modification, and various query scenarios (literal K, parameterized K, hybrid queries, edge cases like K=0, and insufficient documents per shard).

Affected Areas

• Query parsing and AST construction (src/query.c, src/query_node.h)
• Vector index query structures (src/vector_index.h)
• Distributed command handling and modification (src/module.c, src/coord/dist_aggregate.c, src/coord/rmr/command.c, src/coord/rmr/command.h)
• Configuration definitions (src/config.h)
• Testing infrastructure (new tests/ctests/coord_tests/test_shard_window_ratio.c, tests/pytests/test_shard_window_ratio.py, and updates to tests/ctests/coord_tests/test_command.c)


This summary was automatically generated by @propel-code-bot

* RED-162822 Implement shard window ratio optimization

- Add shard window ratio parameter to reduce coordinator workload
- Implement effectiveK calculation based on ratio and shard count
- Support both FT.SEARCH and FT.AGGREGATE operations

* memory fixes

* initial imp

* replace substring

* move params before kn ctx parsing

remove shardWindowRatio from HybridIteratorParams

* fix shard_window_ratio pytests

add ranom vector

revery unnecessary

change error message

fix test_cpp_index

* c tests

add k as param to profile pytest

* remove release heap

* small copilot recommendations

* back header

* remove todo

* 1. **Simplified modifyKNNCommand API**:

2. **Unified K token tracking**:
   - Replaced separate `k_literal_pos/k_literal_len` fields with unified
     `k_token_pos/k_token_len` in VectorQuery
     - when k is given as param we modify the query string instead of chnaing the param value
        - Handles both literal K ("KNN 50") and parameter K ("KNN $k") cases

3. **Optimized string replacement**:
   - Enhanced `MRCommand_ReplaceArgSubstring` with space-padding optimization

5. **Enhanced test coverage**:
   - Added unit tests for MRCommand substring replacement functions
   - Added edge case test for K=0 scenarios in pytest
   - Tests cover both optimization and fallback code paths

* Update shard window ratio tests for refactored API

- **Simplified test helper**: Updated runModifyKNNTest to use k_token_pos/k_token_len instead of separate literal/parameter handling

- **Enhanced parameter testing**: Added test with longer parameter name ($k_costume) to verify exact string matching
- **Space-padding validation**: Tests now verify space-padding optimization in query string modifications
- **Added calculateEffectiveK tests**: Comprehensive unit tests for edge cases including k=0, ratio comparisons, and rounding behavior
- **Improved test coverage**: Tests both reallocation and optimization code paths

Tests validate the new unified token tracking approach and performance optimizations.

* fix type

* move special_case_ctx.h to coord dir

---------

Co-authored-by: meiravgri <[email protected]>
#define DEFAULT_BG_OOM_PAUSE_TIME_BEFOR_RETRY 5
#define DEFAULT_INDEXER_YIELD_EVERY_OPS 1000
#define DEFAULT_SHARD_WINDOW_RATIO 1.0
#define MIN_SHARD_WINDOW_RATIO 0.0 // Exclusive minimum (must be > 0.0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Documentation]

This comment is slightly confusing. While technically correct that it's an exclusive minimum, it could be clearer. Consider rephrasing to directly state the requirement for ratio.

Copy Context

Comment on lines +613 to +627
// Handle KNN with shard ratio optimization for both multi-shard and standalone
if (knnCtx) {
KNNVectorQuery *knn_query = &knnCtx->knn.queryNode->vn.vq->knn;
double ratio = knn_query->shardWindowRatio;

if (ratio < MAX_SHARD_WINDOW_RATIO) {
// Apply optimization only if ratio is valid and < 1.0 (ratio = 1.0 means no optimization)
// Calculate effective K based on deployment mode
size_t numShards = GetNumShards_UnSafe();
size_t effectiveK = calculateEffectiveK(knn_query->k, ratio, numShards);

// Modify the command to replace KNN k (shards will ignore $SHARD_K_RATIO)
modifyKNNCommand(xcmd, 2 + profileArgs, effectiveK, knnCtx->knn.queryNode->vn.vq);
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BestPractice]

This block of logic for handling the shard window ratio is very similar to the logic in src/module.c inside the prepareCommand function (lines 3298-3318). To avoid code duplication and improve maintainability, consider refactoring this into a shared helper function.

The implementation in module.c is slightly more optimized as it includes a check if (knn_query->k == effectiveK) break; to avoid unnecessary command modification. This check should be included in the shared function.

Copy Context

#define VECSIM_EPSILON "EPSILON"
#define VECSIM_HYBRID_POLICY "HYBRID_POLICY"
#define VECSIM_BATCH_SIZE "BATCH_SIZE"
#define VECSIM_SHARD_WINDOW_RATIO "SHARD_WINDOW_RATIO"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BestPractice]

This macro VECSIM_SHARD_WINDOW_RATIO appears to be unused. The macro SHARD_K_RATIO_ATTR from query_node.h is used for parsing this attribute. To keep the code clean, please remove this unused definition.

Copy Context

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants