Skip to content

Regression fix: Cache the vector index size#2341

Merged
CascadingRadium merged 13 commits into
masterfrom
blas
Jun 1, 2026
Merged

Regression fix: Cache the vector index size#2341
CascadingRadium merged 13 commits into
masterfrom
blas

Conversation

@CascadingRadium

Copy link
Copy Markdown
Member
  • Currently we always compute the vector index size, irrespective of whether this size was cached in the snapshot's cachedMeta or not. We use the cached value when triggering the searcher start callback, but do not check if it exists in the cachedMeta before calling the Size() API.
  • Due to recent changes, the output of the Size() API was fixed but at the cost of additional computation to get a precise estimate. This results in a perf regression due to the aforementioned point.
  • Fix this issue by performing a LoadOrStore operation for the size instead, where we perform a read to check if the size was cached, and only compute it if not cached.
  • Refactor cachedMeta to use a sync.Map instead of a map with a RW mutex as its more suited for the write-once-read-many style of usage which we use it for in this perspective.
  • Fix the TotKNNSearches to be equal to the number of KNN searchers created, which mirrors the FTS equivalent TotTermSearchersFinished, currently this stat is equal to N * X where N is the number of segments, refactor it to report just X.

@coveralls

coveralls commented Jun 1, 2026

Copy link
Copy Markdown

Coverage Status

coverage: 52.323% (+0.02%) from 52.304% — blas into master

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a performance regression in Scorch vector (kNN) search by avoiding repeated computation of per-segment vector index sizes, and adjusts kNN-related stats accounting to better reflect logical searcher counts.

Changes:

  • Refactors SegmentSnapshot.cachedMeta to use sync.Map and adds typed-ish accessor helpers (newCachedMeta, load, store, contains).
  • Updates kNN optimization flow to reuse cached vector index sizes (avoiding repeated Size() calls when already cached).
  • Changes TotKNNSearches accounting to increment once per vector reader lifecycle (via IndexSnapshotVectorReader.Close()), and updates SegmentSnapshot constructors to initialize cachedMeta.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
index/scorch/snapshot_segment.go Replaces mutex+map cached metadata with sync.Map wrapper and new constructor/helpers.
index/scorch/optimize_knn.go Refactors kNN optimization execution and adds caching check to skip repeated Size() computation.
index/scorch/snapshot_index_vr.go Adds TotKNNSearches increment on vector reader close.
index/scorch/scorch_test.go Updates test helper to initialize cachedMeta via newCachedMeta().
index/scorch/persister.go Initializes cachedMeta via newCachedMeta() when loading segments.
index/scorch/introducer.go Initializes cachedMeta via newCachedMeta() when introducing/merging segments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread index/scorch/optimize_knn.go
Comment thread index/scorch/optimize_knn.go
Comment thread index/scorch/snapshot_index_vr.go
Comment thread index/scorch/snapshot_segment.go
@CascadingRadium CascadingRadium merged commit 8fa46fb into master Jun 1, 2026
10 checks passed
@CascadingRadium CascadingRadium deleted the blas branch June 1, 2026 08:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants