ldematte
Contributor

Extracted from #136411

Exposes the vector values from the Int7 scorer supplier, via a "capability" interface (same pattern as used elsewhere in Lucene, e.g. MemorySegmentAccess or HasIndexSlice)

@ldematte ldematte requested a review from ChrisHegarty October 10, 2025 16:12
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance label Oct 10, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

import static org.apache.lucene.util.quantization.ScalarQuantizedVectorSimilarity.fromVectorSimilarity;

- public abstract sealed class Int7SQVectorScorerSupplier implements RandomVectorScorerSupplier {
+ public abstract sealed class Int7SQVectorScorerSupplier implements RandomVectorScorerSupplier, QuantizedByteVectorValuesAccess {
Member

Shouldn't the regular scorer supplier satisfy this interface as well?

Contributor

Do you mean to generalise it beyond QuantizedByteVectorValues? Or for the case of a non-int7 scorer?

Member

Ah, I guess this is the only "custom" one we provide right now. I was thinking that we had other RandomVectorScorerSupplier objects.

Contributor

I have the same question as Ben: do we really need QuantizedByteVectorValuesAccess? We can already get to the quantized vectors through RandomVectorScorerSupplier -> values -> getSlice.

Contributor Author

@ldematte ldematte Oct 13, 2025

@mayya-sharipova RandomVectorScorerSupplier does not have a values field or accessor; values is declared directly in Int7SQVectorScorerSupplier, which is an internal type we would like not to expose, hence QuantizedByteVectorValuesAccess (in a fashion similar to HasIndexSlice or MemorySegmentAccess).
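
To illustrate the "capability" interface pattern described in this comment, here is a minimal, self-contained sketch. All types below (ScorerSupplier, QuantizedValues, QuantizedValuesAccess, Int7Supplier) are hypothetical stand-ins, not the actual Lucene or Elasticsearch classes, and the accessor name get() is an assumption. The point is only that a caller holding the general supplier type can probe for the capability without naming the internal implementation class.

```java
import java.util.Optional;

// Hypothetical stand-in for RandomVectorScorerSupplier: no accessor for vector values.
interface ScorerSupplier {
    float score(int firstOrd, int secondOrd);
}

// Hypothetical stand-in for QuantizedByteVectorValues.
interface QuantizedValues {
    int size();
    byte[] vectorValue(int ord);
}

// The capability interface: an implementation opts in by implementing it.
// The method name is an assumption made for this sketch.
interface QuantizedValuesAccess {
    QuantizedValues get();
}

// Stand-in for the internal supplier type that callers should not need to name.
final class Int7Supplier implements ScorerSupplier, QuantizedValuesAccess {
    private final byte[][] vectors;

    Int7Supplier(byte[][] vectors) {
        this.vectors = vectors;
    }

    @Override
    public float score(int firstOrd, int secondOrd) {
        // Plain dot product over the stored byte vectors.
        int dot = 0;
        for (int i = 0; i < vectors[firstOrd].length; i++) {
            dot += vectors[firstOrd][i] * vectors[secondOrd][i];
        }
        return dot;
    }

    @Override
    public QuantizedValues get() {
        return new QuantizedValues() {
            @Override public int size() { return vectors.length; }
            @Override public byte[] vectorValue(int ord) { return vectors[ord]; }
        };
    }
}

final class CapabilityDemo {
    // Checks for the capability instead of casting to the concrete supplier class.
    static Optional<QuantizedValues> valuesOf(ScorerSupplier supplier) {
        if (supplier instanceof QuantizedValuesAccess) {
            return Optional.of(((QuantizedValuesAccess) supplier).get());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ScorerSupplier int7 = new Int7Supplier(new byte[][] {{1, 2}, {3, 4}});
        System.out.println(valuesOf(int7).map(QuantizedValues::size).orElse(-1)); // prints 2

        ScorerSupplier plain = (a, b) -> 0f; // a supplier without the capability
        System.out.println(valuesOf(plain).isPresent()); // prints false
    }
}
```

In this sketch a supplier that does not implement the capability simply yields an empty Optional, so callers can fall back to another path.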

Contributor Author

@benwtrent I see your point; we could actually generalize this a bit beyond QuantizedByteVectorValues, e.g. cover all our implementations of RandomVectorScorerSupplier (I count 4 of them in the ES codebase), and return ByteVectorValues.
Maybe it makes sense; eventually, we would like to get something similar into Lucene if possible...
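
A rough sketch of what that generalization might look like, under stated assumptions: ByteValues, QuantizedByteValues, ByteValuesAccess and both supplier classes are hypothetical stand-ins invented for this example, and the quantized values type is assumed to extend the plain byte values type. Each supplier exposes its values through one interface that returns the base type, while the quantized supplier narrows the return type covariantly.

```java
import java.util.List;

// Hypothetical stand-in for a base byte-vector values type.
interface ByteValues {
    int size();
    byte[] vectorValue(int ord);
}

// Assumed to extend the base type and add quantization-specific data.
interface QuantizedByteValues extends ByteValues {
    float scoreCorrection(int ord);
}

// Generalized capability: any supplier can expose its values as the base type.
interface ByteValuesAccess {
    ByteValues values();
}

final class PlainSupplier implements ByteValuesAccess {
    private final byte[][] vectors;

    PlainSupplier(byte[][] vectors) {
        this.vectors = vectors;
    }

    @Override
    public ByteValues values() {
        return new ByteValues() {
            @Override public int size() { return vectors.length; }
            @Override public byte[] vectorValue(int ord) { return vectors[ord]; }
        };
    }
}

final class QuantizedSupplier implements ByteValuesAccess {
    private final byte[][] vectors;
    private final float[] corrections;

    QuantizedSupplier(byte[][] vectors, float[] corrections) {
        this.vectors = vectors;
        this.corrections = corrections;
    }

    // Covariant return type: callers holding the concrete type keep the richer view.
    @Override
    public QuantizedByteValues values() {
        return new QuantizedByteValues() {
            @Override public int size() { return vectors.length; }
            @Override public byte[] vectorValue(int ord) { return vectors[ord]; }
            @Override public float scoreCorrection(int ord) { return corrections[ord]; }
        };
    }
}

final class GeneralizedAccessDemo {
    // Works against the general capability only; no knowledge of concrete suppliers.
    static int totalVectors(List<ByteValuesAccess> suppliers) {
        int total = 0;
        for (ByteValuesAccess supplier : suppliers) {
            total += supplier.values().size();
        }
        return total;
    }

    public static void main(String[] args) {
        PlainSupplier plain = new PlainSupplier(new byte[][] {{1}});
        QuantizedSupplier quantized =
            new QuantizedSupplier(new byte[][] {{1}, {2}}, new float[] {0.5f, 0.25f});
        System.out.println(totalVectors(List.of(plain, quantized))); // prints 3
    }
}
```

The trade-off in this sketch is that generic callers only see the base type; code that needs quantization-specific data still has to hold the concrete supplier or narrow the result.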

Contributor

@ChrisHegarty ChrisHegarty left a comment

Let's go with this small interface for now; we can expand and generalise it later, if needed.

@ldematte ldematte merged commit 096e271 into elastic:main Oct 13, 2025
34 checks passed
@ldematte ldematte deleted the expose-vectors-from-scorer branch October 13, 2025 08:14
georgewallace pushed a commit to georgewallace/elasticsearch that referenced this pull request Oct 13, 2025
ldematte added a commit to ldematte/elasticsearch that referenced this pull request Oct 14, 2025
elasticsearchmachine pushed a commit that referenced this pull request Oct 14, 2025

Labels

>refactoring, :Search Relevance/Vectors, Team:Search Relevance, v9.2.1, v9.3.0
