Skip to content

Conversation

ldematte
Copy link
Contributor

@ldematte ldematte commented Oct 10, 2025

This PR changes how we gather and compact vector data for transmitting them to the GPU. Instead of using a temporary file to write out the compacted arrays, we use directly the vector values from the scorer supplier, which are backed by a memory mapped input. This way we avoid an additional copy of the data.

@ldematte ldematte requested a review from a team as a code owner October 10, 2025 15:14
@ldematte ldematte added >non-issue auto-backport Automatically create backport pull requests when merged :Search Relevance/Vectors Vector search test-gpu Run tests using a GPU v9.2.1 v9.3.0 labels Oct 10, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Oct 10, 2025
@ldematte ldematte changed the title [Gpu] Optimize merge memory usage [GPU] Optimize merge memory usage Oct 10, 2025
"-Dio.netty.noUnsafe=true",
"-Dio.netty.noKeySetOptimization=true",
"-Dio.netty.recycler.maxCapacityPerThread=0",
// temporary until we get access to raw vectors in a future Lucene version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an open Lucene issue or PR for that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet; depending on how #136416 goes, and the opinion of people more expert in Lucene (you, Chris, Ben), I'd like to generalize what we did there and raise a Lucene issue to have it.

import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class VectorsFormatReflectionUtils {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, very nice organization! +1 for using VarHandle for reflection

Copy link
Contributor

@mayya-sharipova mayya-sharipova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ldematte Great work, I have not tested it yet, but amazing work how you organized it. My main comment: do you think we can simplify this PR by breaking into two separate ones: making this PR only about changes to merges, and doing changes for flush, ResourcesHolder, 128Mb in a separate PR? Or these changes are tightly coupled?

@ldematte
Copy link
Contributor Author

doing changes for flush, ResourcesHolder, 128Mb in a separate PR?

I can do that: here is the PR #136464

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged >non-issue :Search Relevance/Vectors Vector search Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch test-gpu Run tests using a GPU v9.2.1 v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants