-
Notifications
You must be signed in to change notification settings - Fork 25.6k
[GPU] Optimize merge memory usage #136411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
libs/simdvec/src/main/java/org/elasticsearch/simdvec/QuantizedByteVectorValuesAccess.java
Show resolved
Hide resolved
"-Dio.netty.noUnsafe=true", | ||
"-Dio.netty.noKeySetOptimization=true", | ||
"-Dio.netty.recycler.maxCapacityPerThread=0", | ||
// temporary until we get access to raw vectors in a future Lucene version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an open Lucene issue or PR for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet; depending on how #136416 goes, and the opinion of people more expert in Lucene (you, Chris, Ben), I'd like to generalize what we did there and raise a Lucene issue to have it.
import java.lang.invoke.MethodHandles; | ||
import java.lang.invoke.VarHandle; | ||
|
||
public class VectorsFormatReflectionUtils { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, very nice organization! +1 for using VarHandle for reflection
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ldematte Great work, I have not tested it yet, but amazing work how you organized it. My main comment: do you think we can simplify this PR by breaking into two separate ones: making this PR only about changes to merges, and doing changes for flush, ResourcesHolder, 128Mb in a separate PR? Or these changes are tightly coupled?
...rc/main/java/org/elasticsearch/index/codec/vectors/reflect/VectorsFormatReflectionUtils.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/elasticsearch/index/codec/vectors/reflect/VectorsFormatReflectionUtils.java
Show resolved
Hide resolved
x-pack/plugin/gpu/src/main/java/org/elasticsearch/xpack/gpu/codec/ES92GpuHnswVectorsWriter.java
Outdated
Show resolved
Hide resolved
I can do that: here is the PR #136464 |
This PR changes how we gather and compact vector data for transmitting them to the GPU. Instead of using a temporary file to write out the compacted arrays, we use directly the vector values from the scorer supplier, which are backed by a memory mapped input. This way we avoid an additional copy of the data.