Hi, I am working on a project that needs millions, sometimes even billions, of vectors inserted to build a graph. I followed example.py in https://github.com/nmslib/hnswlib/tree/master with about 4000K vectors, using the code below:
p = hnswlib.Index('l2', dim)
print("before build ", datetime.datetime.now())
p.init_index(max_elements = num_elements, ef_construction = 128, M = 16)
p.add_items(vectorNP, ids)
p.save_index("/Users/XXX/Projects/builder/hnsw-embedding-test/python_test/combined.bin")
It took around 2 minutes to finish.
But when I use libhnswlib-jna-x86-64 with 16 cores, via:
val hnswIndex = new ConcurrentIndex(SpaceName.L2, dimension)
hnswIndex.initialize(3890521, 16, 128, 42)
val embeddingRecordsPar = parquet4sReader.toList.par
embeddingRecordsPar.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(16))
embeddingRecordsPar.foreach { eb =>
  val ba = eb.vectors.head
  if (ba.length > 0) {
    val vector = RawEmbedding.toVector(RichByteArray(ba).asByteBuffer, dimension, "float16")
    hnswIndex.addNormalizedItem(vector, i)
    i = i + 1
  }
}
it takes around 15-16 minutes (the time is the same if I change ConcurrentIndex to Index or use Index.synchronizedIndex). Both pieces of code above were run on my local machine. Is there an equivalent of add_items in hnswlib-jna, or any other way to speed up building the graph?
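For comparison purposes, here is a minimal sketch of how the Python side could be fed in chunks with per-chunk timing, which may help show whether the build time scales evenly as the index grows. The helper name `add_in_chunks` and the `chunk_size` parameter are hypothetical, made up for this illustration. The only assumption about the index is that it offers a batch call taking `(data, ids)`, as `p.add_items(vectorNP, ids)` does in the snippet above.

```python
import time
from typing import Callable, List, Sequence


def add_in_chunks(add_items: Callable[[Sequence, Sequence], None],
                  vectors: Sequence,
                  ids: Sequence,
                  chunk_size: int = 100_000) -> List[float]:
    """Pass vectors to a batch-insert callable chunk by chunk.

    Returns the wall-clock seconds spent on each chunk.
    `add_items` is any callable taking (data, ids); this helper is a
    hypothetical illustration, not part of hnswlib or hnswlib-jna.
    """
    if len(vectors) != len(ids):
        raise ValueError("vectors and ids must have the same length")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    timings = []
    for start in range(0, len(vectors), chunk_size):
        stop = start + chunk_size
        t0 = time.perf_counter()
        # Slicing works for lists and NumPy arrays alike.
        add_items(vectors[start:stop], ids[start:stop])
        timings.append(time.perf_counter() - t0)
    return timings
```

With the Python index from above, this would be called as `add_in_chunks(p.add_items, vectorNP, ids)`. Whether a similar batch entry point exists in hnswlib-jna is exactly the open question here, so this sketch only covers the Python side.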