Commit cfeaa66

VEC-223: Documentation for sparse and hybrid indexes
Added them under features, also updated the REST API specification.
1 parent f56b544 commit cfeaa66

14 files changed, +915 -16 lines changed

mint.json

Lines changed: 3 additions & 1 deletion
@@ -766,7 +766,9 @@
 "vector/features/filtering",
 "vector/features/embeddingmodels",
 "vector/features/namespaces",
-"vector/features/resumablequery"
+"vector/features/resumablequery",
+"vector/features/sparseindexes",
+"vector/features/hybridindexes"
 ]
 },
 {

vector/api/endpoints/fetch-random.mdx

Lines changed: 5 additions & 2 deletions
@@ -23,8 +23,11 @@ The response will be `null` if the namespace is empty.
 <ResponseField name="id" type="string" required>
 The id of the vector.
 </ResponseField>
-<ResponseField name="vector" type="number[]" required>
-The vector value.
+<ResponseField name="vector" type="number[]">
+The dense vector value for dense and hybrid indexes.
+</ResponseField>
+<ResponseField name="sparseVector" type="Object[]">
+The sparse vector value for sparse and hybrid indexes.
 </ResponseField>
 
 <RequestExample>

vector/api/endpoints/fetch.mdx

Lines changed: 4 additions & 1 deletion
@@ -49,7 +49,10 @@ their vector ids.
 The id of the vector.
 </ResponseField>
 <ResponseField name="vector" type="number[]">
-The vector value.
+The dense vector value for dense and hybrid indexes.
+</ResponseField>
+<ResponseField name="sparseVector" type="Object[]">
+The sparse vector value for sparse and hybrid indexes.
 </ResponseField>
 <ResponseField name="metadata" type="Object">
 The metadata of the vector, if any.

vector/api/endpoints/query-data.mdx

Lines changed: 25 additions & 2 deletions
@@ -44,6 +44,23 @@ of fields below.
 <ParamField body="filter" type="string" default="">
 [Metadata filter](/vector/features/filtering) to apply.
 </ParamField>
+<ParamField body="weightingStrategy" type="string">
+For sparse vectors of sparse and hybrid indexes, specifies what kind of
+weighting strategy should be used while querying the matching non-zero
+dimension values of the query vector with the documents.
+
+If not provided, no weighting will be used.
+
+Only possible value is `IDF` (inverse document frequency).
+</ParamField>
+<ParamField body="fusionAlgorithm" type="string">
+Fusion algorithm to use while fusing scores
+from dense and sparse components of a hybrid index.
+
+If not provided, defaults to `RRF` (Reciprocal Rank Fusion).
+
+Other possible value is `DBSF` (Distribution-Based Score Fusion).
+</ParamField>
 
 ## Path
 
@@ -61,9 +78,12 @@ If the request was an array of more than one items, an array of
 objects below is returned, one for each query item.
 
 <Note>
-The score is normalized to always be between 0 and 1.
+For dense indexes, the score is normalized to always be between 0 and 1.
 The closer the score is to 1, the more similar the vector is to the query vector.
 This does not depend on the distance metric you use.
+
+For sparse and hybrid indexes, scores can be arbitrary values, but the score
+will be higher for more similar vectors.
 </Note>
 
 <ResponseField name="Scores" type="Object[]">
@@ -75,7 +95,10 @@ objects below is returned, one for each query item.
 The similarity score of the vector, calculated based on the distance metric of your index.
 </ResponseField>
 <ResponseField name="vector" type="number[]">
-The vector value.
+The dense vector value for dense and hybrid indexes.
+</ResponseField>
+<ResponseField name="sparseVector" type="Object[]">
+The sparse vector value for sparse and hybrid indexes.
 </ResponseField>
 <ResponseField name="metadata" type="Object">
 The metadata of the vector, if any.
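
As a rough illustration of how the two new body parameters might be used together, the sketch below builds a request body for this endpoint. The helper name `build_query_data_payload` is hypothetical, and using `data` as the raw-text field is an assumption (that field is not part of the hunks shown for this file). The allowed values for `weightingStrategy` and `fusionAlgorithm` are the ones named in the parameter descriptions.

```python
from typing import Any, Dict, Optional

# Allowed values as listed in the parameter descriptions of this commit.
WEIGHTING_STRATEGIES = {"IDF"}
FUSION_ALGORITHMS = {"RRF", "DBSF"}


def build_query_data_payload(
    data: str,
    metadata_filter: str = "",
    weighting_strategy: Optional[str] = None,
    fusion_algorithm: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a JSON-serializable request body (hypothetical helper)."""
    if weighting_strategy is not None and weighting_strategy not in WEIGHTING_STRATEGIES:
        raise ValueError(f"unsupported weightingStrategy: {weighting_strategy!r}")
    if fusion_algorithm is not None and fusion_algorithm not in FUSION_ALGORITHMS:
        raise ValueError(f"unsupported fusionAlgorithm: {fusion_algorithm!r}")

    # Assumption: the raw text to query with is sent in a "data" field.
    payload: Dict[str, Any] = {"data": data}
    if metadata_filter:
        payload["filter"] = metadata_filter
    # Optional fields are omitted so the documented defaults apply:
    # no weighting, and RRF as the fusion algorithm.
    if weighting_strategy is not None:
        payload["weightingStrategy"] = weighting_strategy
    if fusion_algorithm is not None:
        payload["fusionAlgorithm"] = fusion_algorithm
    return payload
```

Leaving both arguments as `None` keeps the fields out of the body, so the documented defaults (no weighting, `RRF`) apply.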

vector/api/endpoints/query.mdx

Lines changed: 25 additions & 2 deletions
@@ -40,6 +40,23 @@ of fields below.
 <ParamField body="filter" type="string" default="">
 [Metadata filter](/vector/features/filtering) to apply.
 </ParamField>
+<ParamField body="weightingStrategy" type="string">
+For sparse vectors of sparse and hybrid indexes, specifies what kind of
+weighting strategy should be used while querying the matching non-zero
+dimension values of the query vector with the documents.
+
+If not provided, no weighting will be used.
+
+Only possible value is `IDF` (inverse document frequency).
+</ParamField>
+<ParamField body="fusionAlgorithm" type="string">
+Fusion algorithm to use while fusing scores
+from dense and sparse components of a hybrid index.
+
+If not provided, defaults to `RRF` (Reciprocal Rank Fusion).
+
+Other possible value is `DBSF` (Distribution-Based Score Fusion).
+</ParamField>
 
 ## Path
 
@@ -57,9 +74,12 @@ If the request was an array of more than one items, an array of
 objects below is returned, one for each query item.
 
 <Note>
-The score is normalized to always be between 0 and 1.
+For dense indexes, the score is normalized to always be between 0 and 1.
 The closer the score is to 1, the more similar the vector is to the query vector.
 This does not depend on the distance metric you use.
+
+For sparse and hybrid indexes, scores can be arbitrary values, but the score
+will be higher for more similar vectors.
 </Note>
 
 <ResponseField name="Scores" type="Object[]">
@@ -71,7 +91,10 @@ objects below is returned, one for each query item.
 The similarity score of the vector, calculated based on the distance metric of your index.
 </ResponseField>
 <ResponseField name="vector" type="number[]">
-The vector value.
+The dense vector value for dense and hybrid indexes.
+</ResponseField>
+<ResponseField name="sparseVector" type="Object[]">
+The sparse vector value for sparse and hybrid indexes.
 </ResponseField>
 <ResponseField name="metadata" type="Object">
 The metadata of the vector, if any.
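
For readers unfamiliar with the default `fusionAlgorithm`, here is a minimal sketch of Reciprocal Rank Fusion in its textbook form: each result contributes `1 / (k + rank)` for every ranked list it appears in. The function name and the constant `k = 60` are assumptions chosen for illustration; the commit does not state how the service computes RRF internally.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Fuse ranked id lists; a higher fused score means a better combined rank.

    `k` dampens the influence of top ranks. 60 is a commonly used value and
    is an assumption here, not something taken from the commit.
    """
    fused: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the best result in each list has rank 1.
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return dict(fused)


# Hypothetical result ids from the dense and sparse components of a hybrid index.
dense_ranking = ["a", "b", "c"]
sparse_ranking = ["b", "c", "d"]
scores = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
ordered = sorted(scores, key=scores.get, reverse=True)
```

The fused values are small fractions rather than numbers normalized to a fixed range, which fits the note that hybrid scores can be arbitrary as long as higher means more similar.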

vector/api/endpoints/range.mdx

Lines changed: 5 additions & 2 deletions
@@ -52,8 +52,11 @@ authMethod: "GET"
 <ResponseField name="id" type="string" required>
 The id of the vector.
 </ResponseField>
-<ResponseField name="vector" type="number[]" required>
-The vector value.
+<ResponseField name="vector" type="number[]">
+The dense vector value for dense and hybrid indexes.
+</ResponseField>
+<ResponseField name="sparseVector" type="Object[]">
+The sparse vector value for sparse and hybrid indexes.
 </ResponseField>
 <ResponseField name="metadata" type="Object">
 The metadata of the vector, if any.

vector/api/endpoints/resumable-query/resume.mdx

Lines changed: 4 additions & 1 deletion
@@ -27,7 +27,10 @@ authMethod: "bearer"
 metric of your index.
 </ResponseField>
 <ResponseField name="vector" type="number[]">
-The vector value.
+The dense vector value for dense and hybrid indexes.
+</ResponseField>
+<ResponseField name="sparseVector" type="Object[]">
+The sparse vector value for sparse and hybrid indexes.
 </ResponseField>
 <ResponseField name="metadata" type="Object">
 The metadata of the vector, if any.

vector/api/endpoints/resumable-query/start-with-data.mdx

Lines changed: 19 additions & 0 deletions
@@ -40,6 +40,25 @@ authMethod: "bearer"
 Maximum idle time for the resumable query in seconds.
 </ParamField>
 
+<ParamField body="weightingStrategy" type="string">
+For sparse vectors of sparse and hybrid indexes, specifies what kind of
+weighting strategy should be used while querying the matching non-zero
+dimension values of the query vector with the documents.
+
+If not provided, no weighting will be used.
+
+Only possible value is `IDF` (inverse document frequency).
+</ParamField>
+
+<ParamField body="fusionAlgorithm" type="string">
+Fusion algorithm to use while fusing scores
+from dense and sparse components of a hybrid index.
+
+If not provided, defaults to `RRF` (Reciprocal Rank Fusion).
+
+Other possible value is `DBSF` (Distribution-Based Score Fusion).
+</ParamField>
+
 ## Path
 
 <ParamField path="namespace" type="string" default="">
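
The alternative `fusionAlgorithm` value, `DBSF`, can be sketched as follows. This follows the commonly described form of Distribution-Based Score Fusion: each component's scores are rescaled with `mean - 3σ` and `mean + 3σ` as bounds, then summed per id. The function names, the use of the population standard deviation, and the handling of a zero-variance list are assumptions; the commit does not describe the service's exact computation.

```python
import statistics
from collections import defaultdict
from typing import Dict


def dbsf_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores using mean +/- 3 standard deviations as the bounds."""
    if not scores:
        return {}
    values = list(scores.values())
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std dev (an assumption)
    if std == 0.0:
        # All scores equal: place every result at the midpoint (an assumption).
        return {doc_id: 0.5 for doc_id in scores}
    lower = mean - 3.0 * std
    upper = mean + 3.0 * std
    return {doc_id: (s - lower) / (upper - lower) for doc_id, s in scores.items()}


def dbsf_fuse(dense: Dict[str, float], sparse: Dict[str, float]) -> Dict[str, float]:
    """Sum the normalized dense and sparse scores for each id."""
    fused: Dict[str, float] = defaultdict(float)
    for component in (dbsf_normalize(dense), dbsf_normalize(sparse)):
        for doc_id, score in component.items():
            fused[doc_id] += score
    return dict(fused)
```

Unlike rank-based fusion, this approach uses the score values themselves, so a result that is far ahead in one component keeps more of that advantage after fusion.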

vector/api/endpoints/resumable-query/start-with-vector.mdx

Lines changed: 23 additions & 1 deletion
@@ -46,6 +46,25 @@ authMethod: "bearer"
 Maximum idle time for the resumable query in seconds.
 </ParamField>
 
+<ParamField body="weightingStrategy" type="string">
+For sparse vectors of sparse and hybrid indexes, specifies what kind of
+weighting strategy should be used while querying the matching non-zero
+dimension values of the query vector with the documents.
+
+If not provided, no weighting will be used.
+
+Only possible value is `IDF` (inverse document frequency).
+</ParamField>
+
+<ParamField body="fusionAlgorithm" type="string">
+Fusion algorithm to use while fusing scores
+from dense and sparse components of a hybrid index.
+
+If not provided, defaults to `RRF` (Reciprocal Rank Fusion).
+
+Other possible value is `DBSF` (Distribution-Based Score Fusion).
+</ParamField>
+
 ## Path
 
 <ParamField path="namespace" type="string" default="">
@@ -69,7 +88,10 @@ authMethod: "bearer"
 metric of your index.
 </ResponseField>
 <ResponseField name="vector" type="number[]">
-The vector value.
+The dense vector value for dense and hybrid indexes.
+</ResponseField>
+<ResponseField name="sparseVector" type="Object[]">
+The sparse vector value for sparse and hybrid indexes.
 </ResponseField>
 <ResponseField name="metadata" type="Object">
 The metadata of the vector, if any.
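
To make the `weightingStrategy` description more concrete, the sketch below scores a sparse query against a sparse document over their matching non-zero dimensions, optionally scaling each match by an IDF weight. Everything here is illustrative: representing sparse vectors as `dimension -> value` dictionaries, using an inner product, and the BM25-style IDF formula are assumptions, not details confirmed by the commit.

```python
import math
from typing import Dict, Optional

SparseVector = Dict[int, float]  # dimension index -> non-zero value (assumed shape)


def idf(n_docs: int, doc_freq: int) -> float:
    """BM25-style inverse document frequency (one common variant)."""
    return math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0)


def sparse_score(
    query: SparseVector,
    doc: SparseVector,
    idf_weights: Optional[Dict[int, float]] = None,
) -> float:
    """Inner product over matching dimensions, optionally IDF-weighted."""
    total = 0.0
    for dim, q_value in query.items():
        if dim not in doc:
            continue  # only dimensions non-zero in both vectors contribute
        # Without weights every match counts equally (weight 1.0).
        weight = 1.0 if idf_weights is None else idf_weights.get(dim, 1.0)
        total += weight * q_value * doc[dim]
    return total
```

With IDF weights, dimensions that occur in few documents get a larger weight than dimensions that occur in most of them, so rare matches influence the score more.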

vector/api/endpoints/update.mdx

Lines changed: 9 additions & 1 deletion
@@ -19,9 +19,12 @@ of those.
 The id of the vector.
 </ParamField>
 <ParamField body="vector" type="number[]">
-The vector value to update to.
+The dense vector value to update to for dense and hybrid indexes.
 <Note>The vector should have the same dimensions as your index.</Note>
 </ParamField>
+<ParamField body="sparseVector" type="Object[]">
+The sparse vector value to update to for sparse and hybrid indexes.
+</ParamField>
 <ParamField body="data" type="string">
 The raw text data to update to.
 <Note>If the index is created with an [embedding model](/vector/features/embeddingmodels)
@@ -38,6 +41,11 @@ of those.
 `OVERWRITE` for overwrite, `PATCH` for patch.
 </ParamField>
 
+<Note>
+For hybrid indexes either none or both of `vector` and `sparseVector` fields
+must be present. It is not allowed to update only `vector` or `sparseVector`.
+</Note>
+
 ## Path
 
 <ParamField path="namespace" type="string" default="">
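
The rule in the new note (for hybrid indexes, `vector` and `sparseVector` must be given together or not at all) could be checked on the client side before sending an update. The helper below is a hypothetical sketch: it only checks which keys are present in the request body and makes no assumption about the shape of the sparse vector value.

```python
from typing import Any, Dict


def validate_hybrid_update(payload: Dict[str, Any]) -> None:
    """Raise ValueError if only one of `vector` / `sparseVector` is present.

    Hypothetical client-side check for updates against a hybrid index.
    """
    has_dense = "vector" in payload
    has_sparse = "sparseVector" in payload
    if has_dense != has_sparse:
        missing = "sparseVector" if has_dense else "vector"
        raise ValueError(
            f"hybrid index update must include both vector fields or neither; "
            f"missing {missing!r}"
        )
```

An update that touches only other fields, such as `data`, passes the check because neither vector field is present.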
