Allow thresholding on vector and fulltext indexes for Hybrid retrievers #239
95dd2d9 to 4c1976c
@@ -159,6 +161,8 @@ def get_search_results(
query_text (str): The text to get the closest neighbors of.
query_vector (Optional[list[float]], optional): The vector embeddings to get the closest neighbors of. Defaults to None.
top_k (int, optional): The number of neighbors to return. Defaults to 5.
threshold_vector_index (float, optional): The minimum normalized score from the vector index to include in the top k search.
threshold_fulltext_index (float, optional): The minimum normalized score from the fulltext index to include in the top k search.
Minor point, but users might not know or understand the normalisation process. It could be worth adding an example to the docs that explains what these parameters do, as I think these descriptions might not be enough on their own.
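As a starting point for such a docs example, here is a minimal sketch of what the two parameters could mean. It assumes each index's scores are normalized by dividing by that index's maximum score, that scores below a threshold are set to zero, and that a node found in both indexes keeps its higher score. The helper names (`normalize_scores`, `apply_threshold`, `hybrid_top_k`) are hypothetical and are not part of the library; only `top_k`, `threshold_vector_index` and `threshold_fulltext_index` come from the docstring above.

```python
from typing import Optional


def normalize_scores(scores: dict[str, float]) -> dict[str, float]:
    """Assumed normalization: divide by the index's max score, so the best hit is 1.0."""
    if not scores:
        return {}
    max_score = max(scores.values())
    if max_score <= 0:
        return {node: 0.0 for node in scores}
    return {node: score / max_score for node, score in scores.items()}


def apply_threshold(
    scores: dict[str, float], threshold: Optional[float]
) -> dict[str, float]:
    """Assumed thresholding: scores below the threshold are set to zero."""
    if threshold is None:
        return dict(scores)
    return {node: (s if s >= threshold else 0.0) for node, s in scores.items()}


def hybrid_top_k(
    vector_scores: dict[str, float],
    fulltext_scores: dict[str, float],
    top_k: int = 5,
    threshold_vector_index: Optional[float] = None,
    threshold_fulltext_index: Optional[float] = None,
) -> list[tuple[str, float]]:
    """Hypothetical merge: normalize and threshold per index, keep the best score per node."""
    vector = apply_threshold(normalize_scores(vector_scores), threshold_vector_index)
    fulltext = apply_threshold(
        normalize_scores(fulltext_scores), threshold_fulltext_index
    )
    combined: dict[str, float] = {}
    for scores in (vector, fulltext):
        for node, score in scores.items():
            combined[node] = max(combined.get(node, 0.0), score)
    ranked = sorted(combined.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Raw scores live on different scales: similarity for vector, Lucene-style for fulltext.
results = hybrid_top_k(
    vector_scores={"a": 0.8, "b": 0.4},
    fulltext_scores={"b": 2.0, "c": 4.0},
    threshold_vector_index=0.6,
    threshold_fulltext_index=0.6,
)
print(results)
```

In this sketch, node `b` normalizes to 0.5 in both indexes, so a threshold of 0.6 on each index zeroes it out, while `a` and `c` keep their normalized score of 1.0. Because the thresholds apply to normalized scores, the same value means "at least 60% as good as the best hit from that index" regardless of each index's raw scale.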
@@ -37,12 +37,14 @@ def test_hybrid_search_basic() -> None:
"YIELD node, score "
Could it be worth adding a test to check that the threshold process works as expected, i.e. that the correct scores are set to zero?
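A rough shape for such a test, written against a stand-in helper rather than the real query: `threshold_scores` is hypothetical and only mirrors the behaviour described here (normalized scores below the threshold become zero). A real test would instead assert on the retriever's output or the generated query.

```python
from typing import Optional


def threshold_scores(
    normalized: dict[str, float], threshold: Optional[float]
) -> dict[str, float]:
    """Hypothetical stand-in: zero out normalized scores below the threshold."""
    if threshold is None:
        return dict(normalized)
    return {node: (s if s >= threshold else 0.0) for node, s in normalized.items()}


def test_scores_below_threshold_are_zeroed() -> None:
    normalized = {"n1": 1.0, "n2": 0.75, "n3": 0.25}
    result = threshold_scores(normalized, threshold=0.5)
    # Scores at or above the threshold are untouched.
    assert result["n1"] == 1.0
    assert result["n2"] == 0.75
    # Scores below the threshold are set to zero, not dropped.
    assert result["n3"] == 0.0


def test_score_equal_to_threshold_is_kept() -> None:
    result = threshold_scores({"n1": 0.5}, threshold=0.5)
    assert result == {"n1": 0.5}


def test_no_threshold_keeps_all_scores() -> None:
    normalized = {"n1": 1.0, "n2": 0.25}
    assert threshold_scores(normalized, threshold=None) == normalized
```

The boundary case is worth pinning down explicitly, since "minimum normalized score" in the docstring suggests a score equal to the threshold should be kept.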
LGTM, just a few minor points.
Description
Allow thresholding on vector and fulltext indexes for Hybrid retrievers. The user can provide two thresholds during search, one for the vector index and one for the fulltext index. Each sets the minimum normalized score a result from that index must reach to be included in the top k search.
Type of Change
Complexity
Complexity: Low
How Has This Been Tested?
Checklist
The following requirements should have been met (depending on the changes in the branch):