commit c62fa17 (1 parent: 6332703)
Showing 7 changed files with 58 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -45,6 +45,7 @@ htmlcov/ | |
.cache | ||
nosetests.xml | ||
coverage.xml | ||
coverage.json | ||
*.cover | ||
*.py,cover | ||
.hypothesis/ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,50 @@ | ||
<!--- BADGES: START ---> | ||
[HF Nuclia](https://huggingface.co/nuclia) | ||
[GitHub - License][#github-license] | ||
[PyPI - Python Version][#pypi-package] | ||
[PyPI - Package Version][#pypi-package] | ||
[Code coverage](https://github.com/nuclia/nuclia-eval/actions) | ||
|
||
|
||
[#github-license]: https://github.com/nuclia/nuclia-eval/blob/master/LICENSE | ||
[#pypi-package]: https://pypi.org/project/nuclia-eval/ | ||
<!--- BADGES: END ---> | ||
|
||
# nuclia-eval | ||
<p align="center"> | ||
<img src="assets/Nuclia_vertical.png" width="350" title="nuclia logo" alt="nuclia, the all-in-one RAG as a service platform."> | ||
</p> | ||
|
||
Library for evaluating RAG using Nuclia's models | ||
|
||
Its evaluation follows the RAG triad as proposed by [TruLens](https://www.trulens.org/trulens_eval/getting_started/core_concepts/rag_triad/): | ||
|
||
[Image: rag triad] | ||
|
||
In summary, the metrics **nuclia-eval** provides for a RAG Experience involving a **question**, an **answer**, and N pieces of **context** are: | ||
|
||
* **Answer Relevance**: The directness and appropriateness of the response in addressing the specific question asked, providing accurate, complete, and contextually suitable information. | ||
  * **score**: A score between 0 and 5 indicating the relevance of the answer to the question. | ||
  * **reason**: A string explaining the reason for the score. | ||
* For each of the N pieces of context: | ||
  * **Context Relevance Score**: The relevance of the **context** to the **question**, on a scale of 0 to 5. | ||
  * **Groundedness Score**: The degree to which the **answer** contains information that is substantially similar or identical to that in the **context** piece, on a scale of 0 to 5. | ||
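
The shape of these metrics can be sketched as plain Python dataclasses. This is only an illustration of the structure described in the list above, not the library's actual return types: the class names `AnswerRelevance`, `ContextMetrics`, and `RAGEvaluation`, the use of `float` for scores, and the sample values are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


def _check_score(name: str, value: float) -> None:
    # The README states that every score lies between 0 and 5.
    if not 0 <= value <= 5:
        raise ValueError(f"{name} must be between 0 and 5, got {value}")


@dataclass
class AnswerRelevance:  # hypothetical name, not part of nuclia-eval
    score: float
    reason: str

    def __post_init__(self) -> None:
        _check_score("answer relevance score", self.score)


@dataclass
class ContextMetrics:  # hypothetical name, one instance per context piece
    context_relevance: float
    groundedness: float

    def __post_init__(self) -> None:
        _check_score("context relevance score", self.context_relevance)
        _check_score("groundedness score", self.groundedness)


@dataclass
class RAGEvaluation:  # hypothetical name, groups the whole RAG triad
    answer_relevance: AnswerRelevance
    per_context: List[ContextMetrics]


# Made-up values for a question answered with N = 2 context pieces.
example = RAGEvaluation(
    answer_relevance=AnswerRelevance(score=4, reason="Directly answers the question."),
    per_context=[
        ContextMetrics(context_relevance=5, groundedness=4),
        ContextMetrics(context_relevance=1, groundedness=0),
    ],
)

# The context piece that best supports the answer.
best = max(example.per_context, key=lambda m: m.groundedness)
print(best)
```

Keeping the per-context scores in a list makes it straightforward to ask which context piece best grounds the answer, or whether any retrieved piece was relevant at all.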
|
||
## Available Models | ||
|
||
### REMi-v0 | ||
|
||
[REMi-v0](https://huggingface.co/nuclia/REMi-v0) (RAG Evaluation MetrIcs) is a LoRA adapter for the | ||
[Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) model. | ||
|
||
It has been fine-tuned by the team at [**nuclia**](https://nuclia.com) to evaluate the quality of the overall RAG experience. | ||
|
||
## Usage | ||
|
||
```python | ||
from nuclia_eval.models.remi import REMiEvaluator | ||
|
||
evaluator = REMiEvaluator() | ||
|
||
... | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,4 @@ | ||
"""This module contains the ML models used to evaluate the quality of the RAG experience.""" | ||
from nuclia_eval.models.remi import REMiEvaluator | ||
|
||
__all__ = ["REMiEvaluator"] |
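
The `__all__` list above controls which names a star import exposes from the package. A minimal, self-contained sketch of that effect follows; it builds a throwaway in-memory module called `demo_models` (a hypothetical name, not part of nuclia-eval) with placeholder definitions, so nothing here reflects the real `REMiEvaluator` implementation.

```python
import sys
import types

# Build a throwaway module in memory; "demo_models" is a hypothetical name.
demo = types.ModuleType("demo_models")
exec(
    "class REMiEvaluator:\n"
    "    pass\n"
    "\n"
    "def load_weights():\n"
    "    pass\n"
    "\n"
    "__all__ = ['REMiEvaluator']\n",
    demo.__dict__,
)
sys.modules["demo_models"] = demo

# Run a star import into a fresh namespace and list what it brought in.
namespace = {}
exec("from demo_models import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)  # only the name listed in __all__ is imported
```

Names left out of `__all__`, such as `load_weights` here, remain reachable through an explicit import; only the star import skips them.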