Test CI

nuclia · Jul 19, 2024 · c62fa17
1 parent 6332703

Showing 7 changed files with 58 additions and 3 deletions.
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -42,5 +42,8 @@ jobs:
               coverageFile: coverage.xml
               token: ${{ secrets.GITHUB_TOKEN }}
               thresholdAll: 0.9
-
-
+        - name: Update Coverage Badge
+          # GitHub actions: default branch variable
+          # Disabled for now to test the action
+          # if: github.ref == format('refs/heads/{0}', github.event.repository.default_branch)
+          uses: we-cli/coverage-badge-action@main
diff --git a/.gitignore b/.gitignore
@@ -45,6 +45,7 @@ htmlcov/
 .cache
 nosetests.xml
 coverage.xml
+coverage.json
 *.cover
 *.py,cover
 .hypothesis/

diff --git a/README.md b/README.md
@@ -1,2 +1,50 @@
+<!--- BADGES: START --->
+[![HF Nuclia](https://img.shields.io/badge/%F0%9F%A4%97-models-yellow)](https://huggingface.co/nuclia)
+[![GitHub - License](https://img.shields.io/github/license/nuclia/nuclia-eval?logo=github&style=flat&color=green)][#github-license]
+[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/nuclia-eval?logo=pypi&style=flat&color=blue)][#pypi-package]
+[![PyPI - Package Version](https://img.shields.io/pypi/v/nuclia-eval?logo=pypi&style=flat&color=orange)][#pypi-package]
+[![Code coverage](https://nuclia.github.io/nuclia-eval/badges/coverage.svg)](https://github.com/nuclia/nuclia-eval/actions)
+
+
+[#github-license]: https://github.com/nuclia/nuclia-eval/blob/master/LICENSE
+[#pypi-package]: https://pypi.org/project/nuclia-eval/
+<!--- BADGES: END --->
+
 # nuclia-eval
+<p align="center">
+  <img src="assets/Nuclia_vertical.png" width="350" title="nuclia logo" alt="nuclia, the all-in-one RAG as a service platform.">
+</p>
+
 Library for evaluating RAG using Nuclia's models
+
+Its evaluation follows the RAG triad as proposed by [TruLens](https://www.trulens.org/trulens_eval/getting_started/core_concepts/rag_triad/):
+
+![rag triad](assets/RAG_Triad.jpg)
+
+In summary, the metrics **nuclia-eval** provides for a RAG experience involving a **question**, an **answer**, and N pieces of **context** are:
+
+* **Answer Relevance**: Answer relevance refers to the directness and appropriateness of the response in addressing the specific question asked, providing accurate, complete, and contextually suitable information.
+    * **score**: A score between 0 and 5 indicating the relevance of the answer to the question.
+    * **reason**: A string explaining the reason for the score.
+* For each of the N pieces of context:
+    * **Context Relevance Score**: The relevance of the **context** to the **question**, on a scale of 0 to 5.
+    * **Groundedness Score**: The degree to which the **answer** contains information that is substantially similar or identical to that in the **context** piece, on a scale of 0 to 5.
+
+## Available Models
+
+### REMi-v0
+
+[REMi-v0](https://huggingface.co/nuclia/REMi-v0) (RAG Evaluation MetrIcs) is a LoRA adapter for the
+[Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) model.
+
+It has been fine-tuned by the team at [**nuclia**](https://nuclia.com) to evaluate the quality of the overall RAG experience.
+
+## Usage
+
+```python
+from nuclia_eval.models import REMiEvaluator
+
+evaluator = REMiEvaluator()
+
+...
+```
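
Note on the README hunk above: the metrics it lists (one answer relevance score with a reason, plus a context relevance score and a groundedness score per context piece, each from 0 to 5) can be pictured as a small result structure. The sketch below is only an illustration of that shape. The names `check_score`, `AnswerRelevance`, `ContextScores`, and `RagEvaluation` are hypothetical and are not taken from the nuclia-eval API, and treating scores as floats is an assumption, since the README only gives the range.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical types for illustration only; not the nuclia-eval API.


def check_score(value: float) -> float:
    """Validate that a score lies in the 0 to 5 range described in the README."""
    if not 0 <= value <= 5:
        raise ValueError(f"score must be between 0 and 5, got {value}")
    return value


@dataclass(frozen=True)
class AnswerRelevance:
    score: float  # relevance of the answer to the question
    reason: str   # explanation for the score

    def __post_init__(self) -> None:
        check_score(self.score)


@dataclass(frozen=True)
class ContextScores:
    context_relevance: float  # relevance of this context piece to the question
    groundedness: float       # overlap between the answer and this context piece

    def __post_init__(self) -> None:
        check_score(self.context_relevance)
        check_score(self.groundedness)


@dataclass(frozen=True)
class RagEvaluation:
    answer_relevance: AnswerRelevance
    per_context: List[ContextScores]  # one entry per context piece (N in total)
```

Validating the range in `__post_init__` means an out-of-range score fails at construction time rather than surfacing later in a report.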
diff --git a/assets/Nuclia_vertical.png b/assets/Nuclia_vertical.png
diff --git a/assets/RAG_Triad.jpg b/assets/RAG_Triad.jpg
diff --git a/pyproject.toml b/pyproject.toml
@@ -10,7 +10,7 @@ source = "src"
 
 [tool.pytest.ini_options]
 testpaths = ["./tests"]
-addopts = "--cov=nuclia_eval --cov-report=xml --cov-report term"
+addopts = "--cov=nuclia_eval --cov-report=xml --cov-report term --cov-report json"
 
 [tool.mypy]
 ignore_missing_imports = true
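
Note on the `addopts` change above: with `--cov-report json`, the test run also writes a `coverage.json` file, which is why that name is added to `.gitignore`. As a rough sketch of how such a report could be consumed, the snippet below reads the total coverage and compares it with the 0.9 value that appears as `thresholdAll` in `ci.yml`. It assumes coverage.py's JSON layout, where `totals.percent_covered` is a percentage from 0 to 100. The helper names `total_coverage` and `meets_threshold` are hypothetical, and the sample report is a stand-in written inline so the snippet runs on its own.

```python
import json
import tempfile
from pathlib import Path


def total_coverage(report_path: Path) -> float:
    """Return total coverage as a fraction in [0, 1]."""
    report = json.loads(report_path.read_text(encoding="utf-8"))
    # Assumption: coverage.py's JSON layout, with totals.percent_covered in 0-100.
    return report["totals"]["percent_covered"] / 100.0


def meets_threshold(report_path: Path, threshold: float = 0.9) -> bool:
    """Check the total coverage fraction against a threshold such as 0.9."""
    return total_coverage(report_path) >= threshold


# Stand-in for a real coverage.json so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "coverage.json"
    sample.write_text(json.dumps({"totals": {"percent_covered": 93.5}}), encoding="utf-8")
    SAMPLE_FRACTION = total_coverage(sample)
    SAMPLE_OK = meets_threshold(sample)
```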

diff --git a/src/nuclia_eval/models/__init__.py b/src/nuclia_eval/models/__init__.py
@@ -1 +1,4 @@
 """This module contains the ML models used to evaluate the quality of the RAG experience."""
+from nuclia_eval.models.remi import REMiEvaluator
+
+__all__ = ["REMiEvaluator"]