A modular service for static LLM evaluation of LLM outputs.
- Python 3.12 or higher
- Dependencies are managed through
pyproject.toml
- Install UV (if not already installed):
curl -LsSf https://astral.sh/uv/install.sh | sh
- Create and activate a virtual environment using UV:
uv venv
source .venv/bin/activate # On Unix/macOS
# or
.venv\Scripts\activate # On Windows
- Install dependencies using UV:
uv pip install -e .
Located in scripts/chatbot/
:
-
simulate_convo.py
- Creates simulated conversations for testing and evaluation purposes- Usage:
uv run scripts/chatbot/simulate_convo.py
- Usage:
-
pre_merge_check.py
- Runs validation checks before merging code changes- Usage:
uv run scripts/chatbot/pre_merge_check.py
- Usage:
-
nightly_report.py
- Generates daily evaluation reports- Usage:
uv run scripts/chatbot/nightly_report.py
- Usage:
convo_eval.py
- Core evaluation script for analyzing conversations
- Usage:
uv run convo_eval.py
Located in scripts/faq_generator/
:
faq_eval.py
- Evaluates FAQ content using DeepEval metrics- Usage:
uv run scripts/faq_generator/faq_eval.py --input "Your prompt" --content "Generated FAQ content" --context "Reference material"
- Required arguments:
--input
: The input prompt text used to generate the FAQ--content
: The generated FAQ content to evaluate--context
: The reference material or ground truth to check against
- Output: Generates a CSV file in
deepeval_results/faq_eval/
with evaluation metrics including:- Hallucination score
- Evaluation reasoning
- Cost metrics
- Requirements:
- DEEPEVAL_API_KEY environment variable must be set
- Python 3.12 or higher
- DeepEval package installed
- Usage:
src/
- Source code directoryscripts/
- Utility scripts for various taskschatbot/
- Chatbot evaluation and testing scriptsfaq_generator/
- FAQ generation scripts
mock_data/
- Sample data for testingdeepeval_results/
- Output directory for evaluation results
Main dependencies include:
- deepeval (>=2.7.6)
- deepteam (>=0.0.9)
- Google API Client Libraries
- python-dotenv
The project uses environment variables for configuration. Create a .env
file in the root directory with necessary credentials and settings.
- Follow the existing code structure and style
- Update documentation as needed