63 changes: 63 additions & 0 deletions .github/workflows/ci-monitoring.yml
@@ -0,0 +1,63 @@
name: Monitoring Tests

on:
  push:
    paths:
      - 'gpt_oss/monitoring/**'
      - 'tests/monitoring/**'
      - '.github/workflows/ci-monitoring.yml'
  pull_request:
    paths:
      - 'gpt_oss/monitoring/**'
      - 'tests/monitoring/**'
      - '.github/workflows/ci-monitoring.yml'

jobs:
  test-monitoring:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12"]

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}

      - name: Cache pip dependencies
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ matrix.python-version }}-${{ hashFiles('**/pyproject.toml') }}
          restore-keys: |
            ${{ runner.os }}-pip-${{ matrix.python-version }}-

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e ".[monitoring,test]"

      - name: Run monitoring tests
        run: |
          pytest tests/monitoring/ -v --tb=short

      - name: Run monitoring tests (fast only)
        run: |
          pytest tests/monitoring/ -v --tb=short -k "not slow"

      - name: Test CLI
        run: |
          python -m gpt_oss.monitoring.__main__ --help

      - name: Test basic functionality
        run: |
          python -c "
          from gpt_oss.monitoring import HallucinationMonitor
          monitor = HallucinationMonitor()
          results = monitor.evaluate('Test prompt', 'Test completion')
          print(f'Risk level: {results[\"risk_level\"]}')
          print(f'Risk score: {results[\"risk_score\"]:.3f}')
          "
39 changes: 38 additions & 1 deletion .gitignore
@@ -4,4 +4,41 @@ tmp*
__pycache__
*.egg*
node_modules/
*.log

# Monitoring demo files and reports
demo_reports/
sample_reports/
runs/
professional_reports/
*.html
*.pdf

# Virtual environments
monitoring_env/
venv/
env/
.env

# IDE and editor files
.vscode/
.idea/
*.swp
*.swo
*~

# OS generated files
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

# Python cache and compiled files
*.pyc
*.pyo
*.pyd
.Python
*.so
102 changes: 102 additions & 0 deletions README.md
@@ -32,6 +32,7 @@ Both models were trained using our [harmony response format][harmony] and should
- [Harmony format & tools](#harmony-format--tools)
- [Clients](#clients)
- [Tools](#tools)
- [Hallucination Monitor](#hallucination-monitor)
- [Other details](#other-details)
- [Contributing](#contributing)

Expand Down Expand Up @@ -478,6 +479,107 @@ if last_message.recipient == "python":

`apply_patch` can be used to create, update or delete files locally.

## Hallucination Monitor

The GPT-OSS Hallucination Monitor estimates hallucination risk in LLM outputs. It reports a risk level and a risk score based on five detection signals, and it can generate HTML reports that help locate potential issues in a completion.

### ๐ŸŒ Web Interface (Recommended)

The easiest way to try the hallucination monitor is through our beautiful web interface:

```bash
# Install dependencies
pip install streamlit plotly
pip install -e ".[monitoring]"

# Launch the web interface
python gpt_oss/monitoring/run_web_app.py
```

This opens the web interface at `http://localhost:8501` with:
- Visualizations (gauge charts, radar plots, bar charts)
- Interactive configuration sliders
- Quick example buttons
- Real-time analysis results
- Color-coded risk highlighting
- HTML report generation

### Comprehensive Demo

Run the full demo to see all features in action:

```bash
# Run the comprehensive demo
python -m gpt_oss.monitoring.demo.demo
```

The demo showcases:
- Basic usage examples
- Different detection scenarios (truthful, hallucinated, and numeric-error completions)
- Professional HTML report generation
- Web interface information
- CLI usage examples
- Advanced configuration features

### Quick Start

```bash
# Install monitoring dependencies
pip install -e ".[monitoring]"

# Basic usage
gpt-oss-monitor --prompt prompt.txt --completion output.txt

# With context documents and HTML report
gpt-oss-monitor --prompt prompt.txt --completion output.txt --contexts ctx1.txt ctx2.txt --html
```
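
For scripted use, the same invocation can be assembled from Python. The sketch below only builds the argument list from the flags shown above (`--prompt`, `--completion`, `--contexts`, `--html`). The helper name `build_monitor_command` is hypothetical and is not part of the package.

```python
from pathlib import Path
from typing import Iterable, List, Union

PathLike = Union[str, Path]


def build_monitor_command(
    prompt: PathLike,
    completion: PathLike,
    contexts: Iterable[PathLike] = (),
    html: bool = False,
) -> List[str]:
    """Build a gpt-oss-monitor argument list (hypothetical helper).

    Only the flags shown in the Quick Start above are used.
    """
    cmd = [
        "gpt-oss-monitor",
        "--prompt", str(prompt),
        "--completion", str(completion),
    ]
    context_paths = [str(p) for p in contexts]
    if context_paths:
        # All context files follow a single --contexts flag, as in the Quick Start.
        cmd += ["--contexts", *context_paths]
    if html:
        cmd.append("--html")
    return cmd


if __name__ == "__main__":
    print(build_monitor_command("prompt.txt", "output.txt", ["ctx1.txt", "ctx2.txt"], html=True))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the package is installed.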

### Python API

```python
from gpt_oss.monitoring import HallucinationMonitor, MonitorConfig

# Initialize monitor
monitor = HallucinationMonitor()

# Evaluate a completion
results = monitor.evaluate(
    prompt="What is the capital of France?",
    completion="Paris is the capital of France with 2.2 million people.",
    context_docs=["Paris is the capital and most populous city of France."]
)

print(f"Risk Level: {results['risk_level']}")
print(f"Risk Score: {results['risk_score']:.3f}")
```
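
The returned dictionary can drive a simple gate in a script or CI job. This is a minimal sketch that relies only on the `risk_score` key used above. The helper name `exceeds_risk_threshold` and the `0.5` cutoff are hypothetical, and the sketch assumes that a higher score means higher risk.

```python
from typing import Any, Mapping


def exceeds_risk_threshold(results: Mapping[str, Any], max_score: float = 0.5) -> bool:
    """Return True when the reported risk score is above max_score.

    Hypothetical helper: assumes a higher `risk_score` means higher risk.
    """
    return float(results["risk_score"]) > max_score


if __name__ == "__main__":
    # Stand-in for the dictionary returned by monitor.evaluate(...).
    example_results = {"risk_level": "high", "risk_score": 0.82}
    if exceeds_risk_threshold(example_results):
        print("Completion flagged for review")
```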

### Detection Signals

The monitor uses five key signals to assess hallucination risk:

- **Self-Consistency (SC)**: Generates k samples and computes semantic agreement
- **NLI Faithfulness**: Checks sentence-level entailment against prompt and context
- **Numeric Sanity**: Detects arithmetic and unit consistency issues
- **Retrieval Support**: Verifies claims against provided context documents
- **Jailbreak Heuristics**: Identifies potential safety risks
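
To make the self-consistency idea concrete, the sketch below scores agreement across k sampled completions as the mean pairwise similarity. It uses token-level Jaccard overlap as a stand-in for semantic agreement, so it is an illustration of the idea rather than the monitor's actual implementation. All function names here are hypothetical.

```python
from itertools import combinations
from typing import List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercase whitespace tokens; a crude stand-in for semantic features."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the token sets of two strings."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta and not tb:
        return 1.0  # two empty strings agree trivially
    return len(ta & tb) / len(ta | tb)


def self_consistency(samples: List[str]) -> float:
    """Mean pairwise agreement across samples (1.0 means identical token sets)."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        return 1.0  # zero or one sample: nothing to disagree with
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    samples = [
        "Paris is the capital of France",
        "The capital of France is Paris",
        "Lyon is the capital of France",
    ]
    print(f"Agreement: {self_consistency(samples):.3f}")
```

Low agreement across samples suggests the model is uncertain about the answer, which is the intuition behind treating it as a risk signal.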

### Features

- **Configurable**: Customize thresholds, weights, and detection parameters
- **HTML Reports**: Interactive reports with highlighted spans
- **CLI Interface**: Easy command-line usage with file inputs
- **Lightweight**: Runs on CPU (GPU optional), with fallback heuristics
- **Deterministic**: Seeded RNG for reproducible results
- **Web Interface**: Streamlit app with real-time analysis
- **Professional Design**: GPT-OSS branding with a blue color scheme
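
One way to picture how configurable weights and thresholds could interact is a weighted mean of per-signal scores followed by a threshold lookup. The sketch below is an assumption about the general approach, not the monitor's actual formula. The signal names, weights, and cutoffs are all hypothetical.

```python
from typing import Mapping


def combine_signals(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted mean of per-signal risk scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights for the given signals must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def risk_level(score: float, low: float = 0.3, high: float = 0.7) -> str:
    """Map a score to a label using two hypothetical cutoffs."""
    if score < low:
        return "low"
    if score < high:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Hypothetical signal names and values, for illustration only.
    scores = {"nli": 0.8, "numeric": 0.2}
    weights = {"nli": 3.0, "numeric": 1.0}
    combined = combine_signals(scores, weights)
    print(f"{combined:.3f} -> {risk_level(combined)}")
```

Raising a signal's weight makes the combined score track that signal more closely, while the cutoffs control how early a completion is labeled as risky.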

### Examples

For detailed examples and usage patterns, see:
- [Monitoring Examples](gpt_oss/monitoring/examples/README.md)
- [Demo Script](gpt_oss/monitoring/demo/README.md)
- [Web Interface](gpt_oss/monitoring/run_web_app.py)

## Other details

### Precision format