{
  "actions": [
    {
      "ticket_id": "H001",
      "priority": "urgent",
      "department": "technical",
      "response": "I'm treating this as a P0 incident...",
      "needs_human": true,
      "reasoning": "Enterprise SSO down = production blocker"
    }
  ]
}
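As a minimal sketch of how a client might assemble this payload, the dataclass below mirrors the field names shown in the JSON above. The names `TriageAction` and `build_payload` are hypothetical and chosen for illustration; the environment's own typed models may be defined differently, and the set of allowed `priority` and `department` values is not specified here.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class TriageAction:
    """One triage decision; field names mirror the JSON payload above.

    Hypothetical class for illustration, not the environment's own model.
    """
    ticket_id: str
    priority: str
    department: str
    response: str
    needs_human: bool
    reasoning: str = ""


def build_payload(actions):
    """Wrap a list of TriageAction objects in the {"actions": [...]} envelope."""
    return {"actions": [asdict(a) for a in actions]}


action = TriageAction(
    ticket_id="H001",
    priority="urgent",
    department="technical",
    response="I'm treating this as a P0 incident...",
    needs_human=True,
    reasoning="Enterprise SSO down = production blocker",
)
payload = build_payload([action])
print(json.dumps(payload, indent=2))
```

Keeping the action as a typed object makes it harder to misspell a field name before the payload is serialized.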
Reward Function
Each ticket is graded by a deterministic grader:
| Component | Weight | Description |
| --- | --- | --- |
| Priority accuracy | 30% | Exact match = 1.0; off by 1 level = 0.5 |
| Routing accuracy | 30% | Exact match = 1.0; adjacent dept = 0.3 |
| Response quality | 25% | Keyword coverage from ground-truth model answers |
| Escalation correctness | 15% | Correct needs_human flag |
Partial credit is awarded throughout — the reward signal is dense, not sparse.
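The weighting can be sketched as follows. This is an illustration under stated assumptions, not the grader itself: the weights and partial-credit values come from the reward description, while the ordering of priority levels (`PRIORITY_LEVELS`), the department adjacency map (`ADJACENT_DEPTS`), the substring-based keyword matching, and all function names are assumptions made for the example.

```python
# Assumed ordering of priority levels; the real grader's levels may differ.
PRIORITY_LEVELS = ["low", "medium", "high", "urgent"]
# Hypothetical adjacency between departments, for illustration only.
ADJACENT_DEPTS = {("technical", "product"), ("billing", "account")}
# Weights taken from the reward description.
WEIGHTS = {"priority": 0.30, "routing": 0.30, "response": 0.25, "escalation": 0.15}


def priority_score(pred, truth):
    """1.0 for an exact match, 0.5 when off by one level, else 0.0."""
    if pred not in PRIORITY_LEVELS or truth not in PRIORITY_LEVELS:
        return 0.0
    gap = abs(PRIORITY_LEVELS.index(pred) - PRIORITY_LEVELS.index(truth))
    return 1.0 if gap == 0 else 0.5 if gap == 1 else 0.0


def routing_score(pred, truth):
    """1.0 for an exact match, 0.3 for an adjacent department, else 0.0."""
    if pred == truth:
        return 1.0
    if (pred, truth) in ADJACENT_DEPTS or (truth, pred) in ADJACENT_DEPTS:
        return 0.3
    return 0.0


def response_score(response, keywords):
    """Fraction of ground-truth keywords found in the response (case-insensitive)."""
    if not keywords:
        return 0.0
    text = response.lower()
    return sum(1 for kw in keywords if kw.lower() in text) / len(keywords)


def ticket_reward(action, truth):
    """Weighted sum of the four component scores for one ticket."""
    parts = {
        "priority": priority_score(action["priority"], truth["priority"]),
        "routing": routing_score(action["department"], truth["department"]),
        "response": response_score(action["response"], truth["keywords"]),
        "escalation": 1.0 if action["needs_human"] == truth["needs_human"] else 0.0,
    }
    return sum(WEIGHTS[name] * parts[name] for name in WEIGHTS)


action = {"priority": "high", "department": "technical",
          "response": "Treating this as a P0 incident.", "needs_human": True}
truth = {"priority": "urgent", "department": "technical",
         "keywords": ["P0", "rollback"], "needs_human": True}
reward = ticket_reward(action, truth)
```

In this made-up example the priority is off by one level (0.5), routing is exact (1.0), one of two keywords is covered (0.5), and the escalation flag matches (1.0), so the reward works out to 0.30 × 0.5 + 0.30 × 1.0 + 0.25 × 0.5 + 0.15 × 1.0 = 0.725.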
API Endpoints
| Method | Path | Description |
| --- | --- | --- |
| POST | /reset | Start new episode |
| POST | /step | Submit triage actions |
| GET | /state | Full environment snapshot |
| GET | /health | Liveness probe |
| GET | /tasks | List available tasks |
| GET | /openenv.yaml | Spec file |
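A minimal client sketch for these endpoints, using only the Python standard library. The helper names (`make_request`, `send`, `run_once`) are hypothetical, the base URL assumes a server running locally on port 7860, and it is an assumption that every endpoint called here returns a JSON body and that `/step` accepts the `{"actions": [...]}` payload directly. The `/reset` body follows the `task` and `seed` fields used in the curl example in this README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # assumed local server address


def make_request(method, path, body=None, base_url=BASE_URL):
    """Build a urllib Request; a JSON body is attached only when one is given."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(base_url + path, data=data,
                                  headers=headers, method=method)


def send(request, timeout=10.0):
    """Send the request and decode the reply, assuming the server returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


reset_req = make_request("POST", "/reset", {"task": "easy", "seed": 42})
health_req = make_request("GET", "/health")


def run_once(actions_payload):
    """Probe liveness, start an episode, and submit one batch of actions.

    Requires a running server; the shape of each reply is not assumed here.
    """
    health = send(health_req)
    observation = send(reset_req)
    result = send(make_request("POST", "/step", actions_payload))
    return health, observation, result
```

Building the request separately from sending it keeps the payloads easy to inspect without a live server; call `run_once(...)` once the server is up.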
Setup & Usage
Local development
# Clone and install
git clone https://huggingface.co/spaces/YOUR_HF_USERNAME/support-triage
cd support-triage
pip install -r requirements.txt
# Run the server
cd server
uvicorn server:app --host 0.0.0.0 --port 7860 --reload
Docker
docker build -t support-triage .
docker run -p 7860:7860 support-triage
# Test it
curl -X POST http://localhost:7860/reset -H "Content-Type: application/json" \
-d '{"task": "easy", "seed": 42}'
Run inference script
# Against local server
export HF_TOKEN="your-hf-token"
export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"
export ENV_BASE_URL="http://localhost:7860"
python inference.py
# Against deployed Space
export ENV_BASE_URL="https://YOUR_SPACE.hf.space"
python inference.py
Exact replica of Tier-1 SaaS support triage workflow
3+ tasks with graders: easy/medium/hard with deterministic keyword+priority graders
Meaningful reward: 4-component partial-credit reward, dense signal every step
OpenEnv spec: Full typed models, step/reset/state, openenv.yaml
Deployment: Docker + HF Spaces
Baseline script: inference.py with [START]/[STEP]/[END] logs
About
CustomerSupportTriage-v0 is an OpenEnv-compliant benchmark for AI agents handling real-world support triage. Agents process ticket queues by assigning priorities, routing tickets to departments, drafting replies, and flagging tickets for human review. It features three difficulty tiers (easy, medium, hard) and uses a partial-credit reward function.