Goal: Show trust, grounding, and escalation — not feature laundry.
- Repo with
data/present;pip install -r code/requirements.txt. export ORCHESTRATE_DISABLE_LLM=1(repeatable) or show both offline + LLM if quota allows.
- Point to
support_tickets/sample_support_tickets.csv— realistic ticket. - Run from repo root:
python code/main.py --limit 1(orscripts/run_agent.sh) writing to a temp output. - Open row: show
status=replied,justificationciting retrieval path/score language.
Say: “Answers are composed from retrieved chunks only in offline mode.”
- Show
risk.pypattern example (grading dispute / outage wording) or a ticket that triggersescalated. - Show
status=escalatedand generic safe response — no invented policy.
Say: “We escalate before we invent.”
- Short gratitude or trivia-style row →
request_type=invalidor out-of-scope reply.
Say: “Invalid bucket avoids wasting human escalation.”
python -m pytest code/tests -qpython code/run_eval.py --offline— routing metrics on sample.
Say: “Regression + CLI resilience so iteration doesn’t rot quality.”
“We optimized trustworthy routing and corpus-grounded text, not parametric knowledge—and we know exactly where paraphrase and multilingual inputs degrade.”