Name	Name	Last commit message	Last commit date
parent directory ..
tests	tests
README.md	README.md
__init__.py	__init__.py
__main__.py	__main__.py
answer_synthesis.py	answer_synthesis.py
compare_outputs.py	compare_outputs.py
config.py	config.py
conftest.py	conftest.py
corpus.py	corpus.py
cross_ecosystem.py	cross_ecosystem.py
csv_io.py	csv_io.py
eval_metrics.py	eval_metrics.py
eval_sample.py	eval_sample.py
grounding.py	grounding.py
main.py	main.py
openai_agent.py	openai_agent.py
postprocess.py	postprocess.py
pytest.ini	pytest.ini
requirements.txt	requirements.txt
response_quality_report.py	response_quality_report.py
retrieve.py	retrieve.py
risk.py	risk.py
run_eval.py	run_eval.py
taxonomy.py	taxonomy.py
ticket_hints.py	ticket_hints.py

Support triage agent (Orchestrate)

Evaluators: primary setup + approach overview for the whole submission is the repository root ../README.md. This file focuses on code/ module details and flags.

Terminal agent that reads support_tickets/support_tickets.csv, retrieves grounded snippets from the offline data/ corpus (BM25 + TF‑IDF fusion + lexical rerank), applies risk-based escalation rules + taxonomy mapping, and writes predictions to support_tickets/output.csv.

Design rationale & decision flowchart: ../docs/decisions.md. Interview / demo / rubric: ../docs/interview.md, ../docs/demo-script.md, ../docs/DEV_EVAL.md.

Setup

cd code
python -m venv .venv
.venv\Scripts\activate   # Windows
# source .venv/bin/activate  # macOS / Linux
pip install -r requirements.txt

Copy the repo’s .env.example to the repository root as .env and set at least:

OPENAI_API_KEY=sk-...
# optional:
# OPENAI_MODEL=gpt-4o-mini
# ORCHESTRATE_SEED=42
# TOP_K=6
# LOW_BM25_THRESHOLD=7.0
# ORCHESTRATE_DISABLE_LLM=1
# ORCHESTRATE_INDEX_VERSION=2
# HYBRID_CANDIDATES=160
# BM25_WEIGHT=0.55
# TFIDF_WEIGHT=0.45
#
# Grounding (post-generation checks vs retrieved text):
# ORCHESTRATE_GROUNDING_MIN_OVERLAP=0.12
# ORCHESTRATE_GROUNDING_FAIL_MODE=resynthesize   # or: escalate
#
# Lexical rerank bonuses (query term + brand alignment):
# ORCHESTRATE_RERANK_BONUS_TEAM=5
# ORCHESTRATE_RERANK_BONUS_WORKSPACE=5
# ORCHESTRATE_RERANK_BONUS_BRAND=3

If OPENAI_API_KEY is unset (or ORCHESTRATE_DISABLE_LLM=1), the agent uses an offline synthesis path (structured steps from retrieved articles; weaker than a quota-available LLM, but fully corpus-grounded).

Run

From the repository root (recommended — avoids shadowing Python’s stdlib code module on Linux):

python code/main.py

Or from the code/ directory:

python main.py

(python -m code is unreliable because it may load the stdlib code module instead of this folder.)

Options:

--input   path to input CSV (default: ../support_tickets/support_tickets.csv)
--output  path to output CSV (default: ../support_tickets/output.csv)
--limit N process only the first N rows (default 0 = all rows; must be >= 0)
--fail-fast        exit on first row exception (exit 2); default is write escalated placeholder rows
--progress         tqdm progress bar (requires `tqdm` installed)
--max-field-chars N cap Issue/Subject length per row (default: env ORCHESTRATE_MAX_FIELD_CHARS or 200000)

Exit codes: 0 success; 2 user error (missing input, bad CSV schema, bad --limit / --max-field-chars, unusable --output, missing data/, index lock timeout, --fail-fast row error). Exceptions in row processing are caught by default (escalated row); otherwise 1 for unexpected crashes.

The first run builds a retrieval index under code/.cache/bm25_index.pkl. Delete it if you change chunking/fusion logic or bump ORCHESTRATE_INDEX_VERSION.

Quick regression (sample file)

python run_eval.py --offline

Optional diagnostics on the generated sample_pred.csv:

python run_eval.py --offline --report-quality

eval_sample.py reports exact match on routing columns and fuzzy stats on response (normalized exact, token F1, character overlap). Use them to catch regressions on free-text answers; the official holdout is still scored by the platform.

Compare any gold CSV to predictions (e.g. internal dev labels with Justification):

python compare_outputs.py --gold ../path/to/gold.csv --pred ../support_tickets/output.csv

Tests

cd code
python -m pytest tests -q

CI (GitHub Actions) runs the same pytest + run_eval.py --offline on each push/PR.

Architecture

Module	Role
`corpus.py`	Loads markdown, strips noise, chunks articles
`retrieve.py`	Hybrid retrieval index + brand filtering + rerank
`taxonomy.py`	Canonical `product_area` mapping
`grounding.py` / `postprocess.py`	Cheap grounding checks + finalize
`risk.py`	Regex escalation triggers (e.g. grading disputes)
`openai_agent.py`	JSON-grounded LLM decisioning + offline synthesis
`eval_metrics.py` / `eval_sample.py`	Sample regression metrics
`ticket_hints.py`	Optional multi-topic heuristics (tests / docs)
`main.py`	CSV orchestration

Secrets are read only from the environment (never commit .env).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Support triage agent (Orchestrate)

Setup

Run

Quick regression (sample file)

Tests

Architecture

FilesExpand file tree

code

Directory actions

More options

Directory actions

More options

Latest commit

History

code

Folders and files

parent directory

README.md

Support triage agent (Orchestrate)

Setup

Run

Quick regression (sample file)

Tests

Architecture