Evals and environment packages (Prime / verifiers-style RLMs).
| Directory | Description |
|---|---|
| lhaw/ | LHAW RLM: underspecified prompts, ask_user clarification, reconstruction judge / native reward (lhaw_rlm) |
| loca_bench_rlm/ | LOCA Bench RLM |
| long_context_retrieval/ | Long-context retrieval |
| discover_gsm8k/ | Discover GSM8K |
| autoresearch/ | Autoresearch |
| advanced_if/ | Advanced IF: facebook/AdvancedIF rubric induction via RLMEnv (file-backed trajectories + partial judge tool, advanced_if) |

Each subdirectory is a self-contained package with its own pyproject.toml, uv.lock, and docs. Work inside that folder for install, eval, and packaging.
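
As a quick sanity check of the layout described above, the sketch below scans a checkout and reports which subdirectories carry both `pyproject.toml` and `uv.lock`. The helper name `find_packages` and the constant `REQUIRED_FILES` are hypothetical and not part of this repository; the script only assumes it is run from the repository root.

```python
from pathlib import Path

# Files this README says every package directory carries.
REQUIRED_FILES = ("pyproject.toml", "uv.lock")


def find_packages(root: Path) -> dict[str, list[str]]:
    """Map each immediate subdirectory of `root` to the required files it lacks.

    Hypothetical helper for illustration; an empty list means the
    directory looks like a self-contained package.
    """
    report: dict[str, list[str]] = {}
    for child in sorted(root.iterdir()):
        # Skip plain files and hidden directories such as .git/.
        if not child.is_dir() or child.name.startswith("."):
            continue
        report[child.name] = [
            name for name in REQUIRED_FILES if not (child / name).is_file()
        ]
    return report


if __name__ == "__main__":
    for name, missing in find_packages(Path(".")).items():
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{name}/: {status}")
```

For a directory that reports `ok`, running `uv sync` inside it installs the dependencies pinned in that package's `uv.lock`, assuming `uv` is installed.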