feat(compression): headroom backend, progressive ladder, CCR, columnar packing#65

Open
doramirdor wants to merge 2 commits into main from feat/compression

@doramirdor

Collaborator

Context-optimizer upgrades for NadirClaw (single clean commit on top of the verifier branch).

  • Backend selection: optimizer backend = native | headroom. headroom-ai is a lazy, fail-open optional extra (pip install nadirclaw[headroom]) — absent/error falls back to native, byte-identical.
  • Progressive compression (compress_progressive): staged ladder native → headroom → offload, stops at a token budget. NADIRCLAW_OPTIMIZE adds progressive.
  • Native CCR (ccr.py): deterministic offload + nadir_retrieve fetch-back loop (no third-party store).
  • Columnar packing (json_array_pack) + whitespace fix that preserves code indentation.
  • Apache-2.0 attribution for headroom-ai in THIRD_PARTY_NOTICES.md.
  • Docs, benchmarks, and tests (ccr, progressive, json_array_pack, backends, code-safety).
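
The staged ladder and the offload/fetch-back loop listed above can be sketched roughly as follows. This is a minimal illustration, not the actual compress_progressive or ccr.py code: the stage functions, the whitespace token counter, the `ccr-` key format, and the `retrieve` helper are all assumptions made for the example, and the headroom stage is left out.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# Hypothetical stage signature: takes text, returns (possibly shorter) text.
Stage = Callable[[str], str]


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: whitespace-separated words."""
    return len(text.split())


def native_stage(text: str) -> str:
    """Illustrative lossless step: drop blank lines and trailing whitespace."""
    return "\n".join(line.rstrip() for line in text.splitlines() if line.strip())


def make_offload_stage(store: Dict[str, str]) -> Stage:
    """Illustrative offload: stash the text in an in-memory store, keep a handle."""
    def offload(text: str) -> str:
        # Deterministic key: the same content always maps to the same handle.
        key = "ccr-" + hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        store[key] = text
        return f"[offloaded:{key}]"
    return offload


def retrieve(store: Dict[str, str], handle: str) -> str:
    """Fetch-back: resolve a handle such as '[offloaded:ccr-...]' to its text."""
    key = handle[len("[offloaded:"):-1]
    return store[key]


def compress_ladder(
    text: str, stages: List[Tuple[str, Stage]], target_tokens: int
) -> Tuple[str, List[str]]:
    """Apply stages in order, stopping as soon as the token budget is met."""
    applied: List[str] = []
    for name, stage in stages:
        if count_tokens(text) <= target_tokens:
            break  # budget met; later, more aggressive stages never run
        text = stage(text)
        applied.append(name)
    return text, applied
```

With a generous budget no stage runs and the input comes back untouched; with a tight budget the ladder escalates to offload, and `retrieve` recovers the stashed text from the in-memory store.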

Base is the verifier branch so the diff is only the compression commit.

🤖 Generated with Claude Code

Nadir and others added 2 commits May 29, 2026 09:09
NadirClaw's TrainedVerifier was passing the cheap answer as the bare
text_pair to the tokenizer. The model was trained on a structured format
with CHEAP:/EXPENSIVE: markers, matching what the Pro production backend
uses. Without that wrapper, scores are miscalibrated against the
production tau=0.80 threshold.

This patch wraps the input in the production format:

  text_pair = f"CHEAP:\n{cheap}\n\nEXPENSIVE:\n{reference or ''}"

reference_answer is now used when provided (was previously documented as
ignored). Behavior with reference_answer=None matches production: empty
string substitution.
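
As a small illustration of the wrapping described above, assuming a hypothetical helper name (`build_text_pair` is not taken from the patch):

```python
from typing import Optional


def build_text_pair(cheap: str, reference: Optional[str] = None) -> str:
    """Wrap the cheap answer in the CHEAP:/EXPENSIVE: format quoted above.

    A missing reference becomes an empty string, as the commit message states.
    """
    return f"CHEAP:\n{cheap}\n\nEXPENSIVE:\n{reference or ''}"
```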

Aligns NadirClaw with:
- https://huggingface.co/nadirclaw/cascade-verifier-v1 (model card)
- getnadir.dev/backend/app/services/verifier_model.py (production)

Repo: https://github.com/NadirRouter/NadirClaw
Service: https://getnadir.com
…r packing

- optimize: pluggable backend (native|headroom), lazy fail-open headroom-ai
  integration; progressive staged compression (compress_progressive) with native
  CCR offload + fetch-back loop (ccr.py); columnar JSON-array packing
  (json_array_pack); whitespace fix preserves code indentation.
- settings/cli/server: NADIRCLAW_OPTIMIZE adds 'progressive'; OPTIMIZE_BACKEND,
  target-tokens, max-stage, allow-lossy/offload knobs; serve --optimize progressive.
- pyproject: optional [headroom] extra (headroom-ai). Apache-2.0 attribution in
  THIRD_PARTY_NOTICES.md.
- docs + benchmarks + tests (ccr, progressive, json_array_pack, backends, code-safety).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@doramirdor

Collaborator Author

Review (automated, scheduled run)

Verdict: ship — solid feature add, ladder logic and fail-open behaviour check out. One operational fix needed before merge.

✅ What I ran

  • All 5 new test files pass: test_ccr.py, test_progressive.py, test_optimize_backends.py, test_code_safety.py, test_json_array_pack.py; 45/45 green on this branch.
  • Pre-existing unrelated sse_starlette import errors in 3 other test files (not introduced here).

🔧 Blocker — base branch is a stale alias for main

fix/verifier-input-format-v0.19.2 was merged into main via #63 on 2026-05-29. git diff upstream/main upstream/fix/verifier-input-format-v0.19.2 is empty — the branch is content-identical to main. Merging this PR into the v0.19.2 branch will not propagate the compression feature to main; the branch is effectively a dead alias.

Fix: retarget PR base to main (gh pr edit 65 --base main). The diff stays the same because the bases are content-identical.

👍 What I verified

  • Backward compat: OPTIMIZE validator accepts the new progressive value alongside off/safe/aggressive (settings.py:258), and the CLI --optimize choice list matches (cli.py:42). Existing users keep working.
  • Fail-open of headroom: _headroom_optimize returns None on ImportError or runtime exception (lines 609-613, 629-631), caller falls through to native. _warn_headroom_once dedupes the warning. Same pattern in _headroom_stage (lines 707-723). ✓
  • CCR is in-memory only: no disk I/O, no path traversal surface. captured dict lives in the caller's process and is returned via OptimizeResult.offload_captured. ✓
  • Offload gating: compress_progressive requires both allow_offload=True (line 888) and an unmet target_tokens budget (line 900) before the offload stage engages. Both gates work together. ✓
  • Whitespace fix: the leading-indentation preservation at optimize.py:260-261 is real, and test_raw_code_stays_valid_python covers it for ast.parse-ability. Nice regression test.
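
For readers unfamiliar with the fail-open pattern checked above, a minimal sketch follows. It assumes a hypothetical module name and a hypothetical `compress` entry point; it does not reproduce _headroom_optimize or the real headroom-ai API, and the native path is a stand-in.

```python
import importlib
import logging
from typing import Optional

logger = logging.getLogger(__name__)
_warned = False  # ensures the fallback warning is logged only once


def _warn_once(message: str) -> None:
    global _warned
    if not _warned:
        logger.warning(message)
        _warned = True


def try_optional_backend(text: str, module_name: str) -> Optional[str]:
    """Lazily import an optional backend; return None on any failure."""
    try:
        module = importlib.import_module(module_name)  # imported only when used
        return module.compress(text)  # hypothetical entry point, an assumption
    except ImportError:
        _warn_once(f"{module_name} is not installed; falling back to native")
        return None
    except Exception as exc:  # any runtime failure inside the backend
        _warn_once(f"{module_name} failed ({exc!r}); falling back to native")
        return None


def native_optimize(text: str) -> str:
    """Stand-in native path: strip trailing whitespace, keep indentation."""
    return "\n".join(line.rstrip() for line in text.splitlines())


def optimize(text: str, backend: str = "native",
             module_name: str = "optional_backend") -> str:
    """Use the optional backend when requested, else (or on failure) native."""
    if backend == "headroom":
        result = try_optional_backend(text, module_name)
        if result is not None:
            return result
    return native_optimize(text)
```

Because every failure path returns None and the caller then runs the native path, a missing or broken optional dependency yields the same output as selecting native directly.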

⚠️ Design note (non-blocking) — columnar ⟦cols=…⟧ format

_pack_homogeneous_arrays rewrites homogeneous JSON arrays into a custom encoding (⟦cols=["id","name"]⟧\n[1,"foo"]\n…\n⟦end⟧). The PR docs claim this is "information-lossless and deterministically reversible via _unpack_table", which is true mechanically — but the LLM never sees the unpacked form; it has to infer the encoding from context. The 68% size win is real, but I'd suggest a small downstream eval (3–5 routed prompts where a tool returns a packed table) confirming the model can still answer questions about the rows. If you've already done this, ignore.
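
To make the encoding concrete, here is a rough round-trip sketch built only from the format quoted above. The function names, the strict same-key-order check, and the error handling are assumptions for illustration, not the PR's _pack_homogeneous_arrays or _unpack_table.

```python
import json
from typing import Any, Dict, List


def pack_table(rows: List[Dict[str, Any]]) -> str:
    """Pack a homogeneous list of dicts into a header line plus one row per line."""
    if not rows:
        raise ValueError("nothing to pack")
    cols = list(rows[0].keys())
    # Sketch-level homogeneity check: same keys in the same order.
    if any(list(row.keys()) != cols for row in rows):
        raise ValueError("rows are not homogeneous")
    lines = ["⟦cols=" + json.dumps(cols, separators=(",", ":"), ensure_ascii=False) + "⟧"]
    for row in rows:
        values = [row[col] for col in cols]
        lines.append(json.dumps(values, separators=(",", ":"), ensure_ascii=False))
    lines.append("⟦end⟧")
    return "\n".join(lines)


def unpack_table(packed: str) -> List[Dict[str, Any]]:
    """Reverse pack_table: rebuild the list of dicts from the packed text."""
    lines = packed.split("\n")
    header, body, footer = lines[0], lines[1:-1], lines[-1]
    if not (header.startswith("⟦cols=") and header.endswith("⟧") and footer == "⟦end⟧"):
        raise ValueError("not a packed table")
    cols = json.loads(header[len("⟦cols="):-1])
    return [dict(zip(cols, json.loads(line))) for line in body]
```

The saving comes from writing each key once in the header instead of once per row. The round trip is mechanical, which is why the open question is whether the model reads the packed form correctly, not whether the data survives.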

🐛 Pre-existing bug (not introduced here, file separately)

_dedup_tool_schemas at optimize.py:126-137 iterates _iter_json_objects(content) for absolute offsets but mutates new_content with the substitution, so when a single message contains 2+ duplicate tool schemas the second replacement lands at the wrong offset and corrupts content. Introduced in c4ba4bb (v0.13.0), not touched here. Will file a separate task.
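
The failure mode can be reproduced in isolation. The sketch below is not the code in optimize.py; it assumes a simplified setup in which edits are (start, end, replacement) tuples with offsets measured against the original string, and contrasts left-to-right splicing with one common fix, applying the edits right to left.

```python
from typing import List, Tuple

# (start, end, replacement), with offsets measured against the ORIGINAL string.
Edit = Tuple[int, int, str]


def apply_edits_buggy(content: str, edits: List[Edit]) -> str:
    """Left-to-right splicing: every edit after the first uses stale offsets."""
    new_content = content
    for start, end, replacement in edits:
        new_content = new_content[:start] + replacement + new_content[end:]
    return new_content


def apply_edits_safe(content: str, edits: List[Edit]) -> str:
    """Right-to-left splicing: earlier offsets stay valid after each splice.

    Assumes the edits do not overlap.
    """
    new_content = content
    for start, end, replacement in sorted(edits, reverse=True):
        new_content = new_content[:start] + replacement + new_content[end:]
    return new_content
```

With `content = "AAAA-BBBB-CCCC"` and edits `[(0, 4, "x"), (5, 9, "y")]`, the safe version yields `"x-y-CCCC"`, while the buggy version yields `"x-BBByCC"`: the first replacement shortens the string by three characters, so the second lands in the wrong place.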

🤖 Automated review by Claude Code (scheduled-task: daily-check-github-status)

doramirdor changed the base branch from fix/verifier-input-format-v0.19.2 to main on June 10, 2026 13:08
@doramirdor

Collaborator Author

Follow-up to my earlier review: I retargeted this PR's base from fix/verifier-input-format-v0.19.2 to main. The verifier branch was content-identical to main (git diff upstream/main upstream/fix/verifier-input-format-v0.19.2 was empty after #63 landed), so this is a base-pointer swap only; the diff and the 45/45 test results stand. Mergeable / clean.

🤖 Automated follow-up via scheduled triage.
