feat(inverse-design): cardinality cap + fixed-amount pinning + min-weight floor #20
Adds three orthogonal per-element constraint knobs to optimize_composition, plus a
single-knob unified annealing API for the cardinality cap's iterative-softmax schedule.
A — max_elements (cardinality cap)
Differentiable Plötz-Roth iterative soft top-K mask multiplied with softmax(logits),
temperature-annealed during optimisation; final hard top-K projection guarantees the
returned recipe is exactly K-hot (subject to floor C).
Annealing API consolidated into:
annealing_scale: float in [0, 1] (default 0.5; maps to tau_start = 25**scale,
calibrated against an 80-config sweep on the inverse-design fine-tuned model)
annealing_schedule: optional dict for advanced piecewise control
({step, tau, annealing_func} parallel lists; tail falls back to default
geometric schedule if step[-1] < 1.0)
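
A minimal NumPy sketch of an iterative-softmax soft top-K mask in the spirit of the Plötz-Roth relaxation described above. It is not the PR's helper: the function name, the `eps` clamp, and the final renormalisation are assumptions made for illustration. Each of the K rounds takes a tempered softmax, adds it to the mask, and then penalises the already-selected mass with `log(1 - w)`.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def soft_topk_mask_sketch(logits, k, tau, eps=1e-12):
    """Relaxed K-hot mask (hypothetical helper, not the PR's implementation).

    Each round softly picks one more element; the temperature tau is applied
    inside the softmax, and log(1 - w) is added to the untempered logits.
    """
    alpha = np.asarray(logits, dtype=float).copy()
    mask = np.zeros_like(alpha)
    for _ in range(k):
        w = softmax(alpha / tau)
        mask += w
        # eps clamp (assumption) avoids log(0) once a position is fully selected.
        alpha = alpha + np.log(np.clip(1.0 - w, eps, 1.0))
    return mask


logits = np.array([2.0, 1.0, 0.5, -1.0])
warm_mask = soft_topk_mask_sketch(logits, k=2, tau=5.0)   # soft, spread out
cold_mask = soft_topk_mask_sketch(logits, k=2, tau=0.01)  # close to K-hot

# Mask multiplied with softmax(logits); renormalising here is an assumption.
weights = softmax(logits) * cold_mask
weights = weights / weights.sum()
```

The mask always sums to K because it is a sum of K softmax vectors; lowering `tau` only changes how sharply that mass concentrates on the top K logits.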
B — fixed_amounts (user-pinned absolute amounts)
Mapping[symbol, float] to pin specific elements at user-given values; reuses the
existing lock-paste machinery used by element_step_scale=0 (validated disjoint).
Unlike the step_scale=0 path, fixed_amounts does NOT require initial_weights —
values come straight from the kwarg.
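
The pinning behaviour can be sketched as follows, assuming a single composition row and a plain list of element symbols. `paste_fixed_amounts_sketch` is a hypothetical stand-in for the lock-paste machinery, and the proportional rescaling of the free positions is an assumption rather than the confirmed implementation.

```python
import numpy as np


def paste_fixed_amounts_sketch(weights, symbols, fixed_amounts):
    """Pin selected elements and rescale the rest onto the leftover mass.

    Hypothetical illustration only; the real lock-paste helper may differ.
    """
    w = np.asarray(weights, dtype=float).copy()
    fixed_idx = [symbols.index(s) for s in fixed_amounts]  # ValueError if unknown
    fixed_total = float(sum(fixed_amounts.values()))
    if not 0.0 <= fixed_total < 1.0:
        raise ValueError("fixed amounts must leave some mass for free elements")

    free = np.ones(len(symbols), dtype=bool)
    free[fixed_idx] = False
    free_mass = w[free].sum()
    if free_mass <= 0.0:
        raise ValueError("no free mass available to absorb the leftover")

    # Free positions share the leftover 1 - sum(fixed) proportionally.
    w[free] *= (1.0 - fixed_total) / free_mass
    w[fixed_idx] = list(fixed_amounts.values())
    return w


symbols = ["Au", "Ga", "Cu", "Ni"]  # Cu and Ni are made-up free elements
pinned = paste_fixed_amounts_sketch(
    [0.25, 0.25, 0.25, 0.25], symbols, {"Au": 0.65, "Ga": 0.20}
)
```

With these inputs the pinned entries stay at 0.65 and 0.20, and the two free elements split the remaining 0.15 evenly.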
C — min_nonzero_weight (trace-amount floor)
Drops unlocked positions with 0 < w < floor and redistributes their mass to the
remaining unlocked positions. Per-row safe fallback when all unlocked would drop:
the row is left unfloored to keep the simplex invariant.
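
A sketch of the floor with its per-row fallback, assuming 2-D weight and lock arrays of the same shape. The function name and the proportional redistribution are assumptions; the commit message only states that the dropped mass goes to the remaining unlocked positions.

```python
import numpy as np


def apply_min_floor_sketch(weights, locked_mask, floor):
    """Drop unlocked trace amounts and redistribute their mass row by row.

    Hypothetical illustration; redistribution is proportional by assumption.
    """
    w = np.asarray(weights, dtype=float).copy()
    locked = np.broadcast_to(np.asarray(locked_mask, dtype=bool), w.shape)
    for row, lock in zip(w, locked):  # each row is a view into w
        drop = ~lock & (row > 0.0) & (row < floor)
        keep = ~lock & (row >= floor)
        if not drop.any():
            continue  # nothing below the floor in this row
        if not keep.any():
            continue  # fallback: every unlocked entry would drop, leave the row
        freed = row[drop].sum()
        row[drop] = 0.0
        row[keep] *= 1.0 + freed / row[keep].sum()
    return w


rows = np.array([[0.50, 0.46, 0.04],
                 [0.96, 0.02, 0.02]])
locks = np.array([[False, False, False],
                  [True, False, False]])
floored = apply_min_floor_sketch(rows, locks, floor=0.05)
```

The first row loses its 0.04 entry and the survivors are scaled back up to sum to 1. The second row triggers the fallback and is returned unchanged, so both rows stay on the simplex.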
Orthogonal composition — every pair and the full A+B+C stack is contract-tested and
behaviourally validated on two paper scenarios. Validation rules enforce the few
genuine impossibilities (e.g. fixed amounts below floor) up front.
paper_inverse_comparison.py threads max_elements / annealing_scale / annealing_schedule
from each comp-config row; three new comp rows added (K=3, K=5, K=5 linear) — the
existing 5 comp + 3 latent bars are byte-identical to before.
Documentation refreshed:
* docs/inverse_design_extension_notes.md — rewritten from "future plans" to
"implemented" with composition matrix + behavioural-evidence cross-references.
* docs/inverse_design_algorithms.md — new kwargs added to the constraints and
main-parameters tables.
* README.md, ARCHITECTURE.md — short pointer to the new constraint surface.
Reproducibility scripts under logs/ (.py files tracked, .json/.png/.log outputs
git-ignored):
* sweep_tau_schedule.py + plot_sweep.py — annealing calibration grid
* test_max_elements_smoke.py — minimal A smoke test
* eval_combined_abc.py + plot_combined_abc.py — pair/full-stack chart
* eval_abc_intuition.py + plot_abc_intuition.py — behavioural sweep with
PASS/FAIL intuition checks (this run: 13/13 contracts + 11/12 monotone-
intuition checks; the single failure is a multi-objective trade-off, not a bug)
Tests: 93 passing (was 71; +21 contract + validation tests covering each of A, B,
C and their combinations + endpoints of the annealing knob). All existing tests
still pass byte-identical numerically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
This PR extends the inverse-design composition optimizer (FlexibleMultiTaskModel.optimize_composition) with three orthogonal per-element constraint knobs (cardinality cap, fixed-amount pinning, min-weight floor) and introduces a unified annealing API for the cardinality constraint’s iterative-softmax schedule. It also threads the new knobs through the paper comparison script, updates documentation, adds extensive tests, and adds reproducibility scripts under logs/ (with outputs git-ignored).
Changes:
- Add `max_elements`, `annealing_scale` / `annealing_schedule`, `fixed_amounts`, and `min_nonzero_weight` to `optimize_composition`, plus helper logic for soft/hard top-K, lock-paste, and floor enforcement.
- Add test coverage for A/B/C constraints, their combinations, and validation paths.
- Update paper/evaluation scripts and docs to expose and explain the new constraint surface; adjust `.gitignore` to ignore generated `logs/` artifacts.
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/foundation_model/scripts/paper_inverse_comparison.py | Adds new composition config rows for max_elements and threads max_elements / annealing_* kwargs into optimize_composition calls. |
| src/foundation_model/models/flexible_multi_task_model.py | Implements constraint surface + annealing schedule logic inside optimize_composition. |
| src/foundation_model/models/flexible_multi_task_model_test.py | Adds extensive contract/validation tests for max-elements, fixed pinning, and floor behavior (alone and in combination). |
| README.md | Documents the new optimize_composition constraint knobs at a high level. |
| logs/test_max_elements_smoke.py | Adds a real-model smoke script to validate max_elements behavior and trajectory annealing qualitatively. |
| logs/sweep_tau_schedule.py | Adds a sweep script (currently still uses removed topk_* kwargs; needs updating). |
| logs/plot_sweep.py | Plotting utility for the sweep JSON output. |
| logs/plot_combined_abc.py | Plotting utility for combined A/B/C evaluation results. |
| logs/plot_abc_intuition.py | Plotting utility for intuition-sweep JSON output. |
| logs/eval_combined_abc.py | End-to-end evaluation script for individual and combined A/B/C constraints. |
| logs/eval_abc_intuition.py | Behavioral sweep + intuition checks across scenarios for A/B/C and combinations. |
| docs/inverse_design_extension_notes.md | Updates the code map to reflect implemented A/B/C constraints and scheduling. |
| docs/inverse_design_algorithms.md | Updates the math/algorithm doc to include the new kwargs and constraints. |
| ARCHITECTURE.md | Adds a brief pointer to the new constraint knobs in the architecture overview. |
| .gitignore | Ignores generated `logs/*.json |
Comments suppressed due to low confidence (1)
src/foundation_model/models/flexible_multi_task_model.py:3144
- This comment still refers to `topk_tau_start`, but the implementation now derives τ from `annealing_scale` / `annealing_schedule`. Please update the wording to match the new API (e.g. “initial scoring uses the softest τ at step 0 of the annealing schedule”).
# --- Record initial scores --------------------------------------------------------------
# Initial scoring uses ``topk_tau_start`` so the t=0 view matches the softest end of
# the annealing schedule (the optimisation actually starts there).
current_tau[0] = _tau_for_step(0)
τ is annealed from ``topk_tau_start`` to ``topk_tau_end`` over ``steps``. The
annealing doubles as a continuation method that helps escape local optima.

else:
    locked_mask = locked_mask | fixed_mask_dev  # validated disjoint
    locked_w0 = locked_w0 + fixed_w0_batch

# a hard top-K projection so the returned ``optimized_weights`` has *exactly* K
# non-zero positions — at τ_end ≈ 0.01 the soft mask is already near-K-hot, so the
# projection just cleans up the residual sub-threshold weights.

if max_elements is not None:
    kwargs.update(
        max_elements=max_elements,
        topk_tau_start=tau_start,
        topk_tau_end=tau_end,
        topk_schedule=schedule,
    )
res = model.optimize_composition(kernel, **kwargs)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3e07cea92e
if n_locked_pre > max_elements:
    raise ValueError(
        f"max_elements={max_elements} is smaller than the number of hard-locked "
        f"elements ({n_locked_pre}, counting element_step_scale=0 ∪ fixed_amounts). "
        "Locked elements always count toward K — raise max_elements or unlock some."
    )
Reject fully occupied K-hot lock sets with leftover mass
This validation allows max_elements == n_locked_pre for element_step_scale=0 locks, but that is invalid whenever the locked seed weights sum to less than 1. In that case the soft top-K path selects only locked indices (K - n_locked = 0), and the later lock-paste step has no unlocked support to absorb 1 - Σlocked, so returned rows can sum to < 1 instead of staying on the simplex. This is reachable with locked seeds like 0.2/0.2/0.2 and max_elements=3, and it silently distorts optimization by scaling descriptors down.
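
The failure mode can be reproduced on a toy row. This is a sketch under stated assumptions: `project_with_locks_sketch` is a hypothetical stand-in for the hard top-K projection followed by lock-paste, and the two unlocked weights are invented for the example.

```python
import numpy as np


def project_with_locks_sketch(weights, locked_mask, locked_values, max_elements):
    """Keep K elements (locks always count), then paste the locked values.

    Hypothetical illustration of why max_elements == n_locked loses mass.
    """
    weights = np.asarray(weights, dtype=float)
    locked_mask = np.asarray(locked_mask, dtype=bool)
    locked_values = np.asarray(locked_values, dtype=float)

    n_free_slots = max_elements - int(locked_mask.sum())
    keep = np.zeros(weights.shape, dtype=bool)
    if n_free_slots > 0:
        free_scores = np.where(locked_mask, -np.inf, weights)
        keep[np.argsort(free_scores)[-n_free_slots:]] = True

    out = np.where(keep, weights, 0.0)
    leftover = 1.0 - locked_values[locked_mask].sum()
    if out.sum() > 0:
        out *= leftover / out.sum()  # free support absorbs the leftover mass
    out[locked_mask] = locked_values[locked_mask]
    return out


locked_mask = [True, True, True, False, False]
locked_values = [0.2, 0.2, 0.2, 0.0, 0.0]
weights = [0.2, 0.2, 0.2, 0.3, 0.1]

row_k3 = project_with_locks_sketch(weights, locked_mask, locked_values, 3)
row_k4 = project_with_locks_sketch(weights, locked_mask, locked_values, 4)
```

With `max_elements=3` there is no free slot, so the row sums to 0.6. With `max_elements=4` one unlocked element survives and absorbs the remaining 0.4, which is why a strict `max_elements > n_locked` check closes the gap.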
topk_tau_start=tau_start,
topk_tau_end=tau_end,
topk_schedule=schedule,
Replace removed top-K kwargs in tau sweep script
The sweep script still calls optimize_composition with topk_tau_start, topk_tau_end, and topk_schedule, but this commit removed those parameters in favor of annealing_scale and annealing_schedule. As soon as max_elements is set, this call path now raises an unexpected-keyword TypeError, so the new reproducibility script cannot run.
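
One way such a call site could be migrated, sketched under assumptions: since the PR defines `tau_start = 25**scale`, a raw starting temperature maps back to a scale via `log(tau) / log(25)`. The helper names below are hypothetical, and the sketch simply drops `topk_tau_end` and `topk_schedule`; a non-geometric schedule would need an `annealing_schedule` dict, which is not shown here.

```python
import math


def tau_to_scale_sketch(tau_start, base=25.0):
    """Invert tau_start = base**scale (hypothetical helper)."""
    scale = math.log(tau_start) / math.log(base)
    if not 0.0 <= scale <= 1.0:
        raise ValueError(f"tau_start={tau_start} is outside [1, {base}]")
    return scale


def migrate_kwargs_sketch(old_kwargs):
    """Replace the removed topk_* kwargs with annealing_scale.

    Illustration only: topk_tau_end and topk_schedule are dropped here.
    """
    new_kwargs = {k: v for k, v in old_kwargs.items() if not k.startswith("topk_")}
    if "topk_tau_start" in old_kwargs:
        new_kwargs["annealing_scale"] = tau_to_scale_sketch(old_kwargs["topk_tau_start"])
    return new_kwargs


migrated = migrate_kwargs_sketch(
    {"max_elements": 3, "topk_tau_start": 5.0, "topk_tau_end": 0.01}
)
```

Here `migrated` keeps `max_elements=3` and gains an `annealing_scale` of about 0.5, with no `topk_*` keys left to trigger the `TypeError`.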
External reviewers (Copilot + Codex) and self-review flagged 8 items. All addressed:

Real bugs:
* `element_step_scale=0` locks + `fixed_amounts` together could produce `locked_w0.sum > 1.0` and silently break the simplex. Added runtime check on the combined locked mass with a clear error message.
* `max_elements == n_locked_pre` for the step_scale=0 path was previously allowed but silently produced row sums < 1 (the lock-paste had no unlocked slot for the leftover mass). Tightened to strict `max_elements > n_locked` for both lock sources, matching the existing fixed_amounts rule. New contract test pins the rejection.
* `logs/sweep_tau_schedule.py` still used the removed `topk_*` kwargs. Adapted to the new annealing API: raw τ → normalised scale via `log(τ)/log(25)`, with schedule type expressed via `annealing_schedule` dict when not geometric.

Wording / consistency:
* Renamed `annealing_schedule["tau"]` → `["scale"]` everywhere (model code, validation messages, tests, paper-comparison config row, smoke/eval scripts, docs). The field always held normalised scales in [0, 1]; the old name was misleading.
* Updated stale docstring + comment references to `topk_tau_start` / `topk_tau_end` (the old kwarg names were removed earlier in this PR).
* Changed the final-state comment from "exactly K" to "at most K" to match the contract (the floor can reduce nz below K).
* Added a docstring note for `min_nonzero_weight` calling out the `floor > 1/n_components` corner case (per-row fallback fires silently).

Tests:
* New: K=1 contract test (smallest cardinality; exercises the n_iter=1 branch of the iterative softmax, previously untested).
* New: combined-lock-sum > 1.0 rejection.
* New: `max_elements == n_locked` rejection (replaces the previous "== allowed" branch).
* Total: 96 passing (+3 from this fixup).

Ruff auto-formatted a handful of files (no logic changes). Two unrelated unused-import removals in `kmd_plus_test.py` / `finetune_inverse_heads_test.py` have been reverted to keep the PR scope tight.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds three orthogonal per-element constraint knobs to `optimize_composition` (the
differentiable composition-space inverse-design entry point), plus a unified
single-knob annealing API for the new cardinality cap's iterative-softmax schedule.
The three new knobs
- A, `max_elements` (cardinality cap): implemented via a differentiable Plötz–Roth iterative-softmax K-hot mask (not a post-hoc top-K projection), so the cardinality constraint participates in the optimisation throughout via a temperature-annealing schedule.
- B, `fixed_amounts` (user-pinned absolute amounts): pins specific elements at user-given absolute amounts (e.g. `{"Au": 0.65, "Ga": 0.20}`); the optimiser distributes the remaining mass freely. Reuses the existing lock-paste machinery; does not require `initial_weights` (values come straight from the kwarg).
- C, `min_nonzero_weight` (trace-amount floor): drops unlocked positions with `0 < w < floor` and redistributes the mass; per-row safe fallback when the floor would empty a row's unlocked mass (preserves the simplex invariant).
The unified annealing API (A's schedule)
Three old kwargs (`topk_tau_start`, `topk_tau_end`, `topk_schedule`) collapsed
into two:
- `annealing_scale`: float in [0, 1] that maps to `τ_start = 25**scale` (0 → 1, 0.5 → 5, 1 → 25). 0.5 is the safe default calibrated against an 80-config sweep on the inverse-design model.
- `annealing_schedule`: optional dict for advanced piecewise control (`{step, tau, annealing_func}` parallel lists with per-segment normalised scales and interpolation function). Tail falls back to default geometric schedule when `step[-1] < 1.0`.
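
The scale-to-temperature mapping and a default geometric decay can be sketched as below. The `25**scale` mapping comes from the description above; the end temperature of 0.01 is taken from a code comment quoted in the review and is treated here as an assumption, as are the function names and the exact interpolation formula.

```python
import math

TAU_BASE = 25.0  # from the PR text: tau_start = 25**scale
TAU_END = 0.01   # assumption: a quoted code comment mentions tau_end of about 0.01


def scale_to_tau_sketch(scale):
    """Map a normalised scale in [0, 1] to a starting temperature."""
    if not 0.0 <= scale <= 1.0:
        raise ValueError("annealing scale must lie in [0, 1]")
    return TAU_BASE ** scale


def geometric_tau_sketch(step, n_steps, scale=0.5, tau_end=TAU_END):
    """Geometric interpolation from tau_start down to tau_end (hypothetical)."""
    tau_start = scale_to_tau_sketch(scale)
    progress = step / max(n_steps - 1, 1)  # 0.0 at the first step, 1.0 at the last
    return tau_start * (tau_end / tau_start) ** progress


schedule = [geometric_tau_sketch(t, n_steps=5) for t in range(5)]
```

A geometric schedule multiplies τ by a constant factor at every step, so the temperature spends proportionally more steps at small values than a linear ramp would.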
Behavioural evidence
Reproduced via scripts now tracked under `logs/` (outputs are git-ignored):
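- `eval_abc_intuition.py`: 13/13 contracts hold (`max_elements` enforces ≤ K; `fixed_amounts` holds exactly; floor respected; combinations compose; row sums always = 1.0).
- 11/12 monotone-intuition checks pass; the single failure is a legitimate multi-objective trade-off (FE flat while klat improves with K under fixed Au+Ga), not a bug.
- `paper_inverse_comparison.py`: the existing 5 comp + 3 latent bars are byte-identical to before this PR.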
See `docs/inverse_design_extension_notes.md` for the full code map.
Files changed
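- `src/foundation_model/models/flexible_multi_task_model.py`: new kwargs on `optimize_composition`, helpers (`_soft_topk_mask`, `_hard_topk_project`, `_apply_lock_paste`, `_apply_min_floor`, `_tau_for_step`, `_scale_to_tau`, `_interp_scalar`), validation block.
- `src/foundation_model/models/flexible_multi_task_model_test.py`: contract + validation tests covering A, B, C, every pair, the full A+B+C stack, and every validation path. Total: 93 passing (was 71).
- `src/foundation_model/scripts/paper_inverse_comparison.py`: threads `max_elements` / `annealing_scale` / `annealing_schedule` from each comp-config row; three new comp rows (K=3, K=5, K=5 linear) added.
- `docs/inverse_design_extension_notes.md`: rewritten from "future plans" to "implemented" with composition matrix + reproducibility cross-references.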
Test plan
(numbers match standalone smoke runs).
contracts hold; multi-objective trade-off identified as the one
non-monotone case.
Notes on API choices
about `τ`); the underlying `τ = 25**scale` mapping is documented but not
surfaced in the parameter name.
is set (rather than `≥`) ensures lock-paste always has a free slot to absorb
the `1 − Σ fixed` leftover mass.
to either (i) breaking the simplex or (ii) silently relaxing the floor — the
user gets the simplex they asked for, with one position below floor only on
rows where the model genuinely has no above-floor alternative.
🤖 Generated with Claude Code