feat(inverse-design): cardinality cap + fixed-amount pinning + min-weight floor by TsumiNa · Pull Request #20 · TsumiNa/foundation_model

TsumiNa · 2026-05-26T07:47:56Z

Summary

Adds three orthogonal per-element constraint knobs to optimize_composition (the
differentiable composition-space inverse-design entry point), plus a unified
single-knob annealing API for the new cardinality cap's iterative-softmax schedule.

The three new knobs

A `max_elements: int | None` — at most K non-zero elements per recipe.
Implemented via a differentiable Plötz–Roth iterative-softmax K-hot mask
(not a post-hoc top-K projection) so the cardinality constraint participates in
the optimisation throughout via a temperature-annealing schedule.
B `fixed_amounts: {symbol: float}` — pin specific elements at user-given
absolute amounts (e.g. `{"Au": 0.65, "Ga": 0.20}`); the optimiser distributes
the remaining mass freely. Reuses the existing lock-paste machinery; does not
require `initial_weights` (values come straight from the kwarg).
C `min_nonzero_weight: float` — drop unlocked positions with
`0 < w < floor` and redistribute the mass; per-row safe fallback when the
floor would empty a row's unlocked mass (preserves the simplex invariant).
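
As a rough illustration of knob A, the iterative-softmax relaxation can be sketched in NumPy as below. This is a minimal sketch in the style of the Plötz–Roth construction, not the PR's `_soft_topk_mask`: the function names, the clipping constant and the example logits are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def soft_topk_mask(logits, k, tau):
    """Hypothetical soft K-hot mask built from k successive softmaxes.

    Each pass picks a soft "winner"; log(1 - w) then suppresses that
    winner so the next pass favours a different position.
    """
    alpha = np.asarray(logits, dtype=float).copy()
    mask = np.zeros_like(alpha)
    for _ in range(k):
        w = softmax(alpha / tau)
        mask += w
        # Clip to avoid log(0) once a position is fully selected.
        alpha = alpha + np.log(np.clip(1.0 - w, 1e-12, None))
    return mask


logits = np.array([[2.0, 1.0, 0.5, -1.0, -2.0]])
soft = soft_topk_mask(logits, k=2, tau=5.0)    # high temperature: diffuse mask
sharp = soft_topk_mask(logits, k=2, tau=0.01)  # low temperature: near 2-hot mask
```

Because every pass adds one softmax, each row of the mask sums to `k` at any temperature; lowering `tau` only changes how concentrated that mass is.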

The unified annealing API (A's schedule)

Three old kwargs (`topk_tau_start`, `topk_tau_end`, `topk_schedule`) collapsed
into two:

`annealing_scale: float ∈ [0, 1]` (default 0.5) — single softness knob; maps
to `τ_start = 25**scale` (0 → 1, 0.5 → 5, 1 → 25). 0.5 is the safe
default calibrated against an 80-config sweep on the inverse-design model.
`annealing_schedule: dict` — advanced piecewise override
(`{step, tau, annealing_func}` parallel lists with per-segment normalised
scales and interpolation function). Tail falls back to default geometric
schedule when `step[-1] < 1.0`.
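
The mapping from `annealing_scale` to the starting temperature, and one plausible reading of the default geometric schedule, might look like the sketch below. The helper names are hypothetical, and the end temperature of 0.01 is an assumption taken from a code comment quoted later in this PR ("τ_end ≈ 0.01"); the real `_scale_to_tau` / `_tau_for_step` helpers may differ.

```python
import math


def scale_to_tau(scale: float, base: float = 25.0) -> float:
    """Map a normalised scale in [0, 1] to a temperature: tau = base ** scale."""
    if not 0.0 <= scale <= 1.0:
        raise ValueError(f"scale must lie in [0, 1], got {scale}")
    return base ** scale


def geometric_tau(step: int, steps: int, tau_start: float, tau_end: float = 0.01) -> float:
    """Geometric interpolation from tau_start (step 0) to tau_end (last step).

    tau_end = 0.01 is an assumed value, not a documented default.
    """
    if steps <= 1:
        return tau_start
    frac = step / (steps - 1)
    return tau_start * (tau_end / tau_start) ** frac


tau_start = scale_to_tau(0.5)  # the default scale
schedule = [geometric_tau(t, 100, tau_start) for t in range(100)]
```

With the default scale of 0.5 the schedule starts at τ = 5 and decays by a constant factor per step.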

Behavioural evidence

Reproduced via scripts now tracked under `logs/` (outputs are git-ignored):

13/13 algorithm-contract checks pass across two paper scenarios
(`max_elements` enforces ≤ K; `fixed_amounts` holds exactly; floor
respected; combinations compose; row sums always = 1.0).
11/12 monotone-intuition checks pass. The single "failure" is a
legitimate multi-objective trade-off (FE flat while klat improves with K
under fixed Au+Ga), not a bug.
Three paper scenarios re-run with the new constraint bars added — the
existing 5 comp + 3 latent bars are byte-identical to before this PR.

See `docs/inverse_design_extension_notes.md` for the full code map.

Files changed

`src/foundation_model/models/flexible_multi_task_model.py` — three new kwargs,
helpers (`_soft_topk_mask`, `_hard_topk_project`, `_apply_lock_paste`,
`_apply_min_floor`, `_tau_for_step`, `_scale_to_tau`, `_interp_scalar`),
validation block.
`src/foundation_model/models/flexible_multi_task_model_test.py` — +21 tests
covering A, B, C, every pair, the full A+B+C stack, and every validation path.
Total: 93 passing (was 71).
`src/foundation_model/scripts/paper_inverse_comparison.py` — thread
`max_elements` / `annealing_scale` / `annealing_schedule` from each comp-config
row; three new comp rows (K=3, K=5, K=5 linear) added.
`docs/inverse_design_extension_notes.md` — rewritten from "future plans" to
"implemented" with composition matrix + reproducibility cross-references.
`docs/inverse_design_algorithms.md` — new kwargs added to the math doc.
`README.md`, `ARCHITECTURE.md` — brief pointer to the new constraint surface.
`logs/*.py` — reproducibility scripts (sweep, smoke, behavioural intuition).
`.gitignore` — exclude raw run artefacts under `logs/` while tracking the `.py`.

Test plan

Full test suite passes (`uv run pytest` — 93/93 green).
Ruff lint clean.
Three paper scenarios run end-to-end with the new comp bars added
(numbers match standalone smoke runs).
Behavioural sweep on real model (`logs/eval_abc_intuition.py`) — all
contracts hold; multi-objective trade-off identified as the one
non-monotone case.
Existing comp + latent bars byte-identical to pre-PR results.

Notes on API choices

`annealing_scale` is a deliberately opaque knob (users don't need to think
about `τ`); the underlying `τ = 25**scale` mapping is documented but not
surfaced in the parameter name.
A's strict requirement `max_elements > n_locked_total` when `fixed_amounts`
is set (rather than `≥`) ensures lock-paste always has a free slot to absorb
the `1 − Σ fixed` leftover mass.
C's per-row fallback (skip floor when it would empty unlocked) is preferable
to either (i) breaking the simplex or (ii) silently relaxing the floor — the
user gets the simplex they asked for, with one position below floor only on
rows where the model genuinely has no above-floor alternative.
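
The per-row fallback for C can be sketched as follows. This is an illustrative NumPy version, not the PR's `_apply_min_floor`: the function name, the proportional redistribution rule and the example rows are assumptions.

```python
import numpy as np


def apply_min_floor(weights, floor, locked_mask=None):
    """Hypothetical floor: zero unlocked entries with 0 < w < floor and
    hand their mass to the surviving unlocked entries (proportionally,
    which is an assumption). Rows where no unlocked entry would survive
    are left untouched so that they still sum to 1.
    """
    w = np.array(weights, dtype=float)
    if locked_mask is None:
        locked = np.zeros(w.shape, dtype=bool)
    else:
        locked = np.broadcast_to(np.asarray(locked_mask, dtype=bool), w.shape)
    out = w.copy()
    for r in range(w.shape[0]):
        free = ~locked[r]
        drop = free & (w[r] > 0) & (w[r] < floor)
        keep = free & (w[r] >= floor)
        if not drop.any() or not keep.any():
            continue  # nothing to drop, or fallback: skip the floor on this row
        free_mass = w[r, free].sum()
        out[r, drop] = 0.0
        out[r, keep] = w[r, keep] * free_mass / w[r, keep].sum()
    return out


weights = np.array([[0.50, 0.30, 0.17, 0.03],
                    [0.97, 0.03, 0.00, 0.00]])
locked = np.array([[False, False, False, False],
                   [True, False, False, False]])
floored = apply_min_floor(weights, floor=0.05, locked_mask=locked)
```

In the first row the 0.03 entry is dropped and its mass is shared among the other three. In the second row the only unlocked non-zero entry is below the floor, so the fallback leaves the row as it was.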

🤖 Generated with Claude Code

…ight floor

Adds three orthogonal per-element constraint knobs to optimize_composition, plus a single-knob unified annealing API for the cardinality cap's iterative-softmax schedule.

A: max_elements (cardinality cap)
Differentiable Plötz-Roth iterative soft top-K mask multiplied with softmax(logits), temperature-annealed during optimisation; final hard top-K projection guarantees the returned recipe is exactly K-hot (subject to floor C). Annealing API consolidated into:
annealing_scale: float in [0, 1] (default 0.5; maps to tau_start = 25**scale, calibrated against an 80-config sweep on the inverse-design fine-tuned model)
annealing_schedule: optional dict for advanced piecewise control ({step, tau, annealing_func} parallel lists; tail falls back to default geometric schedule if step[-1] < 1.0)

B: fixed_amounts (user-pinned absolute amounts)
Mapping[symbol, float] to pin specific elements at user-given values; reuses the existing lock-paste machinery used by element_step_scale=0 (validated disjoint). Unlike the step_scale=0 path, fixed_amounts does NOT require initial_weights; values come straight from the kwarg.

C: min_nonzero_weight (trace-amount floor)
Drops unlocked positions with 0 < w < floor and redistributes their mass to the remaining unlocked positions. Per-row safe fallback when all unlocked would drop: the row is left unfloored to keep the simplex invariant.

Orthogonal composition: every pair and the full A+B+C stack is contract-tested and behaviourally validated on two paper scenarios. Validation rules enforce the few genuine impossibilities (e.g. fixed amounts below floor) up front.

paper_inverse_comparison.py threads max_elements / annealing_scale / annealing_schedule from each comp-config row; three new comp rows added (K=3, K=5, K=5 linear). The existing 5 comp + 3 latent bars are byte-identical to before.

Documentation refreshed:
* docs/inverse_design_extension_notes.md: rewritten from "future plans" to "implemented" with composition matrix + behavioural-evidence cross-references.
* docs/inverse_design_algorithms.md: new kwargs added to the constraints and main-parameters tables.
* README.md, ARCHITECTURE.md: short pointer to the new constraint surface.

Reproducibility scripts under logs/ (.py files tracked, .json/.png/.log outputs git-ignored):
* sweep_tau_schedule.py + plot_sweep.py: annealing calibration grid
* test_max_elements_smoke.py: minimal A smoke test
* eval_combined_abc.py + plot_combined_abc.py: pair/full-stack chart
* eval_abc_intuition.py + plot_abc_intuition.py: behavioural sweep with PASS/FAIL intuition checks (this run: 13/13 contracts + 11/12 monotone-intuition checks; the single failure is a multi-objective trade-off, not a bug)

Tests: 93 passing (was 71; +21 contract + validation tests covering each of A, B, C and their combinations + endpoints of the annealing knob). All existing tests still pass byte-identical numerically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR extends the inverse-design composition optimizer (FlexibleMultiTaskModel.optimize_composition) with three orthogonal per-element constraint knobs (cardinality cap, fixed-amount pinning, min-weight floor) and introduces a unified annealing API for the cardinality constraint’s iterative-softmax schedule. It also threads the new knobs through the paper comparison script, updates documentation, adds extensive tests, and adds reproducibility scripts under logs/ (with outputs git-ignored).

Changes:

Add max_elements, annealing_scale / annealing_schedule, fixed_amounts, and min_nonzero_weight to optimize_composition, plus helper logic for soft/hard top-K, lock-paste, and floor enforcement.
Add test coverage for A/B/C constraints, their combinations, and validation paths.
Update paper/evaluation scripts and docs to expose and explain the new constraint surface; adjust .gitignore to ignore generated logs/ artifacts.

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 4 comments.


File	Description
src/foundation_model/scripts/paper_inverse_comparison.py	Adds new composition config rows for `max_elements` and threads `max_elements` / `annealing_*` kwargs into `optimize_composition` calls.
src/foundation_model/models/flexible_multi_task_model.py	Implements constraint surface + annealing schedule logic inside `optimize_composition`.
src/foundation_model/models/flexible_multi_task_model_test.py	Adds extensive contract/validation tests for max-elements, fixed pinning, and floor behavior (alone and in combination).
README.md	Documents the new `optimize_composition` constraint knobs at a high level.
logs/test_max_elements_smoke.py	Adds a real-model smoke script to validate `max_elements` behavior and trajectory annealing qualitatively.
logs/sweep_tau_schedule.py	Adds a sweep script (currently still uses removed `topk_*` kwargs; needs updating).
logs/plot_sweep.py	Plotting utility for the sweep JSON output.
logs/plot_combined_abc.py	Plotting utility for combined A/B/C evaluation results.
logs/plot_abc_intuition.py	Plotting utility for intuition-sweep JSON output.
logs/eval_combined_abc.py	End-to-end evaluation script for individual and combined A/B/C constraints.
logs/eval_abc_intuition.py	Behavioral sweep + intuition checks across scenarios for A/B/C and combinations.
docs/inverse_design_extension_notes.md	Updates the code map to reflect implemented A/B/C constraints and scheduling.
docs/inverse_design_algorithms.md	Updates the math/algorithm doc to include the new kwargs and constraints.
ARCHITECTURE.md	Adds a brief pointer to the new constraint knobs in the architecture overview.
.gitignore	Ignores generated `logs/*.json

Comments suppressed due to low confidence (1)

src/foundation_model/models/flexible_multi_task_model.py:3144

This comment still refers to topk_tau_start, but the implementation now derives τ from annealing_scale / annealing_schedule. Please update the wording to match the new API (e.g. “initial scoring uses the softest τ at step 0 of the annealing schedule”).

            # --- Record initial scores --------------------------------------------------------------
            # Initial scoring uses ``topk_tau_start`` so the t=0 view matches the softest end of
            # the annealing schedule (the optimisation actually starts there).
            current_tau[0] = _tau_for_step(0)

+            τ is annealed from ``topk_tau_start`` to ``topk_tau_end`` over ``steps``. The
+            annealing doubles as a continuation method that helps escape local optima.


+                else:
+                    locked_mask = locked_mask | fixed_mask_dev  # validated disjoint
+                    locked_w0 = locked_w0 + fixed_w0_batch
+


+            # a hard top-K projection so the returned ``optimized_weights`` has *exactly* K
+            # non-zero positions — at τ_end ≈ 0.01 the soft mask is already near-K-hot, so the
+            # projection just cleans up the residual sub-threshold weights.


+    if max_elements is not None:
+        kwargs.update(
+            max_elements=max_elements,
+            topk_tau_start=tau_start,
+            topk_tau_end=tau_end,
+            topk_schedule=schedule,
+        )
+    res = model.optimize_composition(kernel, **kwargs)


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3e07cea92e


chatgpt-codex-connector · 2026-05-26T07:52:51Z

+            if n_locked_pre > max_elements:
+                raise ValueError(
+                    f"max_elements={max_elements} is smaller than the number of hard-locked "
+                    f"elements ({n_locked_pre}, counting element_step_scale=0 ∪ fixed_amounts). "
+                    "Locked elements always count toward K — raise max_elements or unlock some."
+                )


Reject fully occupied K-hot lock sets with leftover mass

This validation allows max_elements == n_locked_pre for element_step_scale=0 locks, but that is invalid whenever the locked seed weights sum to less than 1. In that case the soft top-K path selects only locked indices (K - n_locked = 0), and the later lock-paste step has no unlocked support to absorb 1 - Σlocked, so returned rows can sum to < 1 instead of staying on the simplex. This is reachable with locked seeds like 0.2/0.2/0.2 and max_elements=3, and it silently distorts optimization by scaling descriptors down.
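
The failure mode described above can be reproduced with a toy lock-paste step. The sketch below is an assumption about how such a step might behave, not the code under review; `lock_paste`, `check_cardinality` and the example vectors are hypothetical.

```python
import numpy as np


def lock_paste(free_weights, locked_mask, locked_values):
    """Hypothetical lock-paste: pin locked entries to their values and
    rescale the unlocked entries to fill the leftover 1 - sum(locked)."""
    free = np.where(locked_mask, 0.0, free_weights)
    leftover = 1.0 - locked_values[locked_mask].sum()
    total = free.sum()
    if total > 0:
        free = free * (leftover / total)
    # If total == 0 there is no unlocked support, so the leftover mass is lost.
    return np.where(locked_mask, locked_values, free)


def check_cardinality(max_elements, n_locked):
    """Strict rule: at least one unlocked slot must remain inside the top-K."""
    if max_elements <= n_locked:
        raise ValueError(
            f"max_elements={max_elements} must exceed the number of locked "
            f"elements ({n_locked}) so the leftover mass has somewhere to go."
        )


locked_mask = np.array([True, True, True, False, False])
locked_values = np.array([0.2, 0.2, 0.2, 0.0, 0.0])

# K == n_locked: the K-hot mask keeps only the locked positions.
masked_k3 = np.array([0.5, 0.3, 0.2, 0.0, 0.0])
row_k3 = lock_paste(masked_k3, locked_mask, locked_values)

# K == n_locked + 1: one unlocked slot survives and absorbs the leftover.
masked_k4 = np.array([0.4, 0.3, 0.2, 0.1, 0.0])
row_k4 = lock_paste(masked_k4, locked_mask, locked_values)
```

Here `row_k3` sums to 0.6 because nothing can absorb the remaining 0.4, while `row_k4` stays on the simplex; rejecting `max_elements <= n_locked` up front rules out the first case.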


chatgpt-codex-connector · 2026-05-26T07:52:51Z

+            topk_tau_start=tau_start,
+            topk_tau_end=tau_end,
+            topk_schedule=schedule,


Replace removed top-K kwargs in tau sweep script

The sweep script still calls optimize_composition with topk_tau_start, topk_tau_end, and topk_schedule, but this commit removed those parameters in favor of annealing_scale and annealing_schedule. As soon as max_elements is set, this call path now raises an unexpected-keyword TypeError, so the new reproducibility script cannot run.


External reviewers (Copilot + Codex) and self-review flagged 8 items. All addressed:

Real bugs:
* `element_step_scale=0` locks + `fixed_amounts` together could produce `locked_w0.sum > 1.0` and silently break the simplex. Added runtime check on the combined locked mass with a clear error message.
* `max_elements == n_locked_pre` for the step_scale=0 path was previously allowed but silently produced row sums < 1 (the lock-paste had no unlocked slot for the leftover mass). Tightened to strict `max_elements > n_locked` for both lock sources, matching the existing fixed_amounts rule. New contract test pins the rejection.
* `logs/sweep_tau_schedule.py` still used the removed `topk_*` kwargs. Adapted to the new annealing API: raw τ → normalised scale via `log(τ)/log(25)`, with schedule type expressed via `annealing_schedule` dict when not geometric.

Wording / consistency:
* Renamed `annealing_schedule["tau"]` → `["scale"]` everywhere (model code, validation messages, tests, paper-comparison config row, smoke/eval scripts, docs). The field always held normalised scales in [0, 1]; the old name was misleading.
* Updated stale docstring + comment references to `topk_tau_start` / `topk_tau_end` (the old kwarg names were removed earlier in this PR).
* Changed the final-state comment from "exactly K" to "at most K" to match the contract (the floor can reduce nz below K).
* Added a docstring note for `min_nonzero_weight` calling out the `floor > 1/n_components` corner case (per-row fallback fires silently).

Tests:
* New: K=1 contract test (smallest cardinality; exercises the n_iter=1 branch of the iterative softmax, previously untested).
* New: combined-lock-sum > 1.0 rejection.
* New: `max_elements == n_locked` rejection (replaces the previous "== allowed" branch).
* Total: 96 passing (+3 from this fixup).

Ruff auto-formatted a handful of files (no logic changes). Two unrelated unused-import removals in `kmd_plus_test.py` / `finetune_inverse_heads_test.py` have been reverted to keep the PR scope tight.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
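
The τ → scale conversion mentioned for the sweep script is the inverse of `τ = 25**scale`. A minimal sketch, assuming a hypothetical helper name and an assumed policy of rejecting temperatures outside [1, 25] (the actual script may handle out-of-range values differently):

```python
import math


def tau_to_scale(tau: float, base: float = 25.0) -> float:
    """Invert tau = base ** scale, i.e. scale = log(tau) / log(base)."""
    if tau <= 0:
        raise ValueError(f"tau must be positive, got {tau}")
    scale = math.log(tau) / math.log(base)
    # Rejecting out-of-range values is an assumption, not the script's documented behaviour.
    if not -1e-12 <= scale <= 1.0 + 1e-12:
        raise ValueError(f"tau={tau} maps to scale={scale:.3f}, outside [0, 1]")
    return min(max(scale, 0.0), 1.0)


scales = [tau_to_scale(t) for t in (1.0, 5.0, 25.0)]
```

The three temperatures above map back to the scale values 0, 0.5 and 1 quoted in the PR description.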

Copilot AI review requested due to automatic review settings May 26, 2026 07:47

Copilot started reviewing on behalf of TsumiNa May 26, 2026 07:48

Copilot AI reviewed May 26, 2026


chatgpt-codex-connector Bot reviewed May 26, 2026


TsumiNa merged commit e5cca4a into master May 26, 2026

TsumiNa merged 2 commits into master from feat/composition-constraints-abc
