feat(inverse-design): cardinality cap + fixed-amount pinning + min-weight floor #20

Merged
TsumiNa merged 2 commits into master from feat/composition-constraints-abc on May 26, 2026

@TsumiNa TsumiNa commented May 26, 2026

Summary

Adds three orthogonal per-element constraint knobs to optimize_composition (the
differentiable composition-space inverse-design entry point), plus a unified
single-knob annealing API for the new cardinality cap's iterative-softmax schedule.

The three new knobs

  • A `max_elements: int | None` — at most K non-zero elements per recipe.
    Implemented via a differentiable Plötz–Roth iterative-softmax K-hot mask
    (not a post-hoc top-K projection) so the cardinality constraint participates in
    the optimisation throughout via a temperature-annealing schedule.
  • B `fixed_amounts: {symbol: float}` — pin specific elements at user-given
    absolute amounts (e.g. `{"Au": 0.65, "Ga": 0.20}`); the optimiser distributes
    the remaining mass freely. Reuses the existing lock-paste machinery; does not
    require `initial_weights` (values come straight from the kwarg).
  • C `min_nonzero_weight: float` — drop unlocked positions with
    `0 < w < floor` and redistribute the mass; per-row safe fallback when the
    floor would empty a row's unlocked mass (preserves the simplex invariant).
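
A small NumPy sketch may help picture the iterative-softmax mask behind `max_elements`. It is not the PR's `_soft_topk_mask`: it follows one common reading of the Plötz–Roth relaxation (K rounds of softmax, each round suppressing what earlier rounds already selected). The function name, the `1e-12` clip, and the example scores are assumptions made for illustration.

```python
import numpy as np


def soft_topk_mask(scores, k, tau):
    """Relaxed K-hot mask: the sum of k successive softmax selections.

    Illustrative only; the real helper may differ in detail.
    """
    alpha = np.asarray(scores, dtype=float).copy()
    mask = np.zeros_like(alpha)
    for _ in range(k):
        z = alpha / tau
        z = z - z.max()  # shift for numerical stability
        w = np.exp(z)
        w /= w.sum()
        mask += w
        # Suppress entries that this round already selected;
        # the clip avoids log(0) when w is numerically 1.
        alpha = alpha + np.log(np.clip(1.0 - w, 1e-12, None))
    return mask


scores = [3.0, 1.0, 2.0, 0.0]
soft = soft_topk_mask(scores, k=2, tau=25.0)   # high temperature: spread out
sharp = soft_topk_mask(scores, k=2, tau=0.01)  # low temperature: near K-hot
```

Each round contributes a softmax that sums to 1, so the mask always carries total mass K; lowering `tau` concentrates that mass on the K largest scores, which is why annealing can move smoothly from a soft mask to a nearly hard one.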

The unified annealing API (A's schedule)

Three old kwargs (`topk_tau_start`, `topk_tau_end`, `topk_schedule`) collapsed
into two:

  • `annealing_scale: float ∈ [0, 1]` (default 0.5) — single softness knob; maps
    to `τ_start = 25**scale` (0 → 1, 0.5 → 5, 1 → 25). 0.5 is the safe
    default calibrated against an 80-config sweep on the inverse-design model.
  • `annealing_schedule: dict` — advanced piecewise override
    (`{step, tau, annealing_func}` parallel lists with per-segment normalised
    scales and interpolation function). Tail falls back to default geometric
    schedule when `step[-1] < 1.0`.
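
The scale-to-temperature mapping can be written out in a few lines. The base of 25 comes from the description above. The geometric interpolation formula and the end temperature of 0.01 are assumptions (the latter borrowed from a code comment quoted in the review), and both function names are hypothetical.

```python
def scale_to_tau(scale, base=25.0):
    """Map the normalised knob in [0, 1] to a starting temperature."""
    if not 0.0 <= scale <= 1.0:
        raise ValueError(f"annealing_scale must lie in [0, 1], got {scale}")
    return base ** scale


def geometric_tau(progress, tau_start, tau_end=0.01):
    """Geometric interpolation between tau_start and tau_end.

    progress is the fraction of optimisation steps completed, in [0, 1].
    tau_end=0.01 is an assumed default, not confirmed by the PR text.
    """
    return tau_start * (tau_end / tau_start) ** progress


tau_start = scale_to_tau(0.5)  # the default knob value
taus = [geometric_tau(p, tau_start) for p in (0.0, 0.5, 1.0)]
```

With the default `annealing_scale=0.5`, the sketch starts at a temperature of 5 and decays by a constant ratio per unit of progress.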

Behavioural evidence

Reproduced via scripts now tracked under `logs/` (outputs are git-ignored):

  • 13/13 algorithm-contract checks pass across two paper scenarios
    (`max_elements` enforces ≤ K; `fixed_amounts` holds exactly; floor
    respected; combinations compose; row sums always = 1.0).
  • 11/12 monotone-intuition checks pass. The single "failure" is a
    legitimate multi-objective trade-off (FE flat while klat improves with K
    under fixed Au+Ga), not a bug.
  • Three paper scenarios re-run with the new constraint bars added — the
    existing 5 comp + 3 latent bars are byte-identical to before this PR.

See `docs/inverse_design_extension_notes.md` for the full code map.

Files changed

  • `src/foundation_model/models/flexible_multi_task_model.py` — three new kwargs,
    helpers (`_soft_topk_mask`, `_hard_topk_project`, `_apply_lock_paste`,
    `_apply_min_floor`, `_tau_for_step`, `_scale_to_tau`, `_interp_scalar`),
    validation block.
  • `src/foundation_model/models/flexible_multi_task_model_test.py` — +21 tests
    covering A, B, C, every pair, the full A+B+C stack, and every validation path.
    Total: 93 passing (was 71).
  • `src/foundation_model/scripts/paper_inverse_comparison.py` — thread
    `max_elements` / `annealing_scale` / `annealing_schedule` from each comp-config
    row; three new comp rows (K=3, K=5, K=5 linear) added.
  • `docs/inverse_design_extension_notes.md` — rewritten from "future plans" to
    "implemented" with composition matrix + reproducibility cross-references.
  • `docs/inverse_design_algorithms.md` — new kwargs added to the math doc.
  • `README.md`, `ARCHITECTURE.md` — brief pointer to the new constraint surface.
  • `logs/*.py` — reproducibility scripts (sweep, smoke, behavioural intuition).
  • `.gitignore` — exclude raw run artefacts under `logs/` while tracking the `.py`.

Test plan

  • Full test suite passes (`uv run pytest` — 93/93 green).
  • Ruff lint clean.
  • Three paper scenarios run end-to-end with the new comp bars added
    (numbers match standalone smoke runs).
  • Behavioural sweep on real model (`logs/eval_abc_intuition.py`) — all
    contracts hold; multi-objective trade-off identified as the one
    non-monotone case.
  • Existing comp + latent bars byte-identical to pre-PR results.

Notes on API choices

  • `annealing_scale` is a deliberately opaque knob (users don't need to think
    about `τ`); the underlying `τ = 25**scale` mapping is documented but not
    surfaced in the parameter name.
  • A's strict requirement `max_elements > n_locked_total` when `fixed_amounts`
    is set (rather than `≥`) ensures lock-paste always has a free slot to absorb
    the `1 − Σ fixed` leftover mass.
  • C's per-row fallback (skip floor when it would empty unlocked) is preferable
    to either (i) breaking the simplex or (ii) silently relaxing the floor — the
    user gets the simplex they asked for, with one position below floor only on
    rows where the model genuinely has no above-floor alternative.
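
As a rough illustration of the up-front rules mentioned in this PR, the checks could look like the sketch below. `check_constraints` is a hypothetical helper, not the PR's validation block; the tolerance and the error messages are assumptions.

```python
from typing import Mapping, Optional


def check_constraints(
    fixed_amounts: Mapping[str, float],
    max_elements: Optional[int] = None,
    min_nonzero_weight: float = 0.0,
) -> None:
    """Reject constraint combinations that cannot be satisfied together."""
    total = sum(fixed_amounts.values())
    if total > 1.0 + 1e-9:
        raise ValueError(f"fixed amounts sum to {total}, which exceeds 1")
    for symbol, amount in fixed_amounts.items():
        if 0.0 < amount < min_nonzero_weight:
            raise ValueError(f"{symbol}={amount} is pinned below the floor")
    # Strict inequality: at least one free slot must remain to absorb
    # the leftover mass 1 - sum(fixed).
    if max_elements is not None and fixed_amounts:
        if max_elements <= len(fixed_amounts):
            raise ValueError("max_elements must exceed the number of pinned elements")


check_constraints({"Au": 0.65, "Ga": 0.20}, max_elements=3, min_nonzero_weight=0.05)
```

The call above passes because two pinned elements leave one free slot under `max_elements=3`; with `max_elements=2` the same pins would be rejected.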

🤖 Generated with Claude Code

…ight floor

Adds three orthogonal per-element constraint knobs to optimize_composition, plus a
single-knob unified annealing API for the cardinality cap's iterative-softmax schedule.

A — max_elements (cardinality cap)
  Differentiable Plötz-Roth iterative soft top-K mask multiplied with softmax(logits),
  temperature-annealed during optimisation; final hard top-K projection guarantees the
  returned recipe is exactly K-hot (subject to floor C).

  Annealing API consolidated into:
    annealing_scale: float in [0, 1] (default 0.5; maps to tau_start = 25**scale,
      calibrated against an 80-config sweep on the inverse-design fine-tuned model)
    annealing_schedule: optional dict for advanced piecewise control
      ({step, tau, annealing_func} parallel lists; tail falls back to default
      geometric schedule if step[-1] < 1.0)

B — fixed_amounts (user-pinned absolute amounts)
  Mapping[symbol, float] to pin specific elements at user-given values; reuses the
  existing lock-paste machinery used by element_step_scale=0 (validated disjoint).
  Unlike the step_scale=0 path, fixed_amounts does NOT require initial_weights —
  values come straight from the kwarg.
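
The pinning step can be sketched as follows. This is an illustration of the idea only: `lock_paste` here is a hypothetical stand-alone function, not the PR's `_apply_lock_paste`, and proportional rescaling of the unlocked weights is an assumption.

```python
import numpy as np


def lock_paste(proposal, symbols, fixed_amounts):
    """Pin the fixed elements and rescale the rest to fill 1 - sum(fixed)."""
    w = np.asarray(proposal, dtype=float).copy()
    locked = np.array([s in fixed_amounts for s in symbols])
    leftover = 1.0 - sum(fixed_amounts.values())
    free_mass = w[~locked].sum()
    if free_mass <= 0.0:
        raise ValueError("no unlocked mass available to absorb the leftover")
    w[~locked] *= leftover / free_mass               # free elements share the leftover
    w[locked] = [fixed_amounts[s] for s in symbols if s in fixed_amounts]
    return w


symbols = ["Au", "Ga", "Cu", "In"]
recipe = lock_paste([0.1, 0.1, 0.5, 0.3], symbols, {"Au": 0.65, "Ga": 0.20})
```

The pinned entries are overwritten with the user-given values regardless of what the optimiser proposed for them, and the row still sums to 1.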

C — min_nonzero_weight (trace-amount floor)
  Drops unlocked positions with 0 < w < floor and redistributes their mass to the
  remaining unlocked positions. Per-row safe fallback when all unlocked would drop:
  the row is left unfloored to keep the simplex invariant.
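
A minimal sketch of the floor with its per-row fallback, assuming the dropped mass is redistributed proportionally among the surviving unlocked positions (the text above does not say how it is redistributed). `apply_min_floor` here is a hypothetical stand-in for the PR's `_apply_min_floor`.

```python
import numpy as np


def apply_min_floor(weights, locked_mask, floor):
    """Zero unlocked entries in (0, floor) and hand their mass to the survivors.

    Rows where no unlocked entry reaches the floor are left untouched, so
    every row keeps summing to 1.
    """
    w = np.asarray(weights, dtype=float).copy()
    unlocked = ~np.asarray(locked_mask, dtype=bool)
    for row in w:  # each row is a view, so in-place edits land in w
        drop = unlocked & (row > 0) & (row < floor)
        keep = unlocked & (row >= floor)
        kept_mass = row[keep].sum()
        if not drop.any() or kept_mass == 0.0:
            continue  # nothing to drop, or fallback: leave the row unfloored
        row[keep] *= (kept_mass + row[drop].sum()) / kept_mass
        row[drop] = 0.0
    return w


rows = [[0.65, 0.02, 0.33, 0.0],   # 0.02 is dropped, 0.33 absorbs it
        [0.98, 0.02, 0.00, 0.0]]   # only unlocked mass is sub-floor: fallback
floored = apply_min_floor(rows, locked_mask=[True, False, False, False], floor=0.05)
```

In the second row the floor is skipped rather than emptying the unlocked mass, which matches the fallback behaviour described above.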

Orthogonal composition — every pair and the full A+B+C stack is contract-tested and
behaviourally validated on two paper scenarios. Validation rules enforce the few
genuine impossibilities (e.g. fixed amounts below floor) up front.

paper_inverse_comparison.py threads max_elements / annealing_scale / annealing_schedule
from each comp-config row; three new comp rows added (K=3, K=5, K=5 linear) — the
existing 5 comp + 3 latent bars are byte-identical to before.

Documentation refreshed:
  * docs/inverse_design_extension_notes.md — rewritten from "future plans" to
    "implemented" with composition matrix + behavioural-evidence cross-references.
  * docs/inverse_design_algorithms.md — new kwargs added to the constraints and
    main-parameters tables.
  * README.md, ARCHITECTURE.md — short pointer to the new constraint surface.

Reproducibility scripts under logs/ (.py files tracked, .json/.png/.log outputs
git-ignored):
  * sweep_tau_schedule.py + plot_sweep.py — annealing calibration grid
  * test_max_elements_smoke.py — minimal A smoke test
  * eval_combined_abc.py + plot_combined_abc.py — pair/full-stack chart
  * eval_abc_intuition.py + plot_abc_intuition.py — behavioural sweep with
    PASS/FAIL intuition checks (this run: 13/13 contracts + 11/12 monotone-
    intuition checks; the single failure is a multi-objective trade-off, not a bug)

Tests: 93 passing (was 71; +21 contract + validation tests covering each of A, B,
C and their combinations + endpoints of the annealing knob). All existing tests
still pass byte-identical numerically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 26, 2026 07:47

Copilot AI left a comment

Pull request overview

This PR extends the inverse-design composition optimizer (FlexibleMultiTaskModel.optimize_composition) with three orthogonal per-element constraint knobs (cardinality cap, fixed-amount pinning, min-weight floor) and introduces a unified annealing API for the cardinality constraint’s iterative-softmax schedule. It also threads the new knobs through the paper comparison script, updates documentation, adds extensive tests, and adds reproducibility scripts under logs/ (with outputs git-ignored).

Changes:

  • Add max_elements, annealing_scale / annealing_schedule, fixed_amounts, and min_nonzero_weight to optimize_composition, plus helper logic for soft/hard top-K, lock-paste, and floor enforcement.
  • Add test coverage for A/B/C constraints, their combinations, and validation paths.
  • Update paper/evaluation scripts and docs to expose and explain the new constraint surface; adjust .gitignore to ignore generated logs/ artifacts.

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `src/foundation_model/scripts/paper_inverse_comparison.py` | Adds new composition config rows for max_elements and threads max_elements / annealing_* kwargs into optimize_composition calls. |
| `src/foundation_model/models/flexible_multi_task_model.py` | Implements constraint surface + annealing schedule logic inside optimize_composition. |
| `src/foundation_model/models/flexible_multi_task_model_test.py` | Adds extensive contract/validation tests for max-elements, fixed pinning, and floor behavior (alone and in combination). |
| `README.md` | Documents the new optimize_composition constraint knobs at a high level. |
| `logs/test_max_elements_smoke.py` | Adds a real-model smoke script to validate max_elements behavior and trajectory annealing qualitatively. |
| `logs/sweep_tau_schedule.py` | Adds a sweep script (currently still uses removed topk_* kwargs; needs updating). |
| `logs/plot_sweep.py` | Plotting utility for the sweep JSON output. |
| `logs/plot_combined_abc.py` | Plotting utility for combined A/B/C evaluation results. |
| `logs/plot_abc_intuition.py` | Plotting utility for intuition-sweep JSON output. |
| `logs/eval_combined_abc.py` | End-to-end evaluation script for individual and combined A/B/C constraints. |
| `logs/eval_abc_intuition.py` | Behavioral sweep + intuition checks across scenarios for A/B/C and combinations. |
| `docs/inverse_design_extension_notes.md` | Updates the code map to reflect implemented A/B/C constraints and scheduling. |
| `docs/inverse_design_algorithms.md` | Updates the math/algorithm doc to include the new kwargs and constraints. |
| `ARCHITECTURE.md` | Adds a brief pointer to the new constraint knobs in the architecture overview. |
| `.gitignore` | Ignores generated logs/*.json… |

Comments suppressed due to low confidence (1)

src/foundation_model/models/flexible_multi_task_model.py:3144

  • This comment still refers to topk_tau_start, but the implementation now derives τ from annealing_scale / annealing_schedule. Please update the wording to match the new API (e.g. “initial scoring uses the softest τ at step 0 of the annealing schedule”).
            # --- Record initial scores --------------------------------------------------------------
            # Initial scoring uses ``topk_tau_start`` so the t=0 view matches the softest end of
            # the annealing schedule (the optimisation actually starts there).
            current_tau[0] = _tau_for_step(0)

Comment on lines +2405 to +2406
τ is annealed from ``topk_tau_start`` to ``topk_tau_end`` over ``steps``. The
annealing doubles as a continuation method that helps escape local optima.
else:
    locked_mask = locked_mask | fixed_mask_dev  # validated disjoint
    locked_w0 = locked_w0 + fixed_w0_batch

Comment on lines +3185 to +3187
# a hard top-K projection so the returned ``optimized_weights`` has *exactly* K
# non-zero positions — at τ_end ≈ 0.01 the soft mask is already near-K-hot, so the
# projection just cleans up the residual sub-threshold weights.
Comment on lines +98 to +105
if max_elements is not None:
    kwargs.update(
        max_elements=max_elements,
        topk_tau_start=tau_start,
        topk_tau_end=tau_end,
        topk_schedule=schedule,
    )
res = model.optimize_composition(kernel, **kwargs)

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3e07cea92e

Comment on lines +2704 to +2709
if n_locked_pre > max_elements:
    raise ValueError(
        f"max_elements={max_elements} is smaller than the number of hard-locked "
        f"elements ({n_locked_pre}, counting element_step_scale=0 ∪ fixed_amounts). "
        "Locked elements always count toward K — raise max_elements or unlock some."
    )

P1: Reject fully occupied K-hot lock sets with leftover mass

This validation allows max_elements == n_locked_pre for element_step_scale=0 locks, but that is invalid whenever the locked seed weights sum to less than 1. In that case the soft top-K path selects only locked indices (K - n_locked = 0), and the later lock-paste step has no unlocked support to absorb 1 - Σlocked, so returned rows can sum to < 1 instead of staying on the simplex. This is reachable with locked seeds like 0.2/0.2/0.2 and max_elements=3, and it silently distorts optimization by scaling descriptors down.
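
The arithmetic behind this report can be reproduced with a toy example. The sketch below is a simplified stand-in for the top-K plus lock-paste path, not the model code: `capped_row` is hypothetical, and treating every positive locked seed as a locked position is an assumption made to keep the example short.

```python
import numpy as np


def capped_row(locked, proposal, k):
    """Keep locked seeds, then give the leftover mass to the best free slots."""
    locked = np.asarray(locked, dtype=float)
    proposal = np.asarray(proposal, dtype=float)
    is_locked = locked > 0
    free_slots = min(k - int(is_locked.sum()), int((~is_locked).sum()))
    row = locked.copy()
    if free_slots > 0:
        # Rank only unlocked positions; locked ones are pushed to the end.
        ranked = np.argsort(-np.where(is_locked, -np.inf, proposal))
        keep = ranked[:free_slots]
        leftover = 1.0 - locked.sum()
        row[keep] = leftover * proposal[keep] / proposal[keep].sum()
    return row


locked = [0.2, 0.2, 0.2, 0.0, 0.0]
proposal = [0.1, 0.1, 0.1, 0.4, 0.3]
print(capped_row(locked, proposal, k=3).sum())  # roughly 0.6: the leftover 0.4 has nowhere to go
print(capped_row(locked, proposal, k=4).sum())  # roughly 1.0: one free slot absorbs the leftover
```

With `k` equal to the number of locked positions there is no free slot, so the row keeps only the locked mass; one extra slot is enough to restore the simplex.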

Comment thread logs/sweep_tau_schedule.py Outdated
Comment on lines +101 to +103
topk_tau_start=tau_start,
topk_tau_end=tau_end,
topk_schedule=schedule,

P2: Replace removed top-K kwargs in tau sweep script

The sweep script still calls optimize_composition with topk_tau_start, topk_tau_end, and topk_schedule, but this commit removed those parameters in favor of annealing_scale and annealing_schedule. As soon as max_elements is set, this call path now raises an unexpected-keyword TypeError, so the new reproducibility script cannot run.

External reviewers (Copilot + Codex) and self-review flagged 8 items. All addressed:

Real bugs:
* `element_step_scale=0` locks + `fixed_amounts` together could produce
  `locked_w0.sum > 1.0` and silently break the simplex. Added runtime check on
  the combined locked mass with a clear error message.
* `max_elements == n_locked_pre` for the step_scale=0 path was previously
  allowed but silently produced row sums < 1 (the lock-paste had no unlocked
  slot for the leftover mass). Tightened to strict `max_elements > n_locked`
  for both lock sources, matching the existing fixed_amounts rule. New
  contract test pins the rejection.
* `logs/sweep_tau_schedule.py` still used the removed `topk_*` kwargs. Adapted
  to the new annealing API: raw τ → normalised scale via `log(τ)/log(25)`, with
  schedule type expressed via `annealing_schedule` dict when not geometric.
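
The raw-τ-to-scale conversion mentioned for the sweep script is the inverse of `τ = 25**scale`. A minimal sketch, with a hypothetical function name and an assumed tolerance for the range check:

```python
import math


def tau_to_scale(tau, base=25.0):
    """Convert a raw temperature into the normalised scale log(tau)/log(base)."""
    scale = math.log(tau) / math.log(base)
    if not -1e-12 <= scale <= 1.0 + 1e-12:
        raise ValueError(f"tau={tau} falls outside the range covered by [0, 1]")
    return min(max(scale, 0.0), 1.0)  # clamp away floating-point spill


scales = [tau_to_scale(t) for t in (1.0, 5.0, 25.0)]
```

Temperatures below 1 or above 25 have no counterpart in the normalised range, so the sketch rejects them instead of clamping silently.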

Wording / consistency:
* Renamed `annealing_schedule["tau"]` → `["scale"]` everywhere (model code,
  validation messages, tests, paper-comparison config row, smoke/eval scripts,
  docs). The field always held normalised scales in [0, 1] — the old name was
  misleading.
* Updated stale docstring + comment references to `topk_tau_start` /
  `topk_tau_end` (the old kwarg names were removed earlier in this PR).
* Changed the final-state comment from "exactly K" to "at most K" to match
  the contract (the floor can reduce nz below K).
* Added a docstring note for `min_nonzero_weight` calling out the
  `floor > 1/n_components` corner case (per-row fallback fires silently).

Tests:
* New: K=1 contract test (smallest cardinality; exercises the n_iter=1 branch
  of the iterative softmax — previously untested).
* New: combined-lock-sum > 1.0 rejection.
* New: `max_elements == n_locked` rejection (replaces the previous "==
  allowed" branch).
* Total: 96 passing (+3 from this fixup).

Ruff auto-formatted a handful of files (no logic changes). Two unrelated
unused-import removals in `kmd_plus_test.py` / `finetune_inverse_heads_test.py`
have been reverted to keep the PR scope tight.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@TsumiNa TsumiNa merged commit e5cca4a into master May 26, 2026