[REFACTOR][MANIFEST] drop generative-op carrier-scalar; allow empty signature.inputs #1440

@lcy-seso

Description

Symptom / Motivation

The current manifest schema enforces len(signature.inputs) >= 1 via test_every_signature_has_inputs_and_outputs. Two elementwise-family ops (AlibiFwdOp, SinusoidalFwdOp) synthesize their output entirely from construction-time scalar parameters and take zero tensor inputs. To satisfy the schema, both manifest entries declare a fake device_carrier 0-dim scalar tensor as their sole input; the carrier exists only as a workaround. The same carrier is threaded through _register_generative_custom_op, and the validator carries an is_generative carve-out that detects the pattern via ref_api == "none" + 0 positional args and skips the L1/L4 alignment checks. The hack documents itself in the manifest YAML preamble; the design doc previously enshrined it as well (removed in #1439).

The real defect is one schema invariant being too strict, not a missing "generative op kind". Every op has inputs, but they are not always tensors. ALiBi and sinusoidal take integer and dtype params as their real inputs.

Root Cause Analysis

  • scripts/validate_manifest.py:test_every_signature_has_inputs_and_outputs requires len(signature.inputs) >= 1. Generative ops have no tensor inputs, so they cannot satisfy it without a placeholder.
  • scripts/validate_manifest.py threads is_generative through check_l1 and L4 (lines around 542, 836-862, 3398-3408) to skip the inputs↔forward positional alignment. The flag is derived from a ref_api == "none" + 0 positional heuristic.
  • tileops/ops/elementwise/_base.py:_register_generative_custom_op registers the op with device_carrier: Tensor as its first arg so torch.compile / register_fake has a tensor to trace.
  • tileops/ops/elementwise/alibi.py and tileops/ops/elementwise/sinusoidal.py allocate self._device_carrier = torch.empty((), dtype=dtype, device="cuda") in __init__ and pass it to the custom op call in forward().
  • tileops/manifest/elementwise_generative.yaml declares device_carrier as the sole input on both entries; output dtype uses same_as(device_carrier) to channel dtype through the carrier.

The cleanup vector is: relax the schema invariant, drop the carrier and its plumbing, simplify the validator.

Related Files

  • scripts/validate_manifest.py
  • tileops/ops/elementwise/_base.py
  • tileops/ops/elementwise/alibi.py
  • tileops/ops/elementwise/sinusoidal.py
  • tileops/manifest/elementwise_generative.yaml
  • tests/test_validate_manifest.py
  • tests/ops/test_special_elementwise.py

Goal

Replace the carrier-scalar hack with the minimum-correct implementation that allows signature.inputs: {} for ops whose output is fully derived from construction-time params. After this change, no "generative-op" carve-out concept exists anywhere — the schema is permissive enough that the alignment checks become natural no-ops on empty input lists.
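To illustrate why no carve-out is needed, here is a minimal sketch of a positional alignment check. The function name check_inputs_align_with_forward and its arguments are hypothetical and are not taken from scripts/validate_manifest.py; the sketch only assumes that the real check pairs manifest input names with forward() positional parameter names.

```python
from typing import Dict, List


def check_inputs_align_with_forward(
    inputs: Dict[str, dict], forward_params: List[str]
) -> List[str]:
    """Hypothetical stand-in for the inputs-to-forward alignment check.

    Returns a list of error strings; an empty list means the check passed.
    """
    errors = []
    input_names = list(inputs)
    if len(input_names) != len(forward_params):
        errors.append(
            f"expected {len(input_names)} positional args, got {len(forward_params)}"
        )
    # With inputs == {} this loop body never runs, so no special case is needed.
    for position, (declared, actual) in enumerate(zip(input_names, forward_params)):
        if declared != actual:
            errors.append(
                f"arg {position}: manifest has {declared!r}, forward has {actual!r}"
            )
    return errors


# A tensor-input op: names must line up with forward(x, gate).
print(check_inputs_align_with_forward({"x": {}, "gate": {}}, ["x", "gate"]))
# An op with no tensor inputs and a zero-argument forward().
print(check_inputs_align_with_forward({}, []))
```

Both calls print an empty list. The empty-inputs case passes through the same code path as the ordinary case, so the validator needs no flag to distinguish the two.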

Plan

  1. Relax scripts/validate_manifest.py:test_every_signature_has_inputs_and_outputs from len(signature.inputs) >= 1 to len(signature.outputs) >= 1 AND (len(signature.inputs) >= 1 OR len(signature.params) >= 1). Add a test that an op with inputs: {} plus non-empty params is accepted.
  2. Remove the is_generative parameter from check_l1 and L4 in scripts/validate_manifest.py. Remove the ref_api == "none" + 0 positional heuristic detection (around lines 836-862 and 3398-3408). The inputs↔forward positional alignment loop becomes a natural no-op when signature.inputs is empty.
  3. Remove all "Generative-op carve-out" comments from scripts/validate_manifest.py.
  4. Update tileops/manifest/elementwise_generative.yaml: change inputs: { device_carrier: ... } to inputs: {} on both entries. Promote dtype from a fake same_as(device_carrier) channel to an explicit params: dtype: {type: torch.dtype} entry. Update output dtype expression accordingly (e.g., same_as(dtype) or direct reference). Delete the file's preamble comment block describing the carve-out.
  5. Replace tileops/ops/elementwise/_base.py:_register_generative_custom_op with a tensor-input-less impl. Two acceptable approaches: (a) drop the custom_op registration entirely and have forward() call the kernel directly (eager-only; the ops cannot be torch.compile-graph-captured); or (b) register via torch.library.impl directly without going through dispatcher tracing. Pick (a) for this PR unless graph capture is required by a downstream consumer.
  6. Delete self._device_carrier allocation in tileops/ops/elementwise/alibi.py:__init__ and tileops/ops/elementwise/sinusoidal.py:__init__. Simplify forward() to call the kernel directly without passing a carrier.
  7. Delete validator tests that assert the carve-out fires (in tests/test_validate_manifest.py); add one test confirming inputs: {} is accepted by the relaxed invariant.
  8. Run pytest tests/ops/test_special_elementwise.py to confirm AlibiFwdOp / SinusoidalFwdOp still produce correct output without the carrier path.
  9. Run python scripts/validate_manifest.py and pytest tests/test_validate_manifest.py — both pass.
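The relaxed invariant from step 1 could look roughly like the sketch below. This is an illustration under stated assumptions, not the validator's actual code: signature_errors, the num_heads param, and the out output key are hypothetical names, and the signature is assumed to be a plain dict loaded from the manifest YAML.

```python
from typing import Any, Dict, List


def signature_errors(signature: Dict[str, Any]) -> List[str]:
    """Hypothetical check mirroring the relaxed invariant from step 1."""
    inputs = signature.get("inputs") or {}
    outputs = signature.get("outputs") or {}
    params = signature.get("params") or {}
    errors = []
    if len(outputs) < 1:
        errors.append("signature.outputs must be non-empty")
    if len(inputs) < 1 and len(params) < 1:
        errors.append("signature needs at least one input or one param")
    return errors


def test_signature_inputs_may_be_empty_when_params_present() -> None:
    # Shape assumed for an ALiBi-like entry: no tensor inputs, dtype as a param.
    signature = {
        "inputs": {},
        "params": {"num_heads": {"type": "int"}, "dtype": {"type": "torch.dtype"}},
        "outputs": {"out": {"dtype": "same_as(dtype)"}},
    }
    assert signature_errors(signature) == []


def test_signature_rejects_no_inputs_and_no_params() -> None:
    # An op with neither inputs nor params has nothing to derive its output from.
    signature = {"inputs": {}, "params": {}, "outputs": {"out": {}}}
    assert signature_errors(signature) == [
        "signature needs at least one input or one param"
    ]
```

The second test keeps the invariant meaningful: an entry with neither inputs nor params is still rejected, so relaxing the rule does not make the schema vacuous.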

Constraints

  • Joint manifest + validator + ops change. The schema relaxation and the impl changes that depend on it land in the same PR — splitting them would leave the manifest mid-state for one round.
  • MUST NOT touch other manifest entries that currently use ref_api: "none" with real tensor inputs (the 11 attention.yaml entries and 3 elementwise_fused_gated.yaml entries). They are not affected.
  • MUST NOT introduce a new "generative op kind" concept anywhere. The fix is schema relaxation, not categorization.
  • MUST keep AlibiFwdOp / SinusoidalFwdOp working end-to-end. tests/ops/test_special_elementwise.py must stay green.
  • Dropping torch.compile graph capture for ALiBi / sinusoidal is acceptable in this PR. If a downstream consumer requires it, file a separate follow-up; do not solve it here with another carrier-style placeholder.

Acceptance Criteria

  • Modified files pass unit tests (pytest tests/test_validate_manifest.py tests/ops/test_special_elementwise.py, all green on CUDA).
  • python scripts/validate_manifest.py exits 0.
  • grep -rn 'device_carrier\|_register_generative_custom_op\|is_generative\|Generative-op carve-out\|generative-op carve-out' tileops/ scripts/ tests/ returns no matches (every trace of the hack is gone).
  • grep -rn 'signature.inputs' scripts/validate_manifest.py tests/test_validate_manifest.py shows no >= 1 invariant; the relaxed outputs >= 1 AND (inputs >= 1 OR params >= 1) is in place.
  • tileops/manifest/elementwise_generative.yaml declares inputs: {} on AlibiFwdOp and SinusoidalFwdOp; dtype is a param; the file preamble has no carve-out narrative.
  • A new test test_signature_inputs_may_be_empty_when_params_present (or equivalent) is added to tests/test_validate_manifest.py and passes.
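For environments where grep is unavailable, or to run the leftover check from a test, the grep criterion above can be approximated in Python. This is a sketch only: find_leftovers is a hypothetical helper that does not exist in the repository, and it assumes the three roots are directories under the current working directory.

```python
import re
from pathlib import Path
from typing import Iterable, List, Tuple

# Same alternatives as the grep pattern in the acceptance criteria.
FORBIDDEN = re.compile(
    r"device_carrier|_register_generative_custom_op|is_generative"
    r"|Generative-op carve-out|generative-op carve-out"
)


def find_leftovers(roots: Iterable[str]) -> List[Tuple[str, int, str]]:
    """Return (path, line_number, line) for every line matching FORBIDDEN."""
    hits = []
    for root in roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue  # missing roots are ignored in this sketch
        for path in sorted(root_path.rglob("*")):
            if not path.is_file():
                continue
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            for number, line in enumerate(text.splitlines(), start=1):
                if FORBIDDEN.search(line):
                    hits.append((str(path), number, line.strip()))
    return hits
```

Calling find_leftovers(["tileops", "scripts", "tests"]) from the repository root should return an empty list once the cleanup is complete.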

Labels

refactor: Code restructuring without behavior change
