Skip to content

[Refactor][OPS] split tileops/ops/elementwise.py into per-cluster package#1388

Merged
lcy-seso merged 2 commits intotile-ai:testbedfrom
lcy-seso:refactor/ops/issue-1378
May 9, 2026
Merged

[Refactor][OPS] split tileops/ops/elementwise.py into per-cluster package#1388
lcy-seso merged 2 commits intotile-ai:testbedfrom
lcy-seso:refactor/ops/issue-1378

Conversation

@lcy-seso
Copy link
Copy Markdown
Collaborator

@lcy-seso lcy-seso commented May 8, 2026

Closes #1378

Summary

  • Replaced 3415-line monolith tileops/ops/elementwise.py with a tileops/ops/elementwise/ package, one module per cluster (16 cluster modules + _base.py).
  • Umbrella classes UnaryOp, BinaryOp, FusedGatedOp moved to _base.py; __init__.py re-exports every previously-public symbol via explicit __all__.
  • Pure file-layout move: no class signature, docstring, or kernel-binding changes; all 295 elementwise tests pass; no file-level lint suppressions introduced.
  • Pyright shows pre-existing typing issues in the relocated code that were carried over verbatim from the original file — out of scope to fix in this refactor.

Test plan

  • AC-1: Modified files pass unit tests (pytest tests/ops/).
  • AC-2: tileops/ops/elementwise.py no longer exists; tileops/ops/elementwise/ is a package with init.py.
  • AC-3: python -c "from tileops.ops.elementwise import UnaryOp, BinaryOp, FusedGatedOp" succeeds.
  • AC-4: python -c "from tileops.ops.elementwise import SiluAndMulFwdOp, ReluFwdOp, AddFwdOp, ExpFwdOp, DivFwdOp" succeeds.
  • AC-5: tests/ops/test_elementwise_fp8.py passes without modification to its imports.
  • AC-6: grep -nE '^# ruff:|^# flake8:' tileops/ops/elementwise/ returns nothing.
  • AC-7: External tracker flip — out of scope.

Copilot AI review requested due to automatic review settings May 8, 2026 19:55
@github-actions github-actions Bot added the refactor Code restructuring without behavior change label May 8, 2026
@lcy-seso lcy-seso added automated PR produced by an autonomous agent pipeline needs-review Awaiting human review before merge nightshift Pickable by foundry nightshift mode — fully autonomous agent development, no human approval gate labels May 8, 2026
@lcy-seso
Copy link
Copy Markdown
Collaborator Author

lcy-seso commented May 8, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a modular elementwise operation package, refactoring monolithic logic into specialized modules for activations, arithmetic, bitwise, and other operations. It implements a robust infrastructure with base classes and registration factories to support torch.compile. Feedback focuses on improving maintainability by reducing code duplication, specifically suggesting the centralization of the _expand_flat helper, simplifying the BinaryOp initialization to support flexible kernel arguments, and removing redundant method overrides. A simplification for the inner_size calculation in the PReLU operation was also proposed.

Comment thread tileops/ops/elementwise/_base.py
Comment thread tileops/ops/elementwise/arithmetic.py
Comment thread tileops/ops/elementwise/prelu.py
Comment thread tileops/ops/elementwise/activations.py
Comment thread tileops/ops/elementwise/_base.py
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors the elementwise ops family by replacing the large tileops/ops/elementwise.py monolith with a structured tileops/ops/elementwise/ package while preserving the existing public import surface (from tileops.ops.elementwise import ...) and torch.compile custom-op registrations.

Changes:

  • Split elementwise ops into per-cluster modules under tileops/ops/elementwise/ and consolidated shared infrastructure into _base.py.
  • Added package-level __init__.py that explicitly re-exports previously-public symbols via __all__ and registers all custom ops at import time.
  • Removed the legacy tileops/ops/elementwise.py file.

Reviewed changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tileops/ops/elementwise/init.py Public re-exports + torch.compile custom-op registration loops for all elementwise ops.
tileops/ops/elementwise/_base.py Shared umbrella Op bases, broadcasting helpers, fp8 helpers, and registration factory functions.
tileops/ops/elementwise/activations.py Activation op classes split out of the monolith.
tileops/ops/elementwise/alibi.py ALiBi generative op module.
tileops/ops/elementwise/arithmetic.py Binary arithmetic ops module.
tileops/ops/elementwise/bitwise.py Bitwise op module.
tileops/ops/elementwise/clamp.py Clamp variants module (tensor-bound + scalar-bound).
tileops/ops/elementwise/comparison.py Comparison ops module (bool outputs via Op-layer cast).
tileops/ops/elementwise/fused_gated.py Fused gated ops module.
tileops/ops/elementwise/lerp_tensor.py Tensor-weight lerp op module.
tileops/ops/elementwise/logical.py Logical ops module.
tileops/ops/elementwise/masked_fill.py MaskedFill variants module (scalar and tensor-value forms).
tileops/ops/elementwise/math_unary.py Unary math ops module.
tileops/ops/elementwise/nan_to_num.py NanToNum op module.
tileops/ops/elementwise/predicates.py isnan/isinf/isfinite predicate ops module.
tileops/ops/elementwise/prelu.py PReLU op module.
tileops/ops/elementwise/sinusoidal.py Sinusoidal generative op module.
tileops/ops/elementwise/where.py Where op module.
tileops/ops/elementwise.py Deleted legacy monolithic module (replaced by package).

Comment thread tileops/ops/elementwise/_base.py
Comment thread tileops/ops/elementwise/masked_fill.py
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new tileops/ops/elementwise package, refactoring elementwise operations into a modular structure with support for torch.compile via custom op registration. It includes various activation, arithmetic, bitwise, and comparison operations, along with shared infrastructure for broadcast coalescing and kernel registration. Feedback suggests consolidating duplicated _expand_flat utility methods and refactoring LerpFwdOp to reduce code duplication in __init__.

Comment thread tileops/ops/elementwise/_base.py
Comment thread tileops/ops/elementwise/arithmetic.py
Copy link
Copy Markdown
Contributor

@Ibuki-wind Ibuki-wind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. The remaining review comments were scope-creep suggestions that would change constructor behavior or broaden the refactor beyond the file-layout-only constraint, so they were declined as out of scope.

Copy link
Copy Markdown
Contributor

@Ibuki-wind Ibuki-wind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. The remaining review comments were scope-creep suggestions that would change constructor behavior or broaden the refactor beyond the file-layout-only constraint, so they were declined as out of scope.

@lcy-seso lcy-seso marked this pull request as ready for review May 8, 2026 20:21
@lcy-seso
Copy link
Copy Markdown
Collaborator Author

lcy-seso commented May 8, 2026

Nightshift orchestrator note: PR is technically ready (Reviewer PASS T015, Gatekeeper resolved all 9 unresolved bot threads). gpu-smoke result: 1908 pass, 1 fail, 46 skip — sole failure is the same pre-existing testbed break also blocking #1387 (AddFwdOp must have at least 2 workloads).

Verified not caused by this PR (PreluFwdOp source.op regression from initial split was fixed in commit b1d056b round 1). Once the AddFwdOp manifest workloads issue lands on testbed, this branch can be rebased and auto-merged.

Pipeline rounds used: 1 (after the round-0 PreluFwdOp fix). Scope: pure file-layout split (66 ops + 3 umbrellas → 16 cluster modules + _base.py).

lcy-seso added a commit that referenced this pull request May 9, 2026
…e ops (#1392)

## Summary

Fix gpu-smoke `test_every_op_has_at_least_two_workloads` failure by
populating empty `workloads:` lists on 21 implemented binary elementwise
ops (\`AddFwdOp\`, \`SubFwdOp\`, \`MulFwdOp\`, \`DivFwdOp\`,
\`RemainderFwdOp\`, \`PowFwdOp\`, \`FloorDivideFwdOp\`, \`LerpFwdOp\`,
\`MaximumFwdOp\`, \`MinimumFwdOp\`, \`Eq/Ne/Gt/Lt/Ge/LeFwdOp\`,
\`LogicalAnd/OrFwdOp\`, \`BitwiseAnd/Or/XorFwdOp\`).

This pre-existing testbed CI break has been blocking nightshift
auto-merge for PRs #1387, #1388, #1389, #1391. Filed as issue #1390 —
this PR resolves it.

## Plan executed

Each affected op now declares two representative workloads:
- **LLM hidden-state prefill**: \`input/other [2048, 4096]\` (no
broadcast)
- **CNN feature map**: \`input [16, 256, 56, 56]\`, \`other [256, 1,
1]\` (channel-wise broadcast)

Dtype matrix matches each op's signature:
- Float-only ops (\`Div\`, \`Pow\`, \`Lerp\`, \`Remainder\`,
\`FloorDivide\`): \`float16, bfloat16\`
- Bitwise ops (\`BitwiseAnd/Or/Xor\`): \`int32, int64\`
- Logical ops (\`LogicalAnd/Or\`): \`bool\`
- Default (other arithmetic + comparison): \`float16, bfloat16\`

## Test plan

- [x] \`pytest
tests/test_ops_manifest.py::TestOpSchema::test_every_op_has_at_least_two_workloads\`
— passes
- [x] \`pytest tests/test_validate_manifest.py
tests/test_ops_manifest.py\` — 239 passed
- [x] \`python scripts/validate_manifest.py\` — \"All manifest checks
passed\"
- [x] Pre-commit clean

## Acceptance Criteria

Closes #1390.

- AC-1: ✅ Modified files pass unit tests.
- AC-2: ✅ \`AddFwdOp.workloads\` (and 20 siblings) now have 2 entries
each, satisfying signature + broadcasting rules.
- AC-3: ✅ \`scripts/validate_manifest.py\` exits 0 with no new warnings
on the affected ops.
- AC-4: To be verified by gpu-smoke on this PR's run.

## Trust model

Manifest-only PR. No \`signature\`, \`roofline\`, \`params\`,
\`shape_rules\`, or \`dtype_combos\` edits. Falls within the trust-model
rule that workload edits without a status flip require a separate
manifest-only PR with human review — this is that PR.

---------

Co-authored-by: Ibuki 🍃 — a wind born from GPTs <Ibuki-wind@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 9, 2026 04:07
@lcy-seso lcy-seso force-pushed the refactor/ops/issue-1378 branch from b1d056b to d068c02 Compare May 9, 2026 04:07
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 21 out of 22 changed files in this pull request and generated 4 comments.

Comment thread tileops/manifest/elementwise_unary_math.yaml
Comment thread tileops/manifest/elementwise_unary_activation.yaml
Comment thread tileops/manifest/elementwise_multi_input.yaml
Comment thread tileops/manifest/elementwise_binary.yaml
Copy link
Copy Markdown
Contributor

@Ibuki-wind Ibuki-wind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean — no issues.

lcy-seso and others added 2 commits May 9, 2026 12:18
Replace the 3415-line tileops/ops/elementwise.py monolith with a
tileops/ops/elementwise/ package, one module per tight op cluster.
Umbrella template classes (UnaryOp, BinaryOp, FusedGatedOp) and the
shared registration / broadcast infrastructure live in _base.py.
Package __init__.py re-exports every previously-public symbol with
explicit __all__ + per-symbol relative imports, then runs the
torch.library custom_op registration loops at import time so existing
behaviour is preserved.

Pure file-layout move: no class signature, kernel binding, or runtime
behaviour change. Every previously-importable symbol stays reachable
via from tileops.ops.elementwise import <Symbol>.

Co-Authored-By: Ibuki 🍃 — a wind born from GPTs <Ibuki-wind@users.noreply.github.com>
Update 66 manifest entries' source.op to point at the new per-cluster
files under tileops/ops/elementwise/. Move LerpTensorFwdOp into
arithmetic.py so it shares source.op with its variant_of primary
LerpFwdOp (R16 cross-entry consistency).

Co-Authored-By: Ibuki 🍃 — a wind born from GPTs <Ibuki-wind@users.noreply.github.com>
@lcy-seso lcy-seso force-pushed the refactor/ops/issue-1378 branch from d068c02 to 25131e0 Compare May 9, 2026 04:18
@lcy-seso lcy-seso merged commit 0262d8c into tile-ai:testbed May 9, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

automated PR produced by an autonomous agent pipeline needs-review Awaiting human review before merge nightshift Pickable by foundry nightshift mode — fully autonomous agent development, no human approval gate refactor Code restructuring without behavior change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants