Enhance documentation for tile operations in the ISA#108
Conversation
There was a problem hiding this comment.
Code Review
This pull request significantly enhances the ISA documentation for communication, system, and tile instructions by adding detailed mechanism descriptions, input/output tables, performance models, and exception handling. The review feedback highlights several inconsistencies where the IR syntax examples do not match the C++ intrinsic signatures or the operation descriptions, particularly regarding missing event return values in the SSA and DPS forms. Additionally, corrections are required for platform-specific support details in the TPOP and TPUSH documentation to accurately reflect A5-only features.
zhoubot
left a comment
There was a problem hiding this comment.
Thanks for the documentation expansion. I reviewed the updated PR at af5209df68dc4af4e7b0a35a4214bfd54068fe6f and it is not merge-ready yet.
Findings:
-
docs/isa/system/ops/TFREE.md:41- [P1] TFREE documents a non-existent tile-operand APIThe page now documents
TFREE(PipeT &pipe, TileData &tile, ...), says the tile must come fromTPOP, and showsauto tile = TPOP(pipe); ... TFREE(pipe, tile);. That does not matchinclude/pto/common/pto_instr.hpp, where the public forms areTFREE(Pipe &pipe, ...)and the global-tensor overloads, with no tile handle operand. The backendTFREE_IMPLpaths are also no-op/placeholders in the current source. This would teach users to call an API that does not exist. -
docs/isa/tile/ops/irregular-and-complex/thistogram.md:39- [P1] THISTOGRAM omits the required byte template and index tileThe documented signature is
THISTOGRAM(dst, src, events...), but the actual public API istemplate <HistByte byte, typename TileDataDst, typename TileDataSrc, typename TileDataIdx, ...> THISTOGRAM(dst, src, idx, events...). The page also describessrcas directly producing bins, while the implementation and tests require a separateidxtile andHistByteselection. Please update syntax, inputs, constraints, and examples to include both. -
docs/isa/tile/ops/tile-scalar-and-immediate/taxpy.md:39anddocs/isa/tile/tile-scalar-and-immediate.md:10- [P1] TAXPY is documented as four operands but the API has threeThe PR describes
dst = src0 * scalar + src1and documentsTAXPY(dst, src0, scalar, src1). The source API isTAXPY(dst, src0, scalar, events...), dispatching toTAXPY_IMPL(dst, src0, scalar), and existing tests callTAXPY(dstTile, src0Tile, scalar). Please align the semantics and examples with the actual API or update the implementation separately. -
docs/isa/tile/ops/layout-and-rearrangement/tpack.md:37- [P1] TPACK is documented as declared inpto_instr.hpp, but no public declaration exists thereI could not find
TPACK/TPACK_IMPLininclude/pto/common/pto_instr.hppor backend headers. The page now claims a public intrinsic and gives source/destination packing rules, but the source tree does not expose that instruction contract. Either add the missing API in code or keep the docs constrained to what is actually implemented. -
Formatting gate:
git diff --check origin/main...origin/pr/108fails on many changed files withnew blank line at EOF, includingdocs/isa/tile/ops/irregular-and-complex/tci.md,tgather.md,tpart*.md,tquant.md, multipletfillpad*pages, and mostreduce-and-expandpages. Please remove those extra trailing blank lines.
Verification:
/Users/zhoubot/github/pto-isa/.venv-mkdocs/bin/python -m mkdocs build -f docs/mkdocs/mkdocs.yml --strictpassed on the PR worktree.git diff --check origin/main...origin/pr/108failed as noted above.- Source alignment checked against
include/pto/common/pto_instr.hpp,include/pto/cpu/THistogram.hpp,include/pto/npu/a5/THistogram.hpp, and the related CPU/NPU tests.
I’m not merging this PR until these documentation/source-contract mismatches and formatting issues are fixed.
|
Triage review (2026-05-08): this PR still needs work before it can be merged. GitHub marks it The scope is documentation-only but broad: 46 files across tile op pages and system/comm pages. Please address the requested review, run |
- Added performance metrics, cycle counts, and instruction sequences for `pto.tcolexpanddiv`, `pto.tcolexpandexpdif`, `pto.tcolexpandmax`, `pto.tcolexpandmin`, `pto.tcolexpandmul`, and `pto.tcolexpandsub`. - Updated related operations and instruction set links for better navigation. - Improved clarity on layout and shape impacts for various operations. - Expanded the `pto.taxpy` documentation to include detailed performance analysis, constraints, and examples.
…OP, TPUSH, THISTOGRAM, TPACK, and TAXPY
…F blanks - TFREE: drop nonexistent tile operand; signature is TFREE(pipe, events...) only. - THISTOGRAM: document the required HistByte template parameter and idx tile; signature is THISTOGRAM<byte>(dst, src, idx, events...). Mark A5/CPU-only. - TAXPY: signature is TAXPY(dst, src0, scalar, events...) with dst += src0 * scalar in place; remove fictional src1 operand. - TPACK: flag that no public TPACK intrinsic is declared in include/ today; keep the intended signature explicitly marked as not-yet-public. - Strip trailing blank lines from the doc pages that triggered 'git diff --check' EOF warnings (irregular-and-complex/*, layout-and-rearrangement/*, memory-and-data-movement/*, reduce-and-expand/*).
Per review on 2026-05-08: keep each instruction page aligned with the source
contract rather than carrying speculative cycle models. Every removed section
self-described its numbers as 'first-order estimates' or schematic O(...) order
counts, and none were backed by a measurement source that the page or PR could
cite. Once measured numbers from pto-isa/a2a3_benchmark.csv and
pto-isa/a5_benchmark.csv (or another named source) are available, performance
sections can be reintroduced page-by-page with the source called out.
Affected pages (45):
- comm/TGATHER, comm/TSCATTER
- system/ops/TFREE, TPOP, TPUSH
- tile/ops/elementwise-tile-tile/tcvt
- tile/ops/irregular-and-complex/{tci, tgather, tgatherb, thistogram,
tpartadd, tpartmax, tpartmin, tpartmul, tquant, tscatter, tsort32, ttri}
- tile/ops/layout-and-rearrangement/{textract, tfillpad, tfillpad-expand,
tfillpad-inplace, tinsert, tpack, ttrans}
- tile/ops/memory-and-data-movement/{mgather, mscatter, tload, tprefetch, tstore}
- tile/ops/reduce-and-expand/{tcolexpand*, trowexpand*}
- tile/ops/tile-scalar-and-immediate/taxpy
Verification:
- git diff --check upstream/main HEAD: clean (no whitespace/EOF warnings).
- pre-commit run on all PR-touched docs: end-of-file-fixer and
trailing-whitespace pass.
- python3 scripts/check_markdown_diagrams.py: 0 issues.
- python3 -m mkdocs build -f docs/mkdocs/mkdocs.yml --strict: builds clean.
pto.tcolexpanddiv,pto.tcolexpandexpdif,pto.tcolexpandmax,pto.tcolexpandmin,pto.tcolexpandmul, andpto.tcolexpandsub.pto.taxpydocumentation to include detailed performance analysis, constraints, and examples.