Merged
4 changes: 2 additions & 2 deletions docs/en/dev/passes/26-materialize_tensor_strides.md
Original file line number Diff line number Diff line change
@@ -19,7 +19,7 @@ Codegen needs one machine-readable contract, so `MaterializeTensorStrides` walks

**Produces**:

- `TensorViewCanonical` — `PassPipeline` auto-verifies after the pass (using the registry's weak-mode verifier)
- `TensorViewCanonical` — `PassPipeline` auto-verifies after the pass using the registry's **strict-mode** verifier (empty stride on a present `TensorView` is rejected — that is the state this pass is responsible for eliminating)

**Position in the default pipeline** (active since RFC #1300 P6): between [`CanonicalizeIOOrder`](25-canonicalize_io_order.md) and [`InitMemRef`](27-init_memref.md). This is the codegen-prep boundary — every layout-mutating pass (`LowerTransposeLoadParamLayout`, `ResolveBackendOpLayouts`, `ExpandMixedKernel`, `SplitVectorKernel`) has finished, and `InitMemRef` is the first consumer that needs explicit stride.

@@ -106,7 +106,7 @@ See `BuildLogicalStridesFromLayout` in [`tensor_view_semantics.h`](../../../../i

## Verifier interaction

Because the pass declares `produced = {... ∪ TensorViewCanonical}`, `PassPipeline` automatically runs the registry's `TensorViewCanonical` verifier after the pass, surfacing invalid IR (e.g. NZ-on-`TensorType`) immediately as `pypto::ValueError`. The registry default is the **weak-mode** verifier (which accepts `stride.empty()` as implicitly packed canonical); the **strict-mode** verifier, which requires materialization, is reachable directly via `passes.verify_tensor_view_canonical(program, require_materialized=True)` and is the codegen-entry contract that P6/P7 will enforce.
Because the pass declares `produced = {... ∪ TensorViewCanonical}`, `PassPipeline` automatically runs the registry's `TensorViewCanonical` verifier after the pass. The registry default is the **strict-mode** verifier (RFC #1300 §2.4 codegen-entry contract): it rejects `view.has_value() && stride.empty()` since this pass is responsible for materializing those slots. Bare `TensorType` (`!view.has_value()`) is still accepted: implicit ND-packed is canonical by construction. The same verifier is callable directly via `passes.verify_tensor_view_canonical(program, require_materialized=True)`; pass `require_materialized=False` for the weak mode used during the parse-time / early-pass window before materialization runs.
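
The strict/weak distinction described above can be modelled in a few lines. This is a minimal standalone sketch, not PyPTO code: `View`, `Tensor`, and `is_canonical` are hypothetical stand-ins, and only the stride rule stated above (strict mode rejects a present view with empty stride; a bare tensor is accepted in both modes) is taken from the source. The real verifier checks more than this.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class View:
    """Hypothetical stand-in for a tensor view: a stride list plus a layout tag."""
    stride: list = field(default_factory=list)
    layout: str = "ND"


@dataclass
class Tensor:
    """Hypothetical stand-in for a tensor type with an optional view."""
    shape: list
    view: Optional[View] = None


def is_canonical(t: Tensor, require_materialized: bool) -> bool:
    """Model of the stride rule only (assumption: other checks are ignored)."""
    if t.view is None:
        # Bare tensor: implicitly ND-packed, accepted in both modes.
        return True
    if not t.view.stride:
        # Present view with empty stride: accepted only in weak mode.
        return not require_materialized
    # Present view with explicit stride: accepted in both modes.
    return True


bare = Tensor([4, 8])
unmaterialized = Tensor([4, 8], View([], "DN"))
materialized = Tensor([4, 8], View([8, 1], "ND"))
```

Under this model, only `unmaterialized` changes verdict between the two modes, which is why switching the registry default to strict mode affects exactly the state the pass is meant to eliminate.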

## Related

35 changes: 27 additions & 8 deletions docs/en/user/01-language_guide.md
@@ -48,19 +48,38 @@ idx: pl.Scalar[pl.INDEX] # index scalar

### Tensor Layouts

Layouts control the physical memory arrangement of Tensors:
Write your `pl.Tensor[...]` annotations using the **runtime row-major
shape** without a layout marker. Layout is an IR-internal concern that
passes derive from the ops actually producing/consuming views; you do
not need to express it in the type annotation.

| Layout | Description |
| ------ | ----------- |
| `pl.ND` | N-Dimensional (default, row-major) |
| `pl.DN` | DN layout |
| `pl.NZ` | NZ fractal format (hardware-specific tiling) |
```python
# ✅ Recommended — source tensor shape, no layout marker:
b: pl.Tensor[[N, K], pl.FP32]
```

```python
# Specify layout as third type parameter
a: pl.Tensor[[64, 128], pl.FP16, pl.NZ]
# ⚠️ Deprecated (RFC #1300 supplementary 1):
b: pl.Tensor[[K, N], pl.FP32, pl.DN] # → DeprecationWarning at parse time
```

> **Why `pl.Tensor[..., pl.DN]` is deprecated.** Writing the DN
> layout-only shorthand forces you to mentally hold two coordinate systems
> at once (the IR-logical post-view shape and the runtime row-major shape).
> Drop the layout marker and write the runtime shape — for matmul B^T,
> use `pl.load(..., transpose=True)` on the row-major tensor (see "Data
> Movement" below); for slicing a DN-producing op, the slice inherits
> the parent's layout automatically.

For NZ (hardware-specific tile layout), use `pl.Tile[..., pl.NZ]` — NZ is
tile-only, never a TensorType annotation. The `pl.NZ` constant remains
available for tile annotations and IR-internal use.

If you need to write a DN tensor at the IR level (e.g. when constructing
fixtures or round-tripping printed IR), prefer
`pl.TensorView(stride=[...], layout=pl.TensorLayout.DN)` which forces
explicit stride and avoids the implicit coordinate-flip hazard.
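
To make "explicit stride" concrete, here is a small standalone sketch of the arithmetic involved. It is not PyPTO code, and it rests on an assumption the source implies but does not spell out: that a DN view over the last two dimensions behaves like a transposed view of row-major storage. The helper names are hypothetical.

```python
from math import prod


def packed_row_major_strides(shape):
    """Element strides of a densely packed row-major tensor."""
    return [prod(shape[i + 1:]) for i in range(len(shape))]


def transposed_view(shape, strides):
    """Swap the last two dims of a view (assumed here to model an ND -> DN flip)."""
    return (shape[:-2] + [shape[-1], shape[-2]],
            strides[:-2] + [strides[-1], strides[-2]])


def offset(index, strides):
    """Linear element offset of a multi-dimensional index."""
    return sum(i * s for i, s in zip(index, strides))


N, K = 4, 8
src_shape = [N, K]                                  # runtime row-major shape
src_strides = packed_row_major_strides(src_shape)   # [8, 1]
dn_shape, dn_strides = transposed_view(src_shape, src_strides)  # [8, 4], [1, 8]
```

With the stride written out, index `(k, n)` in the flipped view and index `(n, k)` in the source resolve to the same element offset, so there is no second coordinate system left implicit.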

### Dynamic Shapes

Use `pl.dynamic()` for dimensions determined at runtime:
4 changes: 2 additions & 2 deletions docs/zh-cn/dev/passes/26-materialize_tensor_strides.md
@@ -19,7 +19,7 @@ On PyPTO IR, `TensorType.tensor_view_` can currently take one of two equivalent forms:

**Produces**:

- `TensorViewCanonical`: after the pass, `PassPipeline` automatically verifies with the registry's weak-mode verifier
- `TensorViewCanonical`: after the pass, `PassPipeline` automatically verifies with the registry's **strict-mode** verifier (it rejects `view.has_value() && stride.empty()`, which is exactly the state this pass is responsible for eliminating)

**Position in the default pipeline** (active since RFC #1300 P6): between [`CanonicalizeIOOrder`](25-canonicalize_io_order.md) and [`InitMemRef`](27-init_memref.md). This is the codegen-prep boundary: every layout-mutating pass (`LowerTransposeLoadParamLayout` / `ResolveBackendOpLayouts` / `ExpandMixedKernel` / `SplitVectorKernel`) has finished, and `InitMemRef` is the first consumer that depends on explicit stride.

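@@ -106,7 +106,7 @@ In the ND case the formula degenerates to the standard row-major packed stride.

## Verifier interaction

Because the pass declares `produced = {... ∪ TensorViewCanonical}`, `PassPipeline` automatically invokes the registry's `TensorViewCanonical` verifier after the pass completes; invalid IR (e.g. NZ on a `TensorType`) is raised immediately as `pypto::ValueError`. The registry default is the **weak-mode** verifier (which accepts `stride.empty()`); the **strict-mode** verifier is invoked explicitly via `passes.verify_tensor_view_canonical(program, require_materialized=True)` and is the codegen-entry contract that P6/P7 will enable.
Because the pass declares `produced = {... ∪ TensorViewCanonical}`, `PassPipeline` automatically invokes the registry's `TensorViewCanonical` verifier after the pass completes. The registry default is the **strict-mode** verifier (RFC #1300 §2.4 codegen-entry contract): it rejects `view.has_value() && stride.empty()`, because this pass is the one responsible for materializing those strides. Bare `TensorType` (`!view.has_value()`) is still accepted: implicit ND-packed is canonical by construction. The same verifier can also be invoked explicitly via `passes.verify_tensor_view_canonical(program, require_materialized=True)`; pass `require_materialized=False` to switch to weak mode (used in the parse-time / early-pass window before materialization).

## Related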

21 changes: 13 additions & 8 deletions docs/zh-cn/user/01-language_guide.md
@@ -48,19 +48,24 @@ idx: pl.Scalar[pl.INDEX] # index scalar

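### Tensor Layouts (TensorLayout)

Layouts control the physical memory arrangement of Tensors:
Write `pl.Tensor[...]` annotations using the **runtime row-major shape**, without a layout marker. Layout is an IR-internal concept derived from the ops that produce/consume views; it does not need to be expressed in the annotation.

| Layout | Description |
| ------ | ----------- |
| `pl.ND` | N-dimensional (default, row-major) |
| `pl.DN` | DN layout |
| `pl.NZ` | NZ fractal format (hardware-specific tiling) |
```python
# ✅ Recommended: write the source tensor shape, no layout marker:
b: pl.Tensor[[N, K], pl.FP32]
```

```python
# Specify layout as third type parameter
a: pl.Tensor[[64, 128], pl.FP16, pl.NZ]
# ⚠️ Deprecated (RFC #1300 supplementary 1):
b: pl.Tensor[[K, N], pl.FP32, pl.DN] # → DeprecationWarning at parse time
```

> **Why `pl.Tensor[..., pl.DN]` is deprecated.** The layout-only shorthand forces users to hold two coordinate systems in mind at once (the IR-logical post-view shape and the runtime row-major shape), which is exactly the ambiguity RFC #1300 aims to eliminate. Instead, drop the layout marker and write the runtime shape: for the matmul B^T case, use `pl.load(..., transpose=True)` to load the row-major tensor (see "Data Movement" below); a slice taken after a DN-producing op inherits the parent layout automatically.

For NZ (hardware tile layout), write `pl.Tile[..., pl.NZ]`. NZ is tile-only and is not allowed as a TensorType annotation. The `pl.NZ` constant is retained for tile annotations and IR-internal use.

If you need to write a DN tensor at the IR level (e.g. test fixtures or round-tripping printed IR), use `pl.TensorView(stride=[...], layout=pl.TensorLayout.DN)`, which forces an explicit stride and avoids the implicit coordinate-flip hazard.

### Dynamic Shapes

Use `pl.dynamic()` to declare dimensions determined at runtime: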
32 changes: 32 additions & 0 deletions python/pypto/language/parser/type_resolver.py
@@ -10,6 +10,7 @@
"""Type annotation resolution for IR parsing."""

import ast
import warnings
from collections.abc import Callable, Sequence
from typing import TYPE_CHECKING, Any, cast

@@ -441,6 +442,7 @@ def _resolve_subscript_type(self, subscript_node: ast.Subscript) -> ir.Type: #
tensor_view = self._resolve_tensorview(third)
return tensor_ctor(shape, dtype, None, tensor_view)
layout = self.resolve_layout(third)
self._warn_on_user_facing_dn_layout(layout, type_name)
tensor_view = ir.TensorView([], layout)
return tensor_ctor(shape, dtype, None, tensor_view)

@@ -450,6 +452,7 @@ def _resolve_subscript_type(self, subscript_node: ast.Subscript) -> ir.Type: #
tensor_view = self._resolve_tensorview(third)
else:
layout = self.resolve_layout(third)
self._warn_on_user_facing_dn_layout(layout, type_name)
tensor_view = ir.TensorView([], layout)
memref_node = slice_value.elts[3]
if not self._is_memref_node(memref_node):
@@ -986,6 +989,35 @@ def resolve_dtype(self, dtype_node: ast.expr) -> DataType:
hint="Use pl.FP32, pl.INT32, or other supported dtype constants",
)

def _warn_on_user_facing_dn_layout(self, layout: "ir.TensorLayout", type_name: str) -> None:
"""Emit a ``DeprecationWarning`` when the user writes the layout-only DN
shorthand on a tensor type annotation (RFC #1300 supplementary 1).

Suppressed for ``ir.TensorLayout.ND`` (default, no-op marker) and for
explicit ``pl.TensorView(stride=..., layout=DN)`` forms (which carry
their own stride and don't rely on the shorthand's implicit coordinate
flip). Tile-side layouts are never seen here — Tile annotations route
through ``_resolve_tile_annotation_args``.
"""
if layout != ir.TensorLayout.DN:
return
warnings.warn(
f"pl.{type_name}[..., pl.DN] is deprecated (RFC #1300 supplementary 1). "
medium

The warning message uses pl.{type_name} as a prefix. While this is correct for Tensor, DistributedTensor is typically imported from the pypto.language.distributed namespace (often aliased as pld). If a user uses pld.DistributedTensor[..., pl.DN], the warning pl.DistributedTensor[...] might be slightly confusing. Consider making the prefix more generic or detecting the actual namespace if possible, though pl. is a reasonable default for the project.

"Writing the DN layout-only shorthand requires the user to mentally hold "
"two coordinate systems at once (IR-logical post-view vs. runtime "
"row-major), which is exactly the ambiguity RFC #1300 aims to eliminate. "
"Three migration patterns cover every DN scenario without writing pl.DN:\n"
" * source tensor shape, no layout marker: pl.Tensor[[N, K], pl.FP32]\n"
" * derive DN at use site: xt = pl.transpose(x, -2, -1) # ND -> DN\n"
" * inherit DN through slice/reshape from a DN-producing op\n"
"If you must express a strided-DN view (e.g. canonical pretty-print "
"round-trip), use pl.TensorView(stride=[...], layout=pl.TensorLayout.DN) "
"instead — it forces explicit stride and avoids the implicit-coord-flip "
"hazard.",
DeprecationWarning,
stacklevel=4,
)
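
One way to check a deprecation hook like the one above is to capture warnings with the standard library. The following is a minimal sketch under stated assumptions: `warn_on_dn` is a hypothetical stand-in that takes a plain string layout rather than `ir.TensorLayout`, uses a shortened message, and uses `stacklevel=2` instead of the parser's `stacklevel=4`, since there is no parser call chain here.

```python
import warnings


def warn_on_dn(layout: str, type_name: str) -> None:
    """Hypothetical stand-in for the parser hook: warn only for the DN shorthand."""
    if layout != "DN":
        return
    warnings.warn(
        f"pl.{type_name}[..., pl.DN] is deprecated",
        DeprecationWarning,
        stacklevel=2,
    )


def collect_warnings(layout: str, type_name: str) -> list:
    """Run the hook and return the messages of any DeprecationWarning it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # DeprecationWarning is frequently hidden by default filters; force it on.
        warnings.simplefilter("always")
        warn_on_dn(layout, type_name)
    return [str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)]


dn_messages = collect_warnings("DN", "Tensor")
nd_messages = collect_warnings("ND", "Tensor")
```

The DN call yields one recorded message and the ND call yields none, mirroring the early return for non-DN layouts in the method above.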

def resolve_layout(self, layout_node: ast.expr) -> "ir.TensorLayout":
"""Resolve layout annotation to ir.TensorLayout.

14 changes: 10 additions & 4 deletions src/ir/verifier/property_verifier_registry.cpp
@@ -67,11 +67,17 @@ PropertyVerifierRegistry::PropertyVerifierRegistry() {
  Register(IRProperty::InlineFunctionsEliminated, CreateInlineFunctionsEliminatedPropertyVerifier);
  Register(IRProperty::OrchestrationReferencesResolved,
           CreateOrchestrationReferencesResolvedPropertyVerifier);
  // TensorViewCanonical (RFC #1300): the registry returns the weak-mode
  // verifier (stride.empty() accepted as implicitly packed canonical).
  // P3's MaterializeTensorStrides constructs the strict variant directly.
  // TensorViewCanonical (RFC #1300 §2.4): strict mode — every TensorView
  // reaching the codegen-entry boundary must carry explicit stride. The
  // registry default fires immediately after ``MaterializeTensorStrides``
  // (its produced property), turning the "codegen entry has explicit
  // stride" contract from convention into a verified invariant. Bare
  // TensorTypes (``!view.has_value()``) are still accepted as implicitly
  // ND-packed — the check only flags ``view.has_value() && stride.empty()``,
  // which is the state ``MaterializeTensorStrides`` is responsible for
  // eliminating.
  Register(IRProperty::TensorViewCanonical,
           []() { return CreateTensorViewCanonicalPropertyVerifier(/*require_materialized=*/false); });
           []() { return CreateTensorViewCanonicalPropertyVerifier(/*require_materialized=*/true); });
}

void PropertyVerifierRegistry::Register(IRProperty prop, std::function<PropertyVerifierPtr()> factory) {
25 changes: 22 additions & 3 deletions tests/ut/ir/transforms/test_verify_tensor_view_canonical.py
@@ -211,13 +211,32 @@ def test_symbolic_dn_relaxed_passes():
# ============================================================================


def test_registry_returns_weak_verifier():
    """The registry's TensorViewCanonical entry uses weak mode by default —
    so empty stride is accepted (mirrors weak mode of verify_tensor_view_canonical)."""
def test_registry_returns_strict_verifier():
    """The registry's TensorViewCanonical entry uses strict mode (RFC #1300
    §2.4 — codegen-entry contract). MaterializeTensorStrides produces this
    property, so the auto-verify after it enforces explicit stride. Empty
    stride on an explicit TensorView is rejected (the state
    MaterializeTensorStrides is responsible for eliminating)."""
    view = ir.TensorView([], ir.TensorLayout.DN)
    t = ir.TensorType(_shape(4, 8), DataType.FP32, None, view)
    program = _program_with_param_type(t)

    props = _passes.IRPropertySet()
    props.insert(_passes.IRProperty.TensorViewCanonical)
    diags = _passes.PropertyVerifierRegistry.verify(props, program)
    assert len(diags) >= 1
    assert any("stride is empty" in d.message for d in diags), (
        f"expected 'stride is empty' diagnostic, got: {[d.message for d in diags]}"
    )


def test_registry_accepts_bare_tensor_type():
    """Bare TensorTypes (``!view.has_value()``) are implicitly ND-packed and
    accepted by both weak and strict modes — only ``view.has_value() &&
    stride.empty()`` is flagged."""
    t = ir.TensorType(_shape(4, 8), DataType.FP32)
    program = _program_with_param_type(t)

    props = _passes.IRPropertySet()
    props.insert(_passes.IRProperty.TensorViewCanonical)
    diags = _passes.PropertyVerifierRegistry.verify(props, program)