Fix subview aliasing for MTE2 sync insertion#669
Conversation
There was a problem hiding this comment.
Code Review
This pull request enhances the memory alias analysis by introducing StaticMemRegion to track buffer offsets, sizes, and strides. This allows the InsertSync pass to identify disjoint subviews of the same root buffer, thereby eliminating unnecessary synchronization barriers. The implementation includes the extraction of region data from MLIR operations, integration into the MemoryDependentAnalyzer, and expanded debug tracing. Feedback points out that the logic for composing nested subviews is currently flawed, as it fails to correctly incorporate parent strides into offset calculations and incorrectly propagates strides and effective region sizes.
| for (size_t i = 0; i < region.offsets.size(); ++i) | ||
| region.offsets[i] += parentRegion->offsets[i]; |
There was a problem hiding this comment.
The composition of subview offsets does not account for the strides of the parent region. When nesting subviews, the absolute offset in the root coordinate system should be calculated as parent_offset + child_offset * parent_stride. The current implementation assumes unit strides, which may lead to incorrect aliasing results for strided subviews.
| for (size_t i = 0; i < region.offsets.size(); ++i) | |
| region.offsets[i] += parentRegion->offsets[i]; | |
| for (size_t i = 0; i < region.offsets.size(); ++i) | |
| region.offsets[i] = parentRegion->offsets[i] + region.offsets[i] * parentRegion->strides[i]; |
| if (parentRegion->strides.size() == region.strides.size()) | ||
| region.strides = parentRegion->strides; |
There was a problem hiding this comment.
Overwriting the subview's strides with the parent's strides is incorrect if the subview itself introduces a non-unit stride (e.g., via staticStrides in memref.subview). The resulting stride should be the product of the parent's stride and the subview's relative stride. Additionally, the effective size of the region in the root coordinate system should be adjusted to (child_size - 1) * parent_stride + 1 to correctly represent the bounding box in the root buffer when strides are involved.
Codex Review该评论由 review 机器人自动更新。
SummaryReview failed at stage Findings未生成结构化 findings,因为 review 过程提前失败。 Log Tail |
2af1b38 to
db01401
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: db01401631
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| auto srcType = dyn_cast<MemRefType>(src.getType()); | ||
| if (!srcType) return {0, 0}; | ||
| if (!srcType) | ||
| return aliasInfo; |
There was a problem hiding this comment.
Handle PTO subview offsets before returning on non-memrefs
For pto.subview/PartitionViewOp the source is a TileBufType, so this early MemRefType check returns the default alias info before any offsets or sizes are computed. Because UpdateAliasBufferInfo still records every PTO subview with the parent's base/size, disjoint slices of the same tile all appear to overlap and same-pipe MTE2 barriers are inserted; the new issue667_disjoint_subview_tloads case exercises exactly this path, so the intended PTO subview alias fix is not actually applied to PTO tile subviews.
Useful? React with 👍 / 👎.
No description provided.