[AscendNPU-IR][A5] surpport for cumsum interleave and deinterleave#905
[AscendNPU-IR][A5] surpport for cumsum interleave and deinterleave#905james1314520 wants to merge 2 commits intotile-ai:npuir-devfrom
Conversation
|
👋 Hi! Thank you for contributing to the TileLang project. Please remember to run We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀 |
There was a problem hiding this comment.
Code Review
This pull request migrates the Cumsum, Interleave, and Deinterleave operations from the hivm dialect to the hfusion dialect in the NPU codegen and adds associated unit tests. However, the implementation of VinterleaveCodegen and VdeinterleaveCodegen contains critical issues: they use memref types where tensor types are expected and incorrectly overwrite destination buffers, leading to potential data loss when dealing with subregions. Additionally, the DeinterleaveOp is missing the necessary channel_nums attribute.
| Value dst = GenSubviewFromRegion(npuirop.dst, npuirop.dst_range); | ||
| builder.create<mlir::hivm::VInterleaveOp>( | ||
| builder.getUnknownLoc(), TypeRange{}, srcs_vr, dst, | ||
| static_cast<int64_t>(npuirop.channel_nums)); | ||
| mlir::Type outputType = dst.getType(); | ||
|
|
||
| auto interleaveOp = builder.create<mlir::hfusion::InterleaveOp>( | ||
| builder.getUnknownLoc(), | ||
| outputType, | ||
| srcs_vr); | ||
| SetVarValue(npuirop.dst, interleaveOp->getResult(0)); |
There was a problem hiding this comment.
The current implementation of VinterleaveCodegen has several issues:
- Type Mismatch: It uses
GenSubviewFromRegion, which returns amemref, buthfusionoperations and theSetVarValuemapping system in this codegen are designed fortensors. You should useGenExtractSliceFromRegioninstead. - Partial Update Bug: By calling
SetVarValuedirectly with the result of the interleave operation, you are overwriting the entire destination buffer with a tensor that only represents the subregion (ifdst_rangeis not the full buffer). This will cause data loss for the rest of the buffer. You must useReshapeCastAndInsertSliceto correctly update the destination tensor. - Source Types: Note that the
srcsvector (populated in the context above) also containsmemrefsfromGenSubviewFromRegion. These should ideally be converted to tensors as well for compatibility withhfusionops.
mlir::Value dst_tensor = GenExtractSliceFromRegion(npuirop.dst, npuirop.dst_range);
auto interleaveOp = builder.create<mlir::hfusion::InterleaveOp>(
builder.getUnknownLoc(),
dst_tensor.getType(),
srcs_vr);
mlir::Value result = ReshapeCastAndInsertSlice(interleaveOp->getResult(0), GetVarValue(npuirop.dst), npuirop.dst_range);
SetVarValue(npuirop.dst, result);There was a problem hiding this comment.
These are important to address.
| Value dst = GenSubviewFromRegion(npuirop.dsts[i], npuirop.dsts_range[i]); | ||
| dsts.push_back(dst); | ||
| } | ||
| mlir::ValueRange dsts_vr(dsts); | ||
| auto channel_nums = mlir::IntegerAttr::get( | ||
| builder.getI64Type(), static_cast<int64_t>(npuirop.channel_nums)); | ||
| mlir::hivm::DeinterleaveModeAttr index_mode = | ||
| mlir::hivm::DeinterleaveModeAttr::get( | ||
| &context, NPUIR_STR_DEINTERLEAVEMODE[npuirop.index_mode]); | ||
| builder.create<mlir::hivm::VDeinterleaveOp>(builder.getUnknownLoc(), | ||
| TypeRange{}, src, dsts_vr, | ||
| channel_nums, index_mode); | ||
|
|
||
| int64_t current_channel_idx = i; | ||
| auto channelIdxAttr = builder.getI64IntegerAttr(current_channel_idx); | ||
|
|
||
| auto deinterleaveOp = builder.create<hfusion::DeinterleaveOp>( | ||
| builder.getUnknownLoc(), | ||
| dst.getType(), | ||
| src, | ||
| channelIdxAttr | ||
| ); | ||
| SetVarValue(npuirop.dsts[i], deinterleaveOp->getResult(0)); |
There was a problem hiding this comment.
The VdeinterleaveCodegen implementation contains critical issues similar to VinterleaveCodegen:
- Data Loss:
SetVarValueoverwrites the full destination buffer with a subregion-sized tensor. UseReshapeCastAndInsertSliceto perform a propertensor.insert_sliceinto the destination. - Type Mismatch: Use
GenExtractSliceFromRegionto obtain atensorinstead of amemreffor the destination type. - Missing Attribute: The
DeinterleaveOpappears to be missing the total number of channels (npuirop.channel_nums). Without this, the operation cannot determine the correct stride for deinterleaving. - Namespace Consistency: Use
mlir::hfusion::DeinterleaveOpto match the style used in other parts of the file.
mlir::Value dst_tensor = GenExtractSliceFromRegion(npuirop.dsts[i], npuirop.dsts_range[i]);
int64_t current_channel_idx = i;
auto channelIdxAttr = builder.getI64IntegerAttr(current_channel_idx);
auto channelNumsAttr = builder.getI64IntegerAttr(npuirop.channel_nums);
auto deinterleaveOp = builder.create<mlir::hfusion::DeinterleaveOp>(
builder.getUnknownLoc(),
dst_tensor.getType(),
src,
channelIdxAttr,
channelNumsAttr
);
mlir::Value result = ReshapeCastAndInsertSlice(deinterleaveOp->getResult(0), GetVarValue(npuirop.dsts[i]), npuirop.dsts_range[i]);
SetVarValue(npuirop.dsts[i], result);894f5e8 to
a03693c
Compare
support for cumsum interleave and deinterleave