Fix torch.split fails in to_edge with alias annotations#18700
GregoryComer merged 3 commits into pytorch:main
Fixes pytorch#11723
`_remove_invalid_ops_for_not_decompose` relied on torchgen's `aliased_return_names()` to detect ops with aliased returns, but it returns `[None]` for ops returning lists of aliased tensors (e.g., `split.Tensor` returns `Tensor(a)[]`). This let `split.Tensor` through into the `EDGE_DO_NOT_DECOMP` namespace, where functionalization failed.
Add a fallback check using `op._schema.returns` directly, which correctly reports `alias_info` on list return types. This also fixes the same latent issue for `chunk` and `tensor_split`.
Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
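The detection gap described in this commit message can be illustrated with a small, torch-free sketch. It is not the code from this PR: the helper names `split_schema` and `returns_are_aliased` are hypothetical, and the schema strings are written out by hand for illustration. The idea is only to show that an alias annotation on a list return such as `Tensor(a)[]` is visible when the return type itself is inspected, which is what the fallback on `op._schema.returns` relies on.

```python
import re
from typing import Tuple

# Matches an alias annotation on a tensor type, e.g. Tensor(a), Tensor(a!),
# or Tensor(a -> *). Illustrative pattern only, not torchgen's parser.
_ALIAS = re.compile(r"Tensor\([a-z]!?[^)]*\)")


def split_schema(schema: str) -> Tuple[str, str]:
    """Split 'name(args) -> returns' at the arrow outside all parentheses."""
    depth = 0
    for i, ch in enumerate(schema):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif depth == 0 and schema.startswith("->", i):
            return schema[:i].strip(), schema[i + 2:].strip()
    raise ValueError(f"no top-level '->' in schema: {schema!r}")


def returns_are_aliased(schema: str) -> bool:
    """True if any return type carries an alias annotation, lists included."""
    _, returns = split_schema(schema)
    return _ALIAS.search(returns) is not None


# Hand-written schema strings (assumed for this sketch).
SPLIT = "split.Tensor(Tensor(a -> *) self, SymInt split_size, int dim=0) -> Tensor(a)[]"
ADD = "add.Tensor(Tensor self, Tensor other, *, Scalar alpha=1) -> Tensor"

# An op whose returns alias an input should not be kept from decomposition.
preservable = [s for s in (SPLIT, ADD) if not returns_are_aliased(s)]
```

With these inputs, `preservable` keeps only the `add.Tensor` schema, because the `Tensor(a)[]` return of `split.Tensor` is flagged as aliased. Scanning for the arrow at parenthesis depth zero avoids confusing the `a -> *` annotation inside the argument list with the return separator.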
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18700
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (2 Unrelated Failures), 5 Unclassified Failures
As of commit c0191a9 with merge base 6020c29:
UNCLASSIFIED FAILURES - DrCI could not classify the following jobs because the workflow did not run on the merge base. The failures may be pre-existing on trunk or introduced by this PR:
BROKEN TRUNK - The following jobs failed but were present on the merge base: 👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
- Change 'may fail' to 'does not detect' (torchgen structurally cannot handle ListType alias annotations)
- Add split_with_sizes.default to test to document overlap with blocklist
Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
@pytorchbot label "release notes: exir"
@JacobSzwejbka Do you know what the intent is for functionalization with ops like split or chunk that return a list of aliased tensors? Do we expect them to be functionalized? If they are not, this seems like it may be a bug in functionalization.
It should be converted to split_copy or decomposed further. The point of this change seems to be that delegates request that split not be decomposed, and then the verifier complains that aliasing is in the graph. We should figure out how to let backends preserve aliasing ops (so they don't have to defunctionalize them on their own).
@GregoryComer Functionalization happens at the same time as decomposition. The changes in this diff run before that code.
GregoryComer left a comment
Thanks. I think we can merge this as is. We can solve the larger question of preserving aliasing ops separately.
@Lidang-Jiang Can you resolve the lint failure? You can just add a suppression for the function-too-complex warning. We can go ahead and merge after that. Thanks!
@GregoryComer Good catch, fixed in c0191a9. I added a function-scoped
Thanks. I re-triggered CI. There is a temporary code freeze at the moment, but I should be able to merge in the next few days.
Thanks for rerunning CI and for the update.
Fixes #11723
Summary
`torch.split` fails with `RuntimeError: Found a custom (non-ATen) operator whose output has alias annotations` when used with `to_edge_transform_and_lower` and a partitioner that requests op preservation.
Root cause: `_remove_invalid_ops_for_not_decompose` relies on `torchgen`'s `aliased_return_names()` to detect ops with aliased returns. However, for ops returning lists of aliased tensors (e.g., `split.Tensor` returns `Tensor(a)[]`), `aliased_return_names()` returns `[None]`, failing to detect the alias annotation. This lets `split.Tensor` pass through into the `EDGE_DO_NOT_DECOMP` namespace, where functionalization fails.
Fix: Add a fallback check using `op._schema.returns` directly, which correctly reports `alias_info` on list return types. This also fixes the same latent issue for `chunk.default` and `tensor_split.sections`.
Test plan
- `test_remove_invalid_ops_filters_aliased_list_returns` regression test
- `pytest exir/tests/test_passes.py::TestPasses::test_remove_invalid_ops_filters_aliased_list_returns -xvs`
- `test_to_out_variant_singleon_tensor_list`
- `test_compile_fix_broken_ops`
Before fix
After fix
Unit test output
This PR was authored with the assistance of Claude.