Add unique op #1547
base: main
Conversation
Codecov Report

Attention: Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #1547      +/-   ##
==========================================
- Coverage   77.56%   77.50%   -0.07%
==========================================
  Files         214      216       +2
  Lines       23186    23381     +195
  Branches     3975     4033      +58
==========================================
+ Hits        17984    18121     +137
- Misses       4433     4477      +44
- Partials      769      783      +14
```

☔ View full report in Codecov by Sentry.
Thanks for your contribution! Could you follow the CLA bot's instructions to get that cleared?
```python
except Exception as e:
    # Try to provide a more informative error message.
    if _NOT_IMPLEMENTED_UNIQUE.search(str(e)) is not None:
        raise NotImplementedError(
            f"'onnxruntime' does not yet support Unique(11) operator with dtype={self.dtype}"
        ) from e
```
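For context, `_NOT_IMPLEMENTED_UNIQUE` would be a module-level regex along these lines; the exact pattern here is an assumption, not the PR's actual definition:

```python
import re

# Hypothetical pattern (assumption): matches onnxruntime's
# NOT_IMPLEMENTED failure text for the Unique operator.
_NOT_IMPLEMENTED_UNIQUE = re.compile(r"NOT_IMPLEMENTED.*Unique")
```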
I would remove this try-catch, as the function here is symbolic; we don't expect it to raise any errors.
Addressed in b528a6a
Yeah, I may have jumped the gun a bit. Working on officially getting permission from my employer.
@microsoft-github-policy-service agree company="Radiance Technologies"
```diff
@@ -438,6 +438,34 @@ def _where_input_wrangler(
     return args, kwargs


+def _unique_unsorted_xfail_matcher(
```
@justinchuby I'm not sure what the preferred behavior is here. Should we match `torch.unique` and ignore the `sorted` argument (i.e., always sort in `aten_unique`), or respect the argument and deviate in accordance with this matcher?
I wonder if the argument leads to different behavior on CUDA vs. CPU? I assume `sorted=False` means the output can be sorted but doesn't need to be, and that there is some potential performance gain from turning sorting off. If that's the interpretation, I would keep the argument. Otherwise, ignoring the argument and matching `torch.unique`'s behavior would also be nice.
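To illustrate that interpretation, a quick sketch; with `sorted=False` the output order is implementation-defined, so the second result may or may not come back sorted:

```python
import torch

x = torch.tensor([3, 1, 3, 2, 1])
print(torch.unique(x, sorted=True))   # tensor([1, 2, 3])
# sorted=False only relaxes the ordering guarantee; some backends
# may still return sorted values.
print(torch.unique(x, sorted=False))
```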
I am investigating differences in behavior between CUDA and CPU and have found at least one already: `unique_dim` on CPU ignores the `return_inverse` and `return_counts` arguments, whereas the CUDA implementation does not. How should these differences be handled? Can the op registration be conditioned on the device somehow, or should I favor CUDA over CPU?
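A hedged repro of the reported discrepancy (exact behavior may vary by PyTorch version and build):

```python
import torch

x = torch.tensor([1, 1, 2])
# Signature: unique_dim(self, dim, sorted=True, return_inverse=False, return_counts=False)
cpu_out = torch.ops.aten.unique_dim(x, 0, True, True, True)
print(cpu_out)  # per the comment above, inverse/counts may be ignored on CPU
if torch.cuda.is_available():
    cuda_out = torch.ops.aten.unique_dim(x.cuda(), 0, True, True, True)
    print(cuda_out)  # CUDA honors return_inverse/return_counts
```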
Matching CUDA for now is preferable. Thanks!
Force-pushed 453783f to b528a6a
```python
# HACK: force indices to be in the graph so that it gets a name during optimization;
# otherwise an error will be raised in `onnxscript.Scope.lookup_or_create`.
indices_size = op.Shape(indices)
counts = op.Reshape(counts, indices_size)
```
I want to note that the way this function was written in 1d74d59 is functionally equivalent but yields an error in `onnxscript.Scope.lookup_or_create`: it causes `modified` to be True in `onnxscript.optimizer.optimize`, triggering a second optimization loop that crashes in the first call to `inline_simple_functions`.

This seems indicative of a potential bug to me, but I am not knowledgeable enough about the codebase to suggest a cause or fix.
cc @gramalingam
Thanks for completing the CLA. I will take a look next week.
```python
    result = unique_values, counts
else:
    result = unique_values
return result
```
I think we need to always return the same number of values. Consider returning None when they are not available?
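A sketch of that suggestion, adapted to the snippet above (variable names assumed from the surrounding function):

```python
# Always return a fixed-arity tuple, with None placeholders for
# outputs the caller did not request.
if return_inverse and return_counts:
    result = unique_values, inverse_indices, counts
elif return_inverse:
    result = unique_values, inverse_indices, None
elif return_counts:
    result = unique_values, None, counts
else:
    result = unique_values, None, None
return result
```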
Doing so deviates from the behavior of `torch.unique` and causes this assertion in the unit tests to fail:

```python
assert len(flattened_torch_outputs) == len(flattened_function_outputs)
```

Please advise on how to address this.
Does `torch.ops.aten.unique` exhibit the same behavior? If it always returns three variables, consider creating a new OpInfo for `torch.ops.aten.unique`, similar to onnxscript/tests/function_libs/torch_lib/extra_opinfo.py, lines 2105 to 2114 at a5ed079:

```python
opinfo_core.OpInfo(
    "ops.aten._native_batch_norm_legit.no_stats",
    aten_name="_native_batch_norm_legit.no_stats",
    dtypes=common_dtype.floating_types_and(torch.bfloat16),
    dtypesIfCUDA=common_dtype.floating_types_and(torch.float16, torch.bfloat16),
    supports_forward_ad=True,
    supports_fwgrad_bwgrad=True,
    assert_jit_shape_analysis=True,
    sample_inputs_func=sample_inputs__native_batch_norm_legit_no_stats,
),
```
You may adapt the sample function from https://github.com/pytorch/pytorch/blob/b948b1ad7a9cf61c9692506c60c295fd40e00f43/torch/testing/_internal/common_methods_invocations.py#L3346-L3372
Thanks for the pointer to `extra_opinfo`. It turns out `torch.ops.aten.unique` does not exist, but `torch.ops.aten._unique` does. Added `OpInfo`s for it, `_unique2`, and `unique_dim` in 14d03b5.
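A hypothetical sketch of one of those entries, mirroring the pattern above; the dtype set and the sample-input function name are assumptions, not the PR's actual code:

```python
opinfo_core.OpInfo(
    "ops.aten._unique2",
    aten_name="_unique2",
    dtypes=common_dtype.all_types(),
    sample_inputs_func=sample_inputs__unique,  # hypothetical helper
),
```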
"""unique(Tensor self, bool sorted=True, bool return_inverse=False, bool return_counts=False) -> (Tensor, Tensor, Tensor)""" | ||
|
||
unique_values, indices, inverse_indices, counts = op.Unique(self, axis=None, sorted=sorted) | ||
# HACK: force indices to be in the graph so that it gets a name during optimization |
Is this a bug we should fix elsewhere? (Saw the comment below.)
I think this could be considered a different bug. The other one is a side effect of `onnxscript.optimizer.constant_folding.fold_constants`, whereas this one is a side effect of the function linked below, which converts the names of unused outputs to empty strings but only removes them if they are trailing. Since `inverse_indices` and `counts` are used, an error is raised in `onnxscript.Scope.lookup_or_create` due to the empty-string name given to `indices`.

```python
def remove_unused_optional_outputs(
```
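A paraphrased sketch of the mechanism being described, not the actual onnxscript source:

```python
# Unused optional outputs are renamed to "", but only *trailing* empty
# names are dropped; a non-trailing "" (like `indices` here) survives
# and later breaks Scope.lookup_or_create.
def remove_unused_optional_outputs(node, used_names):
    for i, name in enumerate(node.output):
        if name not in used_names:
            node.output[i] = ""
    while node.output and node.output[-1] == "":
        node.output.pop()
```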
Force-pushed 56c06cf to 7e6d906
```diff
@@ -8380,8 +8380,21 @@ def aten__unique(
 ) -> tuple[TensorType, TensorType]:
     """_unique(Tensor self, bool sorted=True, bool return_inverse=False) -> (Tensor, Tensor)"""

-    unique_values, _, inverse_indices, _ = op.Unique(self, axis=None, sorted=True)
+    unique_values, indices, inverse_indices, _ = op.Unique(self, axis=None, sorted=True)
+    # HACK: force indices to be in the graph so that it gets a name during optimization
```
I suggest removing all hacks; I will go fix what's necessary where the bug is. We are also moving to prefer `trace_only=True` for new functions, so if you can include the flag in `@torch_op`, that would be awesome.
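A minimal sketch of a registration with that flag; the overload name and abbreviated signature here are illustrative, not the PR's final code:

```python
from onnxscript.function_libs.torch_lib.registration import torch_op


@torch_op("aten::_unique", trace_only=True)
def aten__unique(self, sorted: bool = True, return_inverse: bool = False):
    ...
```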
That would be awesome. The hacks are definitely getting out of hand. I'll wait for that fix so that I can continue to test with this locally.
Do you have a short script handy that will reproduce the error?
```python
if __name__ == '__main__':
    import logging
    from typing import Any

    import numpy as np
    import onnx
    import onnxruntime as ort
    import torch

    for i in range(16):
        sorted = bool(i & 1)
        return_inverse = bool(i & 2)
        return_counts = bool(i & 4)
        dim = 0 if i & 8 else None
        print(
            f"Testing sorted={sorted}, return_inverse={return_inverse}, "
            f"return_counts={return_counts}, dim={dim}"
        )

        def test_function(
            x: torch.Tensor,
            s: bool = sorted,
            ri: bool = return_inverse,
            rc: bool = return_counts,
            d: int | None = dim,
        ) -> Any:
            result = torch.unique(
                x, sorted=s, return_inverse=ri, return_counts=rc, dim=d
            )
            return result

        onnx_program = torch.onnx.dynamo_export(
            test_function,
            torch.arange(10),
            export_options=torch.onnx.ExportOptions(
                dynamic_shapes=True,
                diagnostic_options=torch.onnx.DiagnosticOptions(
                    verbosity_level=logging.DEBUG
                ),
            ),
        )
        onnx_program.save("torch_unique.onnx")
        onnx_inputs = onnx_program.adapt_torch_inputs_to_onnx(torch.arange(10))
        onnx_outputs = onnx_program(*onnx_inputs)
        loaded_onnx_program = onnx.load("torch_unique.onnx")
        onnx.checker.check_model(loaded_onnx_program)
        ort_session = ort.InferenceSession("torch_unique.onnx")
        inputs = np.random.randint(0, 10, 10)
        print(f"Inputs: {inputs}")
        outputs = ort_session.run(None, {"l_x_": inputs})
        print(f"Outputs: {outputs}")
        print("Success")
```
Oh, you should also test using the nightly release of PyTorch with the changes in pytorch/pytorch#126561.
Is `trace_only=True` expected to require significant changes to the way one implements an op? It appears that enabling the flag breaks passing a value to `op.ConstantOfShape` and also breaks indexing a shape. For example, `op.ConstantOfShape([0], value=[0])` must become `op.Cast(op.ConstantOfShape([0]), to=INT64.dtype)`, and `output_size[dim]` must become `op.Slice(output_size, [dim], [dim+1])`.
Your observation is correct. This may be because of gaps in our implementation. Bridging those gaps is on our roadmap but is not the highest priority for the team.
Follow-up to #113118 and #124306. Developed in coordination with the solution to microsoft/onnxscript#1547.

This PR adds the missing fake tensor implementation for `aten.unique_dim`, thus enabling tracing and compilation of `torch.unique` when `dim` is not None. Local testing has proceeded with the following simple script (provided that one has checked out the changes in microsoft/onnxscript#1547):

```python
import logging

import numpy as np
import onnx
import onnxruntime as ort
import torch

onnx_program = torch.onnx.dynamo_export(
    lambda x: torch.unique(x, dim=0, return_inverse=True),
    torch.arange(10),
    export_options=torch.onnx.ExportOptions(
        dynamic_shapes=True,
        diagnostic_options=torch.onnx.DiagnosticOptions(
            verbosity_level=logging.DEBUG
        ),
    ),
)
onnx_program.save("torch_unique.onnx")
onnx_inputs = onnx_program.adapt_torch_inputs_to_onnx(torch.arange(10))
onnx_outputs = onnx_program(*onnx_inputs)
loaded_onnx_program = onnx.load("torch_unique.onnx")
onnx.checker.check_model(loaded_onnx_program)
ort_session = ort.InferenceSession("torch_unique.onnx")
inputs = np.random.randint(0, 10, 10)
print(f"Inputs: {inputs}")
outputs = ort_session.run(None, {"l_x_": inputs})
print(f"Outputs: {outputs}")
print("Success")
```

Co-authored-by: Edward Z. Yang <[email protected]>
Pull Request resolved: #126561
Approved by: https://github.com/ezyang
Circling back around to this @justinchuby. At the time, I had been waiting for you to resolve the bug that required a hacky workaround, but I realize that might not be clear. There were also a couple of other potential unresolved bugs outside the scope of this PR, e.g., this comment. How would you like to proceed?
Sorry for the lack of clarity. I would suggest that you remove all the hacks so that the code is in its desirable state. If tests fail because of that, that's OK; I will then go ahead and fix what's needed (after I'm back from vacation).
Sounds good. FYI, the hacks were removed in b8b4cb1. As a reminder, the unit tests will now fail as expected. If I get a chance, I'll try to rebase this PR and resolve conflicts first.
I am implementing a CTC decoder class in PyTorch, but I'm not able to export the class to ONNX. Here is my error - How can I resolve this?
Add support for exporting `torch.unique`, following the conclusion of pytorch/pytorch#113118.