Checks for inconsistent device types #272
Conversation
Force-pushed from 50125ff to f02e334.
Hey @jwallwork23, I just rebased this off of current main and then modified an example to check that, if we try to add two tensors with different device types, an error is raised as expected. I used the code below:
```fortran
real(wp), dimension(5), target :: a, b, combine_data
type(torch_tensor) :: tensor_a, tensor_b, combine_tensor

a = [0.0_wp, 1.0_wp, 2.0_wp, 3.0_wp, 4.0_wp]
b = [0.0_wp, 1.0_wp, 2.0_wp, 3.0_wp, 4.0_wp]
call torch_tensor_from_array(tensor_a, a, tensor_layout, torch_kCPU)
call torch_tensor_from_array(tensor_b, b, tensor_layout, torch_kMPS)

combine_tensor = tensor_a + tensor_b
call torch_tensor_print(combine_tensor)
```
When tensors `a` and `b` are both on the CPU this passes fine, but in the above state (`b` on MPS) we get:

```
Error :: cannot add tensors with different device types
Note: The following floating-point exceptions are signalling: IEEE_OVERFLOW_FLAG
STOP 1
```
Since we looked at this together in person, I am happy to approve this now.
For posterity, we also discussed that this error message could be made friendlier (parent routine, line number, etc.), but that is not a simple task and is not to be done here.
I don't think it currently supports this - as far as I can tell, it can only mark certain tests as expected to fail, not interrogate error messages.
Force-pushed from f02e334 to 4cfa4ab.
[Rebased on top of main, fixing various conflicts]
* Drop torch_tensor_to_<array/blob>
* Remove torch_tensor_to_array from autograd example
* Use std::move for assignment
* Update constructors/destructors tests to avoid to_array
* Update summary of autograd example
* Update overloads tests to avoid to_array
* Fix previous API change note
I think I hadn't re-approved this because of the issues I had seen when trying to print a tensor. Did we get to the bottom of that? I know it was mentioned in the meeting this morning. Looking at the simplenet that failed that I uploaded, I wonder if it was to do with a lack of importing the overloaded assignment.
OK, got nerd sniped. The segfault was resolved by ensuring the overloaded assignment is imported, so that is good news. The bad news, I guess, is that this can be a tricksy bug/error to spot/resolve, as there isn't much to go on from the FTorch output.
Ah yes! Good work tracking it down. Opened #305 to address this.
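For context, here is a speculative, self-contained sketch of the failure mode discussed above. It assumes the problem was the overloaded `assignment(=)` for `torch_tensor` being left out of an `only` list; the mechanism described in the comments is an assumption for illustration, not a confirmed FTorch diagnosis.

```fortran
! Speculative sketch only: assumes the segfault came from the overloaded
! assignment not being imported. Names mirror the snippet earlier in this
! thread; wp and tensor_layout are declared here for self-containment.
program missing_assignment_sketch
  ! assignment(=) is deliberately missing from the only-list below
  use ftorch, only: torch_tensor, torch_kCPU, torch_tensor_from_array, &
                    torch_tensor_print, operator(+)
  use, intrinsic :: iso_fortran_env, only: wp => real32
  implicit none

  integer, parameter :: tensor_layout(1) = [1]
  real(wp), dimension(5), target :: a, b
  type(torch_tensor) :: tensor_a, tensor_b, combine_tensor

  a = [0.0_wp, 1.0_wp, 2.0_wp, 3.0_wp, 4.0_wp]
  b = [0.0_wp, 1.0_wp, 2.0_wp, 3.0_wp, 4.0_wp]
  call torch_tensor_from_array(tensor_a, a, tensor_layout, torch_kCPU)
  call torch_tensor_from_array(tensor_b, b, tensor_layout, torch_kCPU)

  ! With assignment(=) not in scope, the line below falls back to intrinsic
  ! derived-type assignment (a shallow copy of the tensor handle), so the
  ! result may end up referring to memory that has already been released...
  combine_tensor = tensor_a + tensor_b
  call torch_tensor_print(combine_tensor)   ! ...and this can then segfault
end program missing_assignment_sketch
```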
Closes #269.
This PR adds checks that overloaded operators are only applied to tensors that have the same device type.
It's sufficient to add the checks in the assignment operator and the binary operators, as the unary operators will go through the assignment anyway.
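To illustrate the kind of check this describes, here is a minimal self-contained sketch; `mock_tensor` and `mock_add` are stand-ins invented for this example and do not reflect FTorch's actual internals.

```fortran
! Minimal sketch of the device-consistency and construction checks.
! mock_tensor and mock_add are illustrative stand-ins, not FTorch code.
module device_check_sketch
  implicit none
  private
  public :: mock_tensor, mock_add

  type :: mock_tensor
    integer :: device_type = -1   ! -1 means "not yet constructed"
  end type mock_tensor

contains

  function mock_add(a, b) result(output)
    type(mock_tensor), intent(in) :: a, b
    type(mock_tensor) :: output

    ! Guard against interrogating tensors that were never constructed
    if (a%device_type < 0 .or. b%device_type < 0) then
      write(*, '(a)') "Error :: tensor has not been constructed"
      stop 1
    end if

    ! Binary operators require both operands to share a device type
    if (a%device_type /= b%device_type) then
      write(*, '(a)') "Error :: cannot add tensors with different device types"
      stop 1
    end if

    output%device_type = a%device_type
    ! ... a real operator would delegate the addition to libtorch here ...
  end function mock_add

end module device_check_sketch
```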
The PR also adds checks that tensors have been constructed before they are interrogated. Related fix in the autograd example: the output tensor `Q` wasn't set up, so it didn't have a device type yet.

Questions: