-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding order
argument to asarray
#571
Comments
I think we purposely leave strides (and thus order) out of the standard because it's considered too low-level. However this information is totally available through DLPack, I would think this should meet your need? |
For my particular needs I have a workaround already, in general I would think that people that really need this can think of workarounds in some way, but at the cost of some extra lines of code that are going to be specific to one array library. Having |
Sorry but I disagree. If you check the strides obtained through DLPack (see, for example, mpi4py) it's generic to Currently Array API is designed to not standardizing over strides (and thus order) so that people do not have to argue over the execution model. For example, it allows the two extremes, PyTorch (which is always C-order unless you tweak it very hard) and Dask (which has no notion of single-node strides, as it's distributed), to implement Array API support. |
No need to be I by no mean want to imply that the perspective I offer on the matter goes beyond the scope of the limited project I've started working on, that happens to be somewhat related to the Array API, which made me think this proposal could be relevant. If it resonates with other people experience, that would be interesting ! if it's not the case, let it be. Maybe the issue can remain opened for some time to see if there's other feedback ? And as you say maybe I don't need strides at all myself. After all, all arrays can be seen as C-contiguous, it's a matter of defining carefully row and column indexes 🤷 . But then it's an abstraction that must be implemented at the level of my own user code if I really want to and it seems very verbose. In the end I don't think the dlpack interface answers what I'm looking for specifically. Unless I'm missing something, the |
This has been brought up a couple of times before, although I cannot locate the most relevant discussion right now. The problem is mainly that the array API standard wants to only standardize things that are implemented or can be implemented in all array libraries. This is not the case for On top of that I'd say that it's not good UX. The problem with NumPy is that, due to the lack of more advanced facilities for automatically optimizing function calls for performance, it exposed a host of things that really should be implementation details (like views and C/Fortran memory order) directly to end users. For
This is very much true. But it's an implementation detail; how to get optimized code internally within a library should be a separate question from what API is exposed to end users. |
Thanks for your in-depth explanations @rgommers @leofang it makes the choice clear for me.
Couldn't find previous discussions either, this issue now documents the answer :-)
Yes that sounds like the right thing to do. (I hope array libraries that want to expose additional parameters can use the same semantics for similar concepts) |
I hope so too. AFAIK that's the case here, libraries that have the concept follow NumPy in API design I believe. |
Yeah IIRC I raised the discussion on strides during the meetings at least twice, and others did too. I had the same idea in mind as @fcharras did that it'd allow better interoperability with 3rd party (non-array/tensor) libraries. But in the end people (the stakeholders) are happy to settle on this choice for reasons Ralf and I mentioned above 🙂 |
A usecase I'm confronted to is to ensure that a given array has either a C or F memory layout.
The layout can be important when optimizing memory access patterns in lower level code. For instance in
sklearn
KMeans
requiresC
order. Typicallyxp.asarray(X, order=order)
is used to check that the input order matches the required order (ifcopy=False
), or will trigger a copy if necessary to get the expected order (ifcopy!=False
), before passing the array to lower level code (e.g cython or jitted functions).Here's an example of API implemented by
dpctl.tensor
, an array-API compatible array library, that also exposes theorder
parameter toasarray
.The text was updated successfully, but these errors were encountered: