diff --git a/.gitignore b/.gitignore index 68bc17f..dbce267 100644 --- a/.gitignore +++ b/.gitignore @@ -158,3 +158,5 @@ cython_debug/ # and can be added to the global gitignore or merged into this file. For a more nuclear # option (not recommended) you can uncomment the following to ignore the entire idea folder. #.idea/ + +.DS_Store diff --git a/CHANGELOG.md b/CHANGELOG.md new file mode 100644 index 0000000..be40bd6 --- /dev/null +++ b/CHANGELOG.md @@ -0,0 +1,30 @@ +# array-api-strict Changelog + +## 1.0 (????) + +This is the first release of `array_api_strict`. It is extracted from +`numpy.array_api`, which was included as an experimental submodule in NumPy +versions prior to 2.0. Note that the commit history in this repository is +extracted from the git history of numpy/array_api/ (see the [README](README.md)). + +Additionally, the following changes are new to `array_api_strict` from +`numpy.array_api` in NumPy 1.26 (the last NumPy feature release to include +`numpy.array_api`): + +- ``array_api_strict`` was made more portable. In particular: + + - ``array_api_strict`` no longer uses ``"cpu"`` as its "device", but rather a + separate ``CPU_DEVICE`` object (which is not accessible in the namespace). + This is because "cpu" is not part of the array API standard. + + - ``array_api_strict`` now uses separate wrapped objects for dtypes. + Previously it reused the ``numpy`` dtype objects. This makes it clear + which behaviors on dtypes are part of the array API standard (effectively, + the standard only requires ``==`` on dtype objects). + +- ``numpy.array_api.nonzero`` now errors on zero-dimensional arrays, as + required by the array API standard. + +- Support for the optional [fft + extension](https://data-apis.org/array-api/latest/extensions/fourier_transform_functions.html) + was added. 
diff --git a/README.md b/README.md index 2a15cac..dcf6b13 100644 --- a/README.md +++ b/README.md @@ -1,13 +1,193 @@ -array-api-strict -================ +# array-api-strict -A strict, minimal implementation of the [Python array +`array_api_strict` is a strict, minimal implementation of the [Python array API](https://data-apis.org/array-api/latest/) +The purpose of array-api-strict is to provide an implementation of the array +API for consuming libraries to test against so they can be completely sure +their usage of the array API is portable. + +It is *not* intended to be used by end-users. End-users of the array API +should just use their favorite array library (NumPy, CuPy, PyTorch, etc.) as +usual. It is also not intended to be used as a dependency by consuming +libraries. Consuming library code should use the +[array-api-compat](https://github.com/data-apis/array-api-compat) package to +support the array API. Rather, it is intended to be used in the test suites of +consuming libraries to test their array API usage. + +## Install + +`array-api-strict` is available on both +[PyPI](https://pypi.org/project/array-api-strict/) + +``` +python -m pip install array-api-strict +``` + +and [Conda-forge](https://anaconda.org/conda-forge/array-api-strict) + +``` +conda install --channel conda-forge array-api-strict +``` + +array-api-strict supports NumPy 1.26 and (the upcoming) NumPy 2.0. + +## Rationale + +The array API has many functions and behaviors that are required to be +implemented by conforming libraries, but it does not, in most cases, disallow +implementing additional functions, keyword arguments, and behaviors that +aren't explicitly required by the standard. 
+ +However, this poses a problem for consumers of the array API, as they may +accidentally use a function or rely on a behavior which just happens to be +implemented in every array library they test against (e.g., NumPy and +PyTorch), but isn't required by the standard and may not be included in other +libraries. + +array-api-strict solves this problem by providing a strict, minimal +implementation of the array API standard. Only those functions and behaviors +that are explicitly *required* by the standard are implemented. For example, +most NumPy functions accept Python scalars as inputs: + +```py +>>> import numpy as np +>>> np.sin(0.0) +0.0 +``` + +However, the standard only specifies function inputs on `Array` objects. And +indeed, some libraries, such as PyTorch, do not allow this: + +```py +>>> import torch +>>> torch.sin(0.0) +Traceback (most recent call last): + File "<stdin>", line 1, in <module> +TypeError: sin(): argument 'input' (position 1) must be Tensor, not float +``` + +In array-api-strict, this is also an error: + +```py +>>> import array_api_strict as xp +>>> xp.sin(0.0) +Traceback (most recent call last): +... +AttributeError: 'float' object has no attribute 'dtype' +``` + +Here is an (incomplete) list of the sorts of ways that array-api-strict is +strict/minimal: + +- Only those functions and methods that are [defined in the + standard](https://data-apis.org/array-api/latest/API_specification/index.html) + are included. + +- In those functions, only the keyword arguments that are defined by the + standard are included. All signatures in array-api-strict use + [positional-only + arguments](https://data-apis.org/array-api/latest/API_specification/function_and_method_signatures.html#function-and-method-signatures). + As noted above, only `array_api_strict` array objects are accepted by + functions, except in the places where the standard allows Python scalars + (i.e., functions do not automatically call `asarray` on their inputs). 
+ +- Only those [dtypes that are defined in the + standard](https://data-apis.org/array-api/latest/API_specification/data_types.html) + are included. + +- All functions and methods reject inputs if the standard does not *require* + the input dtype(s) to be supported. This is one of the most restrictive + aspects of the library. For example, in NumPy, most transcendental functions + like `sin` will accept integer array inputs, but the [standard only requires + them to accept floating-point + inputs](https://data-apis.org/array-api/latest/API_specification/generated/array_api.sin.html#array_api.sin), + so in array-api-strict, `sin(integer_array)` will raise an exception. + +- The + [indexing](https://data-apis.org/array-api/latest/API_specification/indexing.html) + semantics required by the standard are limited compared to those implemented + by NumPy (e.g., out-of-bounds slices are not supported, integer array + indexing is not supported, only a single boolean array index is supported). + +- There are no distinct "scalar" objects as in NumPy. There are only 0-D + arrays. + +- Dtype objects are just empty objects that only implement [equality + comparison](https://data-apis.org/array-api/latest/API_specification/generated/array_api.data_types.__eq__.html). + The way to access dtype objects in the standard is by name, like + `xp.float32`. + +- The array object type itself is private and should not be accessed. + Subclassing or otherwise trying to directly initialize this object is not + supported. Arrays should be created with one of the [array creation + functions](https://data-apis.org/array-api/latest/API_specification/creation_functions.html) + such as `asarray`. + +## Caveats + +array-api-strict is a thin pure Python wrapper around NumPy. NumPy 2.0 fully +supports the array API but NumPy 1.26 does not, so many behaviors are wrapped +in NumPy 1.26 to provide array API compatible behavior. 
Although it is based +on NumPy, mixing NumPy arrays with array-api-strict arrays is not supported. +This should generally raise an error, as it indicates a potential portability +issue, but this hasn't necessarily been tested thoroughly. + +1. array-api-strict is validated against the [array API test + suite](https://github.com/data-apis/array-api-tests). However, there may be + a few minor instances where NumPy deviates from the standard in a way that + is inconvenient to work around in array-api-strict, since it aims to remain + pure Python. You can see the full list of tests that are known to fail in + the [xfails + file](https://github.com/data-apis/array-api-strict/blob/main/array-api-tests-xfails.txt). + + The most notable of these is that in NumPy 1.26, the `copy=False` flag is + not implemented for `asarray` and therefore `array_api_strict` raises + `NotImplementedError` in that case. + +2. Since NumPy is a CPU-only library, the [device + support](https://data-apis.org/array-api/latest/design_topics/device_support.html) + in array-api-strict is superficial only. `x.device` is always a (private) + `CPU_DEVICE` object, and `device` keywords to creation functions only + accept either this object or `None`. A future version of array-api-strict + [may add support for a CuPy + backend](https://github.com/data-apis/array-api-strict/issues/5) so that + more significant device support can be tested. + +3. Although only array types are expected in array-api-strict functions, + currently most functions do not do extensive type checking on their inputs, + so a sufficiently duck-typed object may pass through silently (or at best, + you may get `AttributeError` instead of `TypeError`). However, all type + signatures have type annotations (based on those from the standard), so + this deviation may be tested with type checking. This [behavior may improve + in the future](https://github.com/data-apis/array-api-strict/issues/6). + +4. 
There are some behaviors in the standard that are not required to be + implemented by libraries that cannot support [data dependent + shapes](https://data-apis.org/array-api/latest/design_topics/data_dependent_output_shapes.html). + This includes [the `unique_*` + functions](https://data-apis.org/array-api/latest/API_specification/set_functions.html), + [boolean array + indexing](https://data-apis.org/array-api/latest/API_specification/indexing.html#boolean-array-indexing), + and the + [`nonzero`](https://data-apis.org/array-api/latest/API_specification/generated/array_api.nonzero.html) + function. array-api-strict currently implements all of these. In the + future, [there may be a way to disable them](https://github.com/data-apis/array-api-strict/issues/7). + +5. array-api-strict currently only supports the latest version of the array + API standard. [This may change in the future depending on + need](https://github.com/data-apis/array-api-strict/issues/8). + +## Usage + +TODO: Add a sample CI script here. + +## Relationship to `numpy.array_api` + Previously this implementation was available as `numpy.array_api`, but it was moved to a separate package for NumPy 2.0. -Note: the history of this repo prior to commit +Note that the history of this repo prior to commit fbefd42e4d11e9be20e0a4785f2619fc1aef1e7c was generated automatically from the numpy git history, using the following [git-filter-repo](https://github.com/newren/git-filter-repo) command: diff --git a/array_api_strict/__init__.py b/array_api_strict/__init__.py index b3c22c8..326f55d 100644 --- a/array_api_strict/__init__.py +++ b/array_api_strict/__init__.py @@ -1,117 +1,18 @@ """ -A NumPy sub-namespace that conforms to the Python array API standard. - -This submodule accompanies NEP 47, which proposes its inclusion in NumPy. It -is still considered experimental, and will issue a warning when imported. 
- -This is a proof-of-concept namespace that wraps the corresponding NumPy -functions to give a conforming implementation of the Python array API standard -(https://data-apis.github.io/array-api/latest/). The standard is currently in -an RFC phase and comments on it are both welcome and encouraged. Comments -should be made either at https://github.com/data-apis/array-api or at -https://github.com/data-apis/consortium-feedback/discussions. - -NumPy already follows the proposed spec for the most part, so this module -serves mostly as a thin wrapper around it. However, NumPy also implements a -lot of behavior that is not included in the spec, so this serves as a -restricted subset of the API. Only those functions that are part of the spec -are included in this namespace, and all functions are given with the exact -signature given in the spec, including the use of position-only arguments, and -omitting any extra keyword arguments implemented by NumPy but not part of the -spec. The behavior of some functions is also modified from the NumPy behavior -to conform to the standard. Note that the underlying array object itself is -wrapped in a wrapper Array() class, but is otherwise unchanged. This submodule -is implemented in pure Python with no C extensions. - -The array API spec is designed as a "minimal API subset" and explicitly allows -libraries to include behaviors not specified by it. But users of this module -that intend to write portable code should be aware that only those behaviors -that are listed in the spec are guaranteed to be implemented across libraries. -Consequently, the NumPy implementation was chosen to be both conforming and -minimal, so that users can use this implementation of the array API namespace -and be sure that behaviors that it defines will be available in conforming -namespaces from other libraries. 
- -A few notes about the current state of this submodule: - -- There is a test suite that tests modules against the array API standard at - https://github.com/data-apis/array-api-tests. The test suite is still a work - in progress, but the existing tests pass on this module, with a few - exceptions: - - - DLPack support (see https://github.com/data-apis/array-api/pull/106) is - not included here, as it requires a full implementation in NumPy proper - first. - - The test suite is not yet complete, and even the tests that exist are not - guaranteed to give a comprehensive coverage of the spec. Therefore, when - reviewing and using this submodule, you should refer to the standard - documents themselves. There are some tests in array_api_strict.tests, but - they primarily focus on things that are not tested by the official array API - test suite. - -- There is a custom array object, array_api_strict.Array, which is returned by - all functions in this module. All functions in the array API namespace - implicitly assume that they will only receive this object as input. The only - way to create instances of this object is to use one of the array creation - functions. It does not have a public constructor on the object itself. The - object is a small wrapper class around numpy.ndarray. The main purpose of it - is to restrict the namespace of the array object to only those dtypes and - only those methods that are required by the spec, as well as to limit/change - certain behavior that differs in the spec. In particular: - - - The array API namespace does not have scalar objects, only 0-D arrays. - Operations on Array that would create a scalar in NumPy create a 0-D - array. - - - Indexing: Only a subset of indices supported by NumPy are required by the - spec. The Array object restricts indexing to only allow those types of - indices that are required by the spec. See the docstring of the - array_api_strict.Array._validate_indices helper function for more - information. 
- - - Type promotion: Some type promotion rules are different in the spec. In - particular, the spec does not have any value-based casting. The spec also - does not require cross-kind casting, like integer -> floating-point. Only - those promotions that are explicitly required by the array API - specification are allowed in this module. See NEP 47 for more info. - - - Functions do not automatically call asarray() on their input, and will not - work if the input type is not Array. The exception is array creation - functions, and Python operators on the Array object, which accept Python - scalars of the same type as the array dtype. - -- All functions include type annotations, corresponding to those given in the - spec (see _typing.py for definitions of some custom types). These do not - currently fully pass mypy due to some limitations in mypy. - -- Dtype objects are just the NumPy dtype objects, e.g., float64 = - np.dtype('float64'). The spec does not require any behavior on these dtype - objects other than that they be accessible by name and be comparable by - equality, but it was considered too much extra complexity to create custom - objects to represent dtypes. - -- All places where the implementations in this submodule are known to deviate - from their corresponding functions in NumPy are marked with "# Note:" - comments. - -Still TODO in this module are: - -- DLPack support for numpy.ndarray is still in progress. See - https://github.com/numpy/numpy/pull/19083. - -- The copy=False keyword argument to asarray() is not yet implemented. This - requires support in numpy.asarray() first. - -- Some functions are not yet fully tested in the array API test suite, and may - require updates that are not yet known until the tests are written. - -- The spec is still in an RFC phase and may still have minor updates, which - will need to be reflected here. 
- -- Complex number support in array API spec is planned but not yet finalized, - as are the fft extension and certain linear algebra functions such as eig - that require complex dtypes. +array_api_strict is a strict, minimal implementation of the Python array +API (https://data-apis.org/array-api/latest/). + +The purpose of array-api-strict is to provide an implementation of the array +API for consuming libraries to test against so they can be completely sure +their usage of the array API is portable. + +It is *not* intended to be used by end-users. End-users of the array API +should just use their favorite array library (NumPy, CuPy, PyTorch, etc.) as +usual. It is also not intended to be used as a dependency by consuming +libraries. Consuming library code should use the +array-api-compat (https://github.com/data-apis/array-api-compat) package to +support the array API. Rather, it is intended to be used in the test suites of +consuming libraries to test their array API usage. """