Merge pull request #9 from asmeurer/docs-update

asmeurer · web-flow · commit 397713fda787 · 2024-01-24T13:45:42.000-07:00
Update documentation
diff --git a/.gitignore b/.gitignore
@@ -158,3 +158,5 @@ cython_debug/
 #  and can be added to the global gitignore or merged into this file.  For a more nuclear
 #  option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
+
+.DS_Store
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,30 @@
+# array-api-strict Changelog
+
+## 1.0 (????)
+
+This is the first release of `array_api_strict`. It is extracted from
+`numpy.array_api`, which was included as an experimental submodule in NumPy
+versions prior to 2.0. Note that the commit history in this repository is
+extracted from the git history of numpy/array_api/ (see the [README](README.md)).
+
+Additionally, the following changes are new to `array_api_strict` from
+`numpy.array_api` in NumPy 1.26 (the last NumPy feature release to include
+`numpy.array_api`):
+
+- ``array_api_strict`` was made more portable. In particular:
+
+  - ``array_api_strict`` no longer uses ``"cpu"`` as its "device", but rather a
+    separate ``CPU_DEVICE`` object (which is not accessible in the namespace).
+    This is because "cpu" is not part of the array API standard.
+
+  - ``array_api_strict`` now uses separate wrapped objects for dtypes.
+    Previously it reused the ``numpy`` dtype objects. This makes it clear
+    which behaviors on dtypes are part of the array API standard (effectively,
+    the standard only requires ``==`` on dtype objects).
+
+- ``array_api_strict.nonzero`` now errors on zero-dimensional arrays, as
+  required by the array API standard.
+
+- Support for the optional [fft
+  extension](https://data-apis.org/array-api/latest/extensions/fourier_transform_functions.html)
+  was added.
diff --git a/README.md b/README.md
@@ -1,13 +1,193 @@
-array-api-strict
-================
+# array-api-strict
 
-A strict, minimal implementation of the [Python array
+`array_api_strict` is a strict, minimal implementation of the [Python array
 API](https://data-apis.org/array-api/latest/)
 
+The purpose of array-api-strict is to provide an implementation of the array
+API for consuming libraries to test against so they can be completely sure
+their usage of the array API is portable.
+
+It is *not* intended to be used by end-users. End-users of the array API
+should just use their favorite array library (NumPy, CuPy, PyTorch, etc.) as
+usual. It is also not intended to be used as a dependency by consuming
+libraries. Consuming library code should use the
+[array-api-compat](https://github.com/data-apis/array-api-compat) package to
+support the array API. Rather, it is intended to be used in the test suites of
+consuming libraries to test their array API usage.
+
+## Install
+
+`array-api-strict` is available on both
+[PyPI](https://pypi.org/project/array-api-strict/)
+
+```
+python -m pip install array-api-strict
+```
+
+and [Conda-forge](https://anaconda.org/conda-forge/array-api-strict)
+
+```
+conda install --channel conda-forge array-api-strict
+```
+
+array-api-strict supports NumPy 1.26 and (the upcoming) NumPy 2.0.
+
+## Rationale
+
+The array API has many functions and behaviors that are required to be
+implemented by conforming libraries, but it does not, in most cases, disallow
+implementing additional functions, keyword arguments, and behaviors that
+aren't explicitly required by the standard.
+
+However, this poses a problem for consumers of the array API, as they may
+accidentally use a function or rely on a behavior which just happens to be
+implemented in every array library they test against (e.g., NumPy and
+PyTorch), but isn't required by the standard and may not be included in other
+libraries.
+
+array-api-strict solves this problem by providing a strict, minimal
+implementation of the array API standard. Only those functions and behaviors
+that are explicitly *required* by the standard are implemented. For example,
+most NumPy functions accept Python scalars as inputs:
+
+```py
+>>> import numpy as np
+>>> np.sin(0.0)
+0.0
+```
+
+However, the standard only specifies the behavior of functions for `Array`
+inputs. Indeed, some libraries, such as PyTorch, do not allow this:
+
+```py
+>>> import torch
+>>> torch.sin(0.0)
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+TypeError: sin(): argument 'input' (position 1) must be Tensor, not float
+```
+
+In array-api-strict, this is also an error:
+
+```py
+>>> import array_api_strict as xp
+>>> xp.sin(0.0)
+Traceback (most recent call last):
+...
+AttributeError: 'float' object has no attribute 'dtype'
+```
+
+Here is an (incomplete) list of the ways in which array-api-strict is
+strict and minimal:
+
+- Only those functions and methods that are [defined in the
+  standard](https://data-apis.org/array-api/latest/API_specification/index.html)
+  are included.
+
+- In those functions, only the keyword arguments that are defined by the
+  standard are included. All signatures in array-api-strict use
+  [positional-only
+  arguments](https://data-apis.org/array-api/latest/API_specification/function_and_method_signatures.html#function-and-method-signatures).
+  As noted above, only `array_api_strict` array objects are accepted by
+  functions, except in the places where the standard allows Python scalars
+  (i.e., functions do not automatically call `asarray` on their inputs).
+
+- Only those [dtypes that are defined in the
+  standard](https://data-apis.org/array-api/latest/API_specification/data_types.html)
+  are included.
+
+- All functions and methods reject inputs if the standard does not *require*
+  the input dtype(s) to be supported. This is one of the most restrictive
+  aspects of the library. For example, in NumPy, most transcendental functions
+  like `sin` will accept integer array inputs, but the [standard only requires
+  them to accept floating-point
+  inputs](https://data-apis.org/array-api/latest/API_specification/generated/array_api.sin.html#array_api.sin),
+  so in array-api-strict, `sin(integer_array)` will raise an exception.
+
+- The
+  [indexing](https://data-apis.org/array-api/latest/API_specification/indexing.html)
+  semantics required by the standard are limited compared to those implemented
+  by NumPy (e.g., out-of-bounds slices are not supported, integer array
+  indexing is not supported, only a single boolean array index is supported).
+
+- There are no distinct "scalar" objects as in NumPy. There are only 0-D
+  arrays.
+
+- Dtype objects are just empty objects that only implement [equality
+  comparison](https://data-apis.org/array-api/latest/API_specification/generated/array_api.data_types.__eq__.html).
+  The way to access dtype objects in the standard is by name, like
+  `xp.float32`.
+
+- The array object type itself is private and should not be accessed.
+  Subclassing or otherwise trying to directly initialize this object is not
+  supported. Arrays should be created with one of the [array creation
+  functions](https://data-apis.org/array-api/latest/API_specification/creation_functions.html)
+  such as `asarray`.
+
+## Caveats
+
+array-api-strict is a thin pure Python wrapper around NumPy. NumPy 2.0 fully
+supports the array API but NumPy 1.26 does not, so many behaviors are wrapped
+in NumPy 1.26 to provide array API compatible behavior. Although it is based
+on NumPy, mixing NumPy arrays with array-api-strict arrays is not supported.
+This should generally raise an error, as it indicates a potential portability
+issue, but this hasn't necessarily been tested thoroughly.
+
+1. array-api-strict is validated against the [array API test
+   suite](https://github.com/data-apis/array-api-tests). However, there may be
+   a few minor instances where NumPy deviates from the standard in a way that
+   is inconvenient to work around in array-api-strict, since it aims to remain
+   pure Python. You can see the full list of tests that are known to fail in
+   the [xfails
+   file](https://github.com/data-apis/array-api-strict/blob/main/array-api-tests-xfails.txt).
+
+    The most notable of these is that in NumPy 1.26, the `copy=False` flag is
+    not implemented for `asarray` and therefore `array_api_strict` raises
+    `NotImplementedError` in that case.
+
+2. Since NumPy is a CPU-only library, the [device
+   support](https://data-apis.org/array-api/latest/design_topics/device_support.html)
+   in array-api-strict is superficial only. `x.device` is always a (private)
+   `CPU_DEVICE` object, and `device` keywords to creation functions only
+   accept either this object or `None`. A future version of array-api-strict
+   [may add support for a CuPy
+   backend](https://github.com/data-apis/array-api-strict/issues/5) so that
+   more significant device support can be tested.
+
+3. Although only array types are expected in array-api-strict functions,
+   currently most functions do not do extensive type checking on their inputs,
+   so a sufficiently duck-typed object may pass through silently (or at best,
+   you may get `AttributeError` instead of `TypeError`). However, all type
+   signatures have type annotations (based on those from the standard), so
+   this deviation may be tested with type checking. This [behavior may improve
+   in the future](https://github.com/data-apis/array-api-strict/issues/6).
+
+4. There are some behaviors in the standard that are not required to be
+   implemented by libraries that cannot support [data-dependent
+   shapes](https://data-apis.org/array-api/latest/design_topics/data_dependent_output_shapes.html).
+   This includes [the `unique_*`
+   functions](https://data-apis.org/array-api/latest/API_specification/set_functions.html),
+   [boolean array
+   indexing](https://data-apis.org/array-api/latest/API_specification/indexing.html#boolean-array-indexing),
+   and the
+   [`nonzero`](https://data-apis.org/array-api/latest/API_specification/generated/array_api.nonzero.html)
+   function. array-api-strict currently implements all of these. In the
+   future, [there may be a way to disable them](https://github.com/data-apis/array-api-strict/issues/7).
+
+5. array-api-strict currently only supports the latest version of the array
+   API standard. [This may change in the future depending on
+   need](https://github.com/data-apis/array-api-strict/issues/8).
+
+## Usage
+
+TODO: Add a sample CI script here.
+
+## Relationship to `numpy.array_api`
+
 Previously this implementation was available as `numpy.array_api`, but it was
 moved to a separate package for NumPy 2.0.
 
-Note: the history of this repo prior to commit
+Note that the history of this repo prior to commit
 fbefd42e4d11e9be20e0a4785f2619fc1aef1e7c was generated automatically
 from the numpy git history, using the following
 [git-filter-repo](https://github.com/newren/git-filter-repo) command:
diff --git a/array_api_strict/__init__.py b/array_api_strict/__init__.py
@@ -1,117 +1,18 @@
 """
-A NumPy sub-namespace that conforms to the Python array API standard.
-
-This submodule accompanies NEP 47, which proposes its inclusion in NumPy. It
-is still considered experimental, and will issue a warning when imported.
-
-This is a proof-of-concept namespace that wraps the corresponding NumPy
-functions to give a conforming implementation of the Python array API standard
-(https://data-apis.github.io/array-api/latest/). The standard is currently in
-an RFC phase and comments on it are both welcome and encouraged. Comments
-should be made either at https://github.com/data-apis/array-api or at
-https://github.com/data-apis/consortium-feedback/discussions.
-
-NumPy already follows the proposed spec for the most part, so this module
-serves mostly as a thin wrapper around it. However, NumPy also implements a
-lot of behavior that is not included in the spec, so this serves as a
-restricted subset of the API. Only those functions that are part of the spec
-are included in this namespace, and all functions are given with the exact
-signature given in the spec, including the use of position-only arguments, and
-omitting any extra keyword arguments implemented by NumPy but not part of the
-spec. The behavior of some functions is also modified from the NumPy behavior
-to conform to the standard. Note that the underlying array object itself is
-wrapped in a wrapper Array() class, but is otherwise unchanged. This submodule
-is implemented in pure Python with no C extensions.
-
-The array API spec is designed as a "minimal API subset" and explicitly allows
-libraries to include behaviors not specified by it. But users of this module
-that intend to write portable code should be aware that only those behaviors
-that are listed in the spec are guaranteed to be implemented across libraries.
-Consequently, the NumPy implementation was chosen to be both conforming and
-minimal, so that users can use this implementation of the array API namespace
-and be sure that behaviors that it defines will be available in conforming
-namespaces from other libraries.
-
-A few notes about the current state of this submodule:
-
-- There is a test suite that tests modules against the array API standard at
-  https://github.com/data-apis/array-api-tests. The test suite is still a work
-  in progress, but the existing tests pass on this module, with a few
-  exceptions:
-
-  - DLPack support (see https://github.com/data-apis/array-api/pull/106) is
-    not included here, as it requires a full implementation in NumPy proper
-    first.
-
-  The test suite is not yet complete, and even the tests that exist are not
-  guaranteed to give a comprehensive coverage of the spec. Therefore, when
-  reviewing and using this submodule, you should refer to the standard
-  documents themselves. There are some tests in array_api_strict.tests, but
-  they primarily focus on things that are not tested by the official array API
-  test suite.
-
-- There is a custom array object, array_api_strict.Array, which is returned by
-  all functions in this module. All functions in the array API namespace
-  implicitly assume that they will only receive this object as input. The only
-  way to create instances of this object is to use one of the array creation
-  functions. It does not have a public constructor on the object itself. The
-  object is a small wrapper class around numpy.ndarray. The main purpose of it
-  is to restrict the namespace of the array object to only those dtypes and
-  only those methods that are required by the spec, as well as to limit/change
-  certain behavior that differs in the spec. In particular:
-
-  - The array API namespace does not have scalar objects, only 0-D arrays.
-    Operations on Array that would create a scalar in NumPy create a 0-D
-    array.
-
-  - Indexing: Only a subset of indices supported by NumPy are required by the
-    spec. The Array object restricts indexing to only allow those types of
-    indices that are required by the spec. See the docstring of the
-    array_api_strict.Array._validate_indices helper function for more
-    information.
-
-  - Type promotion: Some type promotion rules are different in the spec. In
-    particular, the spec does not have any value-based casting. The spec also
-    does not require cross-kind casting, like integer -> floating-point. Only
-    those promotions that are explicitly required by the array API
-    specification are allowed in this module. See NEP 47 for more info.
-
-  - Functions do not automatically call asarray() on their input, and will not
-    work if the input type is not Array. The exception is array creation
-    functions, and Python operators on the Array object, which accept Python
-    scalars of the same type as the array dtype.
-
-- All functions include type annotations, corresponding to those given in the
-  spec (see _typing.py for definitions of some custom types). These do not
-  currently fully pass mypy due to some limitations in mypy.
-
-- Dtype objects are just the NumPy dtype objects, e.g., float64 =
-  np.dtype('float64'). The spec does not require any behavior on these dtype
-  objects other than that they be accessible by name and be comparable by
-  equality, but it was considered too much extra complexity to create custom
-  objects to represent dtypes.
-
-- All places where the implementations in this submodule are known to deviate
-  from their corresponding functions in NumPy are marked with "# Note:"
-  comments.
-
-Still TODO in this module are:
-
-- DLPack support for numpy.ndarray is still in progress. See
-  https://github.com/numpy/numpy/pull/19083.
-
-- The copy=False keyword argument to asarray() is not yet implemented. This
-  requires support in numpy.asarray() first.
-
-- Some functions are not yet fully tested in the array API test suite, and may
-  require updates that are not yet known until the tests are written.
-
-- The spec is still in an RFC phase and may still have minor updates, which
-  will need to be reflected here.
-
-- Complex number support in array API spec is planned but not yet finalized,
-  as are the fft extension and certain linear algebra functions such as eig
-  that require complex dtypes.
+array_api_strict is a strict, minimal implementation of the Python array
+API (https://data-apis.org/array-api/latest/)
+
+The purpose of array-api-strict is to provide an implementation of the array
+API for consuming libraries to test against so they can be completely sure
+their usage of the array API is portable.
+
+It is *not* intended to be used by end-users. End-users of the array API
+should just use their favorite array library (NumPy, CuPy, PyTorch, etc.) as
+usual. It is also not intended to be used as a dependency by consuming
+libraries. Consuming library code should use the
+array-api-compat (https://github.com/data-apis/array-api-compat) package to
+support the array API. Rather, it is intended to be used in the test suites of
+consuming libraries to test their array API usage.
 
 """
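
The README added in this diff says array-api-strict is meant for the test suites of consuming libraries, and its Usage section is still a TODO. The following is a minimal sketch of that pattern, not the project's recommended setup. `softplus` and `run_under_strict` are hypothetical names invented for illustration, and the check is skipped when `array_api_strict` is not installed.

```python
import importlib.util


def softplus(x, xp):
    """Hypothetical consuming-library function written against the array API.

    `xp` is whichever array namespace the caller passes in.
    """
    # Only functions required by the standard are used. The Python scalar
    # appears only as an operator operand, which the standard permits.
    return xp.log(xp.exp(x) + 1.0)


def run_under_strict(func, data):
    """Call `func` with array_api_strict arrays; return None if not installed."""
    if importlib.util.find_spec("array_api_strict") is None:
        return None  # test-only dependency is absent, so skip the check
    import array_api_strict as xp

    # Arrays are created through a creation function, never the array class.
    x = xp.asarray(data, dtype=xp.float64)
    return func(x, xp)


result = run_under_strict(softplus, [0.0, 1.0])
```

If `softplus` relied on behavior outside the standard, for example calling `xp.exp` on a Python float, the same call would be expected to raise under array-api-strict, which is the portability signal a test suite is looking for.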