
CI: Upgrade array-api-tests #27081

Merged: 3 commits into numpy:main from the array-api-ci-update branch on Aug 1, 2024

Conversation

mtsokol
Member

@mtsokol mtsokol commented Jul 30, 2024

Fixes #27072 and increases array-api-tests suite --max-examples to 50.

@mtsokol mtsokol added the 40 - array API standard PRs and issues related to support for the array API standard label Jul 30, 2024
@mtsokol mtsokol self-assigned this Jul 30, 2024
@mtsokol mtsokol force-pushed the array-api-ci-update branch from 9e6541c to 785e4d7 Compare July 30, 2024 13:08
@mtsokol mtsokol force-pushed the array-api-ci-update branch from 785e4d7 to 84ca719 Compare July 30, 2024 14:31
@mtsokol mtsokol linked an issue Jul 30, 2024 that may be closed by this pull request: BUG: astype() function does not work on scalars

array_api_tests/test_signatures.py::test_func_signature[argsort]
array_api_tests/test_signatures.py::test_func_signature[sort]

# missing 'descending' keyword argument
array_api_tests/test_sorting_functions.py::test_argsort
array_api_tests/test_sorting_functions.py::test_sort
Member

The rest of these are bugs in the test suite but this looks like one thing that is actually missing from NumPy for array API compatibility.

Member Author

I agree, but we decided that it would be too much churn to include it now.
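
For context, the missing piece is the standard's 'descending' keyword on sort/argsort. A rough sketch of how that behavior can be approximated with today's NumPy API, using hypothetical helper names (flipping a stable ascending sort reverses the relative order of ties, so this is not a drop-in stable descending sort):

import numpy as np

def sort_descending(x, axis=-1):
    # Hypothetical helper: descending sort as an ascending sort plus a flip.
    return np.flip(np.sort(x, axis=axis), axis=axis)

def argsort_descending(x, axis=-1):
    # Hypothetical helper: indices of a descending order. Flipping a stable
    # ascending argsort reverses the order of equal elements, so ties are not
    # handled the way a stable descending sort would handle them.
    return np.flip(np.argsort(x, axis=axis, kind="stable"), axis=axis)

x = np.array([3, 1, 2])
print(sort_descending(x))     # [3 2 1]
print(argsort_descending(x))  # [0 2 1]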

@ngoldbaum
Member

The tests only take 26 seconds to run - maybe a value higher than 50? I don't have a good intuition about what a reasonable number to use here is, or whether it should be time-bound instead.

@asmeurer
Member

The default value, which comes from hypothesis, is 100.
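
For context, a minimal sketch of the Hypothesis setting that --max-examples maps onto (a toy test for illustration, not part of the suite):

from hypothesis import given, settings, strategies as st

# Hypothesis generates up to max_examples inputs per test; the library default is 100.
@settings(max_examples=100)
@given(st.integers())
def test_toy(x):
    assert isinstance(x, int)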

@charris charris changed the title CI: Upgrade array-api-tests Jul 31, 2024
@mtsokol
Member Author

mtsokol commented Jul 31, 2024

I increased max-examples to 100 and there's one more test failure:

FAILED array_api_tests/test_operators_and_elementwise_functions.py::test_clip - AssertionError: out.dtype=uint32, but should be uint16 [clip(uint16)]
Falsifying example: test_clip(
    x=array(0, dtype=uint16),
    data=data(...),
)
Draw 1 (min.shape, max.shape): ((), ())
Draw 2 (min): 0
Draw 3 (max): array(0, dtype=uint32)
Draw 4 (kwargs): {'min': 0, 'max': array(0, dtype=uint32)}

And I wonder if it's more an issue on the array-api-tests side. The standard says:

If x and either min or max have different data type kinds (e.g., integer versus floating-point), behavior is unspecified and thus implementation-dependent.

And the failing example is:

In [30]: x = np.array(0, dtype=np.uint16)

In [31]: np.clip(x, min=0, max=np.uint32(0))
Out[31]: np.uint32(0)

I think promotion here is intentional, and if handling mixed data types of x and min/max is implementation-dependent, then array-api-tests shouldn't require x.dtype == np.clip(x, ...).dtype here. WDYT?

@ngoldbaum
Member

On a hunch I just tried max_examples=1000, which also returned:

FAILED array_api_tests/test_statistical_functions.py::test_sum - AssertionError: out=2199023255552.0, but should be roughly 3298534883328.0 [sum()]
    @pytest.mark.unvectorized
>   @given(
        x=hh.arrays(
            dtype=hh.numeric_dtypes,
            shape=hh.shapes(min_side=1),
            elements={"allow_nan": False},
        ),
        data=st.data(),
    )

array_api_tests/test_statistical_functions.py:269:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

x = array([ 1.8446743e+19,  0.0000000e+00,  1.8446742e+19,  0.0000000e+00,
       -1.8446741e+19, -1.8446741e+19], dtype=float32), data = data(...)

    @pytest.mark.unvectorized
    @given(
        x=hh.arrays(
            dtype=hh.numeric_dtypes,
            shape=hh.shapes(min_side=1),
            elements={"allow_nan": False},
        ),
        data=st.data(),
    )
    def test_sum(x, data):
        kw = data.draw(
            hh.kwargs(
                axis=hh.axes(x.ndim),
                dtype=kwarg_dtypes(x.dtype),
                keepdims=st.booleans(),
            ),
            label="kw",
        )
        keepdims = kw.get("keepdims", False)

        with hh.reject_overflow():
            out = xp.sum(x, **kw)

        dtype = kw.get("dtype", None)
        expected_dtype = dh.accumulation_result_dtype(x.dtype, dtype)
        if expected_dtype is None:
            # If a default uint cannot exist (i.e. in PyTorch which doesn't support
            # uint32 or uint64), we skip testing the output dtype.
            # See https://github.com/data-apis/array-api-tests/issues/160
            if x.dtype in dh.uint_dtypes:
                assert dh.is_int_dtype(out.dtype)  # sanity check
        else:
            ph.assert_dtype("sum", in_dtype=x.dtype, out_dtype=out.dtype, expected=expected_dtype)
        _axes = sh.normalize_axis(kw.get("axis", None), x.ndim)
        ph.assert_keepdimable_shape(
            "sum", in_shape=x.shape, out_shape=out.shape, axes=_axes, keepdims=keepdims, kw=kw
        )
        scalar_type = dh.get_scalar_type(out.dtype)
        for indices, out_idx in zip(sh.axes_ndindex(x.shape, _axes), sh.ndindex(out.shape)):
            sum_ = scalar_type(out[out_idx])
            assume(cmath.isfinite(sum_))
            elements = []
            for idx in indices:
                s = scalar_type(x[idx])
                elements.append(s)
            expected = sum(elements)
            if dh.is_int_dtype(out.dtype):
                m, M = dh.dtype_ranges[out.dtype]
                assume(m <= expected <= M)
>           ph.assert_scalar_equals("sum", type_=scalar_type, idx=out_idx, out=sum_, expected=expected)
E           AssertionError: out=2199023255552.0, but should be roughly 3298534883328.0 [sum()]
E           Falsifying example: test_sum(
E               x=array([ 1.8446743e+19,  0.0000000e+00,  1.8446742e+19,  0.0000000e+00,
E                      -1.8446741e+19, -1.8446741e+19], dtype=float32),
E               data=data(...),
E           )
E           Draw 1 (kw): {'axis': None}

array_api_tests/test_statistical_functions.py:317: AssertionError
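
For reference, a standalone sketch of the float32 accumulation effect behind this failure, using the values from the falsifying example above (printed values are approximate):

import numpy as np

# The test suite compares np.sum's result against a Python-float sum of the same
# elements; at float32 precision the large terms cancel with extra rounding error,
# so the two totals disagree.
x = np.array([ 1.8446743e+19,  0.0000000e+00,  1.8446742e+19,  0.0000000e+00,
              -1.8446741e+19, -1.8446741e+19], dtype=np.float32)
print(np.sum(x))                 # float32 accumulation, roughly 2.2e+12
print(sum(float(v) for v in x))  # float64 sum of the same values, roughly 3.3e+12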

@asmeurer
Member

asmeurer commented Jul 31, 2024

The sum thing is indeed a known issue with the test suite. Here's a more accurate set of xfails for numpy: https://github.com/data-apis/array-api-compat/blob/main/numpy-dev-xfails.txt (that should also include cumulative_sum alongside sum and prod if you are enabling the 2023 tests).

The clip thing is correct. The standard says that clip should return the same dtype as the first argument. The text you quoted is irrelevant because there is no mixing of data type kinds in that example (everything is integer). See #24976 for more details.
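
To illustrate the dtype rule the standard asks for, here is a rough sketch of a wrapper that keeps the result dtype equal to x.dtype by casting the bounds first (a hypothetical helper, not NumPy's implementation and not necessarily how the follow-up PR will handle it):

import numpy as np

def clip_keep_dtype(x, min=None, max=None):
    # Cast the bounds to x.dtype up front so NumPy's promotion rules cannot widen
    # the result; the standard wants out.dtype == x.dtype. Note that this cast can
    # wrap bound values that do not fit in x.dtype, which the sketch ignores.
    if min is not None:
        min = np.asarray(min, dtype=x.dtype)
    if max is not None:
        max = np.asarray(max, dtype=x.dtype)
    return np.clip(x, min, max)

x = np.array(0, dtype=np.uint16)
print(clip_keep_dtype(x, min=0, max=np.uint32(0)).dtype)  # uint16, unlike the example above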

@mtsokol mtsokol force-pushed the array-api-ci-update branch from 9a3b3a0 to bdd3e92 Compare August 1, 2024 08:34
@mtsokol
Member Author

mtsokol commented Aug 1, 2024

Ok, then let's keep 100 for now as it's the default.

I added another xfail for the clip error, so the job passes now. I will address this undesired clip casting in a separate PR. The PR is ready from my side.

@ngoldbaum
Member

Maybe you should be using the skips file maintained by the array-api-tests repo like Aaron suggested? Since it captures issues discovered upstream.

@mtsokol
Member Author

mtsokol commented Aug 1, 2024

Maybe you should be using the skips file maintained by the array-api-tests repo like Aaron suggested? Since it captures issues discovered upstream.

In this PR I'm using the --xfails-file feature instead of --skips-file. This means the tests listed in tools/ci/array-api-xfails.txt are run and verified to fail. If a particular test passes, the test suite reports an XPASS error. Therefore I can't extend the xfails file with the array-api-tests counterpart, as we would xfail passing tests.

I think it's more accurate to use --xfails-file because we make sure that these tests do indeed fail, instead of just skipping them.
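
As an illustration of the distinction (a toy pytest example, not the actual array-api-tests machinery): a strict xfail fails the run when a test unexpectedly passes, whereas a skip never runs the test at all.

import pytest

@pytest.mark.xfail(strict=True, reason="known failure tracked in the xfails file")
def test_known_failure():
    # With strict=True, pytest turns an unexpected pass (XPASS) into a failure,
    # which is the signal that an entry can be dropped from the xfails list.
    assert 1 + 1 == 3

@pytest.mark.skip(reason="skipped tests are never exercised, so a fix would go unnoticed")
def test_skipped():
    assert 1 + 1 == 3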

@ngoldbaum
Member

Sorry, I mean the xfails file maintained upstream: https://github.com/data-apis/array-api-compat/blob/main/numpy-dev-xfails.txt

@mtsokol
Member Author

mtsokol commented Aug 1, 2024

I think this file is outdated; it contains tests that no longer exist. I've updated it here: data-apis/array-api-compat#171

@ngoldbaum
Member

If you can't just use that immediately, let's move to using the upstream file in a follow-up. Thanks @mtsokol!

@ngoldbaum ngoldbaum merged commit 8351d3f into numpy:main Aug 1, 2024
66 checks passed
@mtsokol mtsokol deleted the array-api-ci-update branch August 1, 2024 17:58
Labels
40 - array API standard PRs and issues related to support for the array API standard
Development

Successfully merging this pull request may close these issues.

BUG: astype() function does not work on scalars
3 participants