
CI: Upgrade array-api-tests #27081

Merged: 3 commits into numpy:main from the array-api-ci-update branch on Aug 1, 2024

Conversation

mtsokol
Member

@mtsokol mtsokol commented Jul 30, 2024

Fixes #27072 and increases array-api-tests suite --max-examples to 50.

@mtsokol mtsokol added the 40 - array API standard PRs and issues related to support for the array API standard label Jul 30, 2024
@mtsokol mtsokol self-assigned this Jul 30, 2024
@mtsokol mtsokol force-pushed the array-api-ci-update branch from 9e6541c to 785e4d7 Compare July 30, 2024 13:08
@mtsokol mtsokol force-pushed the array-api-ci-update branch from 785e4d7 to 84ca719 Compare July 30, 2024 14:31
@mtsokol mtsokol linked an issue Jul 30, 2024 that may be closed by this pull request: BUG: astype() function does not work on scalars

array_api_tests/test_signatures.py::test_func_signature[argsort]
array_api_tests/test_signatures.py::test_func_signature[sort]

# missing 'descending' keyword argument
array_api_tests/test_sorting_functions.py::test_argsort
array_api_tests/test_sorting_functions.py::test_sort
Member

The rest of these are bugs in the test suite but this looks like one thing that is actually missing from NumPy for array API compatibility.

Member Author

I agree, but we decided that it would be too much churn to include it now.
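
For context, the missing piece is the standard's 'descending' keyword on sort/argsort. A rough sketch of how that behavior can be approximated with today's NumPy API, using hypothetical helper names (flipping a stable ascending sort reverses the relative order of ties, so this is not a drop-in stable descending sort):

import numpy as np

def sort_descending(x, axis=-1):
    # Hypothetical helper: descending sort as an ascending sort plus a flip.
    return np.flip(np.sort(x, axis=axis), axis=axis)

def argsort_descending(x, axis=-1):
    # Hypothetical helper: indices of a descending order. Flipping a stable
    # ascending argsort reverses the order of equal elements, so ties are not
    # handled the way a stable descending sort would handle them.
    return np.flip(np.argsort(x, axis=axis, kind="stable"), axis=axis)

x = np.array([3, 1, 2])
print(sort_descending(x))     # [3 2 1]
print(argsort_descending(x))  # [0 2 1]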

@ngoldbaum
Member

The tests only take 26 seconds to run - maybe a value higher than 50? I don't have a good intuition about what a reasonable number to use here is, or whether it should be time-bound instead.

@asmeurer
Member

The default value, which comes from hypothesis, is 100.
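
For context, a minimal sketch of the Hypothesis setting that --max-examples maps onto (a toy test for illustration, not part of the suite):

from hypothesis import given, settings, strategies as st

# Hypothesis generates up to max_examples inputs per test; the library default is 100.
@settings(max_examples=100)
@given(st.integers())
def test_toy(x):
    assert isinstance(x, int)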

@charris charris changed the title CI: Upgrade array-api-tests Jul 31, 2024
@mtsokol
Member Author

mtsokol commented Jul 31, 2024

I increased max-examples to 100 and there's one more test failure:

FAILED array_api_tests/test_operators_and_elementwise_functions.py::test_clip - AssertionError: out.dtype=uint32, but should be uint16 [clip(uint16)]
Falsifying example: test_clip(
    x=array(0, dtype=uint16),
    data=data(...),
)
Draw 1 (min.shape, max.shape): ((), ())
Draw 2 (min): 0
Draw 3 (max): array(0, dtype=uint32)
Draw 4 (kwargs): {'min': 0, 'max': array(0, dtype=uint32)}

And I wonder if it's more an issue on the array-api-tests side. The standard says:

If x and either min or max have different data type kinds (e.g., integer versus floating-point), behavior is unspecified and thus implementation-dependent.

And the failing example is:

In [30]: x = np.array(0, dtype=np.uint16)

In [31]: np.clip(x, min=0, max=np.uint32(0))
Out[31]: np.uint32(0)

I think promotion here is intentional, and if handling mixed data types of x and min/max is implementation-dependent, then array-api-tests shouldn't require x.dtype == np.clip(x, ...).dtype here. WDYT?

@ngoldbaum
Member

On a hunch I just tried max_examples=1000, which also returned:

FAILED array_api_tests/test_statistical_functions.py::test_sum - AssertionError: out=2199023255552.0, but should be roughly 3298534883328.0 [sum()]
    @pytest.mark.unvectorized
>   @given(
        x=hh.arrays(
            dtype=hh.numeric_dtypes,
            shape=hh.shapes(min_side=1),
            elements={"allow_nan": False},
        ),
        data=st.data(),
    )

array_api_tests/test_statistical_functions.py:269:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

x = array([ 1.8446743e+19,  0.0000000e+00,  1.8446742e+19,  0.0000000e+00,
       -1.8446741e+19, -1.8446741e+19], dtype=float32), data = data(...)

    @pytest.mark.unvectorized
    @given(
        x=hh.arrays(
            dtype=hh.numeric_dtypes,
            shape=hh.shapes(min_side=1),
            elements={"allow_nan": False},
        ),
        data=st.data(),
    )
    def test_sum(x, data):
        kw = data.draw(
            hh.kwargs(
                axis=hh.axes(x.ndim),
                dtype=kwarg_dtypes(x.dtype),
                keepdims=st.booleans(),
            ),
            label="kw",
        )
        keepdims = kw.get("keepdims", False)

        with hh.reject_overflow():
            out = xp.sum(x, **kw)

        dtype = kw.get("dtype", None)
        expected_dtype = dh.accumulation_result_dtype(x.dtype, dtype)
        if expected_dtype is None:
            # If a default uint cannot exist (i.e. in PyTorch which doesn't support
            # uint32 or uint64), we skip testing the output dtype.
            # See https://github.com/data-apis/array-api-tests/issues/160
            if x.dtype in dh.uint_dtypes:
                assert dh.is_int_dtype(out.dtype)  # sanity check
        else:
            ph.assert_dtype("sum", in_dtype=x.dtype, out_dtype=out.dtype, expected=expected_dtype)
        _axes = sh.normalize_axis(kw.get("axis", None), x.ndim)
        ph.assert_keepdimable_shape(
            "sum", in_shape=x.shape, out_shape=out.shape, axes=_axes, keepdims=keepdims, kw=kw
        )
        scalar_type = dh.get_scalar_type(out.dtype)
        for indices, out_idx in zip(sh.axes_ndindex(x.shape, _axes), sh.ndindex(out.shape)):
            sum_ = scalar_type(out[out_idx])
            assume(cmath.isfinite(sum_))
            elements = []
            for idx in indices:
                s = scalar_type(x[idx])
                elements.append(s)
            expected = sum(elements)
            if dh.is_int_dtype(out.dtype):
                m, M = dh.dtype_ranges[out.dtype]
                assume(m <= expected <= M)
>           ph.assert_scalar_equals("sum", type_=scalar_type, idx=out_idx, out=sum_, expected=expected)
E           AssertionError: out=2199023255552.0, but should be roughly 3298534883328.0 [sum()]
E           Falsifying example: test_sum(
E               x=array([ 1.8446743e+19,  0.0000000e+00,  1.8446742e+19,  0.0000000e+00,
E                      -1.8446741e+19, -1.8446741e+19], dtype=float32),
E               data=data(...),
E           )
E           Draw 1 (kw): {'axis': None}

array_api_tests/test_statistical_functions.py:317: AssertionError
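
For reference, a standalone sketch of the float32 accumulation effect behind this failure, using the values from the falsifying example above (printed values are approximate):

import numpy as np

# The test suite compares np.sum's result against a Python-float sum of the same
# elements; at float32 precision the large terms cancel with extra rounding error,
# so the two totals disagree.
x = np.array([ 1.8446743e+19,  0.0000000e+00,  1.8446742e+19,  0.0000000e+00,
              -1.8446741e+19, -1.8446741e+19], dtype=np.float32)
print(np.sum(x))                 # float32 accumulation, roughly 2.2e+12
print(sum(float(v) for v in x))  # float64 sum of the same values, roughly 3.3e+12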

@asmeurer
Member

asmeurer commented Jul 31, 2024

The sum thing is indeed a known issue with the test suite. Here's a more accurate set of xfails for numpy: https://github.com/data-apis/array-api-compat/blob/main/numpy-dev-xfails.txt (that should also include cumulative_sum alongside sum and prod if you are enabling the 2023 tests).

The clip thing is correct. The standard says that clip should return the same dtype as the first argument. The text you quoted is irrelevant because there is no mixing of data type kinds in that example (everything is integer). See #24976 for more details.
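
To illustrate the dtype rule the standard asks for, here is a rough sketch of a wrapper that keeps the result dtype equal to x.dtype by casting the bounds first (a hypothetical helper, not NumPy's implementation and not necessarily how the follow-up PR will handle it):

import numpy as np

def clip_keep_dtype(x, min=None, max=None):
    # Cast the bounds to x.dtype up front so NumPy's promotion rules cannot widen
    # the result; the standard wants out.dtype == x.dtype. Note that this cast can
    # wrap bound values that do not fit in x.dtype, which the sketch ignores.
    if min is not None:
        min = np.asarray(min, dtype=x.dtype)
    if max is not None:
        max = np.asarray(max, dtype=x.dtype)
    return np.clip(x, min, max)

x = np.array(0, dtype=np.uint16)
print(clip_keep_dtype(x, min=0, max=np.uint32(0)).dtype)  # uint16, unlike the example above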

@mtsokol mtsokol force-pushed the array-api-ci-update branch from 9a3b3a0 to bdd3e92 Compare August 1, 2024 08:34
@mtsokol
Member Author

mtsokol commented Aug 1, 2024

Ok, then let's keep 100 for now as it's the default.

I added another xfail for the clip error, so the job passes now. I will address this undesired clip casting in a separate PR. The PR is ready from my side.

@ngoldbaum
Member

Maybe you should be using the skips file maintained by the array-api-tests repo like Aaron suggested? Since it captures issues discovered upstream.

@mtsokol
Member Author

mtsokol commented Aug 1, 2024

Maybe you should be using the skips file maintained by the array-api-tests repo like Aaron suggested? Since it captures issues discovered upstream.

In this PR I'm using the --xfails-file feature instead of --skips-file. This means the tests listed in tools/ci/array-api-xfails.txt are run and verified to fail. If a particular test passes, the test suite reports an XPASS error. Therefore I can't extend the xfails file with the array-api-tests counterpart, as we would xfail passing tests.

I think it's more accurate to use --xfails-file because we make sure that these tests do indeed fail, instead of just skipping them.
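
As an illustration of the distinction (a toy pytest example, not the actual array-api-tests machinery): a strict xfail fails the run when a test unexpectedly passes, whereas a skip never runs the test at all.

import pytest

@pytest.mark.xfail(strict=True, reason="known failure tracked in the xfails file")
def test_known_failure():
    # With strict=True, pytest turns an unexpected pass (XPASS) into a failure,
    # which is the signal that an entry can be dropped from the xfails list.
    assert 1 + 1 == 3

@pytest.mark.skip(reason="skipped tests are never exercised, so a fix would go unnoticed")
def test_skipped():
    assert 1 + 1 == 3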

@ngoldbaum
Member

Sorry, I mean the xfails file maintained upstream: https://github.com/data-apis/array-api-compat/blob/main/numpy-dev-xfails.txt

@mtsokol
Member Author

mtsokol commented Aug 1, 2024

I think this file is outdated; it contains tests that no longer exist. I've updated it here: data-apis/array-api-compat#171

@ngoldbaum
Member

If you can't just use that immediately, let's move to using the upstream file in a follow-up. Thanks @mtsokol!

@ngoldbaum ngoldbaum merged commit 8351d3f into numpy:main Aug 1, 2024
66 checks passed
@mtsokol mtsokol deleted the array-api-ci-update branch August 1, 2024 17:58
Labels
40 - array API standard PRs and issues related to support for the array API standard
Development

Successfully merging this pull request may close these issues.

BUG: astype() function does not work on scalars
3 participants