Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

data_kind: Add more tests to demonstrate the data kind of various data types #3480

Merged
merged 6 commits into from
Oct 3, 2024

Conversation

seisman
Copy link
Member

@seisman seisman commented Oct 2, 2024

Description of proposed changes

Just add more tests to help us understand the current behavior of the data_kind function.

As can be seen, many data types are recognized as "matrix" and data=None is recognized as "vectors". Sometimes, the result data kind is counterintuitive, as first mentioned in #3351, #3369 and #2744.

This PR only focuses on adding tests. Will refactor the data_kind function in separate PRs.

Please let me if there are other data types that should be tested.

@seisman seisman changed the title data_kind: Add more doctest to demonstrate the data kind of various data types data_kind: Add more tests to demonstrate the data kind of various data types Oct 2, 2024
@seisman seisman force-pushed the data_kind/doctest branch 3 times, most recently from 8021279 to 649469a Compare October 2, 2024 11:44
@seisman seisman force-pushed the data_kind/doctest branch from 649469a to 10a7333 Compare October 2, 2024 11:49
@seisman seisman marked this pull request as ready for review October 2, 2024 11:53
@seisman seisman added maintenance Boring but important stuff for the core devs skip-changelog Skip adding Pull Request to changelog needs review This PR has higher priority and needs review. labels Oct 2, 2024
@seisman seisman added this to the 0.14.0 milestone Oct 2, 2024
pygmt/helpers/utils.py Outdated Show resolved Hide resolved
pygmt/helpers/utils.py Outdated Show resolved Hide resolved
Comment on lines +249 to 252
>>> data_kind(data=xr.DataArray(np.arange(12))) # 1-D xarray.DataArray
'grid'
>>> data_kind(data=xr.DataArray(np.random.rand(2, 3, 4, 5))) # 4-D xarray.DataArray
'grid'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Erm, do we need tests to check that 1-D and 4-D xarray.DataArray are counted as grid?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not neccessary. These doctests are added to understand the current behavior of the data_kind function.

I feel 1-D xarray.DataArray should be considered as a vector and 4-D xarray.DataArray should not be recognized and we should raise an exception. Will refactor the data_kind function in a separate PR.

Copy link
Member

@weiji14 weiji14 Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel 1-D xarray.DataArray should be considered as a vector

We already recognize an xr.Dataset composed of 1-D xarray.DataArrays as a matrix (can be treated similarly as multiple 'columns' in a Dataframe). I don't think we need to expand this logic, since they users can call .data on a 1-D xarray.DataArray to get a numpy.ndarray that can then be passed to the data parameter.

4-D xarray.DataArray should not be recognized and we should raise an exception

Logic could be something like if dataarray.ndim() >= 4: raise Error, but I don't think this is really necessary since an error would be raised by the PyGMT module anyway right?

pygmt/tests/test_grdtrack.py Outdated Show resolved Hide resolved
pygmt/tests/test_blockm.py Show resolved Hide resolved
Copy link
Member

@weiji14 weiji14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-approving since most of this looks good. There's some discussion about whether to check 1-D/4-D xarray.DataArrays at #3480 (comment), but not very critical and can be handled later with the actual code refactor.

@seisman seisman merged commit 68a17a0 into main Oct 3, 2024
5 checks passed
@seisman seisman deleted the data_kind/doctest branch October 3, 2024 06:03
@seisman seisman removed the needs review This PR has higher priority and needs review. label Oct 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Boring but important stuff for the core devs skip-changelog Skip adding Pull Request to changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants