Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clib.conversion: Remove the array_to_datetime function, which is no longer needed #3507
base: main
Are you sure you want to change the base?
clib.conversion: Remove the array_to_datetime function, which is no longer needed #3507
Changes from 2 commits
3661e54
20b9215
83673cf
56a0841
6338cde
1864556
54160bf
f4e1a5f
ed3be20
8ab6c6c
e26afbf
be3c93e
3d40687
8f99a97
5ecf445
cd75c5c
91e10a7
47214e5
d8777e5
3286f08
38b839a
76cb837
9573fb6
bccae94
d64c795
a410357
53288dc
765d6c3
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pd.api.types.is_string_dtype
was added in PR #684. Before PR #684, we usednp.issubdtype(array.dtype, np.str_)
, which was added in #520. Any specific reason that we should usenp.issubdtype(array.dtype, np.str_)
, notarray.dtype.type == np.str_
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'll need to check if this can handle
pandas.StringDtype
andpyarrow.StringArray
(xref #2933).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both can be converted to the numpy string dtype by the
vectors_to_arrays
method, so invirtualfile_from_vectors
, we only need to checknp.str_
:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main idea of this PR is to let
vectors_to_arrays
do the conversion work from any dtypes (includingpd.StringDtype
andpa.StringArray
) into the numpy dtypes, so that thevirtualfile_from_vectors
only need to care about how to map numpy dtypes into GMT data types.For any special dtypes that we know how to convert it to numpy dtype, we can maintain a mapping dictionary, just like what you did to support pyarrow's date32[day] and date64[ms] in #2845:
pygmt/pygmt/clib/conversion.py
Lines 208 to 211 in c2e429c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 83673cf, I've moved most of the doctests into a separate test file
pygmt/tests/test_clib_vectors_to_arrays.py
. A testtest_vectors_to_arrays_pandas_string
is added to check that the function can handlepd.StringDtype
correctly.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the tests below, I think we should add the entry
"string": np.str_
to the dictionary:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In PR #2774 and #2774, we only checked if PyGMT supports pandas with the pyarrow backend, but didn't check if the original pyarrow arrays works. For example, for a pyarrow
date32
array, we need to checkarray.type
rather thanarray.dtype
:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, raw
pyarrow
arrays ofdate32/date64
type are not supported yet. That's why it's marked 🚧 in #2800 (I was planning to modifyarray_to_datetime
to handle it).