-
Notifications
You must be signed in to change notification settings - Fork 234
Support polygon-based subsetting in grdcut -F with GeoDataFrame #4135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Support polygon-based subsetting in grdcut -F with GeoDataFrame #4135
Conversation
|
@yvonnefroehlich I have added both of the Also, the label is |
weiji14
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kshitijaucharmal! Just a few comments for now. If you're able to add some unit tests to test_grdcut.py, that would be fantastic too!
- Checking for PathLike instead of str, bytes - Inverted logic and comments to make it clearer - Update gallery example: - removed "complex" from polygon desc. - improved docs - Switch CRS to OGC:CRS84
- Validated polygon input types, raising GMTValueError for unsupported types.
- Unit tests added for:
- Shapely Polygon input
- GeoDataFrame polygon input
- GMT ASCII polygon file input
- Crop (+c) and invert (+i) modifiers
- Invalid polygon input type to raise error
|
@weiji14 I have resolved all comments and added the unit tests as well, let me know if there are any more corrections! :) |
weiji14
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the unit tests! Here are some more comments, and apologies for the failing CI tests, we have some issues with DVC running on forks that we'll probably try to sort out next week.
pygmt/tests/test_grdcut.py
Outdated
| # Reusable polygon geometry | ||
| poly = Polygon( | ||
| [ | ||
| (-52.5, -19.5), | ||
| (-51.5, -19.0), | ||
| (-50.5, -18.5), | ||
| (-51.5, -18.0), | ||
| (-52.5, -19.5), | ||
| ] | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be turned into a pytest fixture, see examples a few lines below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will convert this into a fixture
| outgrid = grdcut(grid=grid, polygon=poly) | ||
| assert outgrid is not None | ||
| assert outgrid.rio.crs is None or str(outgrid.rio.crs) == "" # CRS not enforced | ||
| assert outgrid.size > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a more precise assertion(s)? Ideally, you want to check the exact pixel count, and maybe also check that there are NaN values in the four corners or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand, for the polygon each vertex will need to not be NaN
pygmt/src/grdcut.py
Outdated
| polygon : str, geopandas.GeoDataFrame, or shapely.geometry | ||
| Extract a subregion using a polygon. Can be either: | ||
| - A GMT ASCII polygon file (`.gmt`) | ||
| - A geopandas.GeoDataFrame (must have CRS EPSG:4326) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if that specific CRS is necessary, of if a CRS is needed actually. Other input types (ASCII file, shapely.geometry.Polygon) don't even have a CRS.
| Extract a subregion using a polygon. Can be either: | ||
| - A GMT ASCII polygon file (`.gmt`) | ||
| - A geopandas.GeoDataFrame (must have CRS EPSG:4326) | ||
| - A shapely.geometry.Polygon or MultiPolygon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a unit test to confirm that MultiPolygon does work?
Yeah I was wondering why all the test keep failing |
- Reusable polygon fixture for grdcut tests - Improve assertions to check pixel counts and presence of NaNs outside polygons - Add a test for MultiPolygon support
|
I do not quite understand the precise assertions part, is checking for some pixels inside polygon to not be NaN enough? I hope the new cases are correct or at least in the right direction. Thanks ! |
|
Hey @weiji14 are the CI/CD issues not fixed yet? Please let me know if I can help in any way, thanks ! |
We still have issues with CI, but you can just go ahead to improve this PR. We will run the tests locally as a temporary workaround. |
|
Kindly ping @kshitijaucharmal to see if you have time to continue this PR. The CI issue is likely fixed. |
|
Hey @seisman sorry for the delay, will be fixing the issues in the PR today as soon as I can |
|
Have made the changes requested, please check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made some suggestions, mainly to migrate the polgyon parameter to the new alias system
|
Have made the changes, the new alias system is much better honestly :) |
| if isinstance(polygon, PathLike) or hasattr( | ||
| polygon, "__geo_interface__" | ||
| ): | ||
| aliasdict["F"] = str(vinpoly) + aliasdict.get("F", "") | ||
| else: | ||
| polygon_type = type(polygon).__name__ | ||
| err_msg = "Invalid polygon type" | ||
| raise GMTValueError( | ||
| err_msg, | ||
| polygon_type, | ||
| "Must be a PathLike, GeoDataFrame, or Shapely geometry.", | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check can be done by the data_kind function, e.g.,
if polyon is not None and data_kind(polygon) not in ("file", "geojson"):
raise GMTTypeError(...)
Ideally we should raise the error as early as possible, so please put it at line 140.
| if polygon is not None: | ||
| if isinstance(polygon, PathLike) or hasattr( | ||
| polygon, "__geo_interface__" | ||
| ): | ||
| aliasdict["F"] = str(vinpoly) + aliasdict.get("F", "") | ||
| else: | ||
| polygon_type = type(polygon).__name__ | ||
| err_msg = "Invalid polygon type" | ||
| raise GMTValueError( | ||
| err_msg, | ||
| polygon_type, | ||
| "Must be a PathLike, GeoDataFrame, or Shapely geometry.", | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see my comment above. The check can be done earlier.
| if polygon is not None: | |
| if isinstance(polygon, PathLike) or hasattr( | |
| polygon, "__geo_interface__" | |
| ): | |
| aliasdict["F"] = str(vinpoly) + aliasdict.get("F", "") | |
| else: | |
| polygon_type = type(polygon).__name__ | |
| err_msg = "Invalid polygon type" | |
| raise GMTValueError( | |
| err_msg, | |
| polygon_type, | |
| "Must be a PathLike, GeoDataFrame, or Shapely geometry.", | |
| ) | |
| if polygon: | |
| aliasdict["F"] = str(vinpoly) + aliasdict.get("F", "") |
| build_arg_list, | ||
| data_kind, | ||
| fmt_docstring, | ||
| tempfile_from_geojson, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| tempfile_from_geojson, |
| outkind = kind | ||
| case _: | ||
| raise GMTTypeError(type(grid)) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if polygon and data_kind(polygon) not in ("file", "geojson"): | |
| raise GMTTypeError( | |
| type(polygon), | |
| reason="Must be a PathLike, GeoDataFrame, or Shapely geometry.", | |
| ) |
Description of proposed changes
This PR enhances the pygmt.grdcut function to support polygon-based subsetting of grids using the -F option. Users can now provide polygons in three different formats:
Additionally, two new boolean parameters,
cropandinvert, have been added to control how the polygon is applied:crop=True crops the output grid region to the bounding box of the polygon.invert=True inverts the selection, setting all nodes inside the polygon to NaN.A gallery example has been added to demonstrate all three polygon input methods and the effects of the crop and invert options, showing original, cropped, inverted, and cropped+inverted grids. This improves PyGMT’s functionality for extracting regions of interest from raster grids and makes the workflow more Pythonic and flexible.
Fixes #1529 (implements rather)
Preview: https://pygmt-dev--4135.org.readthedocs.build/en/4135/gallery/images/grdpolygon.html
Users can crop grids using polygons without manually creating ASCII files.
Example figure showing original grid and cropped and inverted grids
Reminders
make formatandmake checkto make sure the code follows the style guide.doc/api/index.rst.