-
Notifications
You must be signed in to change notification settings - Fork 3
Added class MaskWithRivers #88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces the MaskWithRivers class, which extends the base Mask class to handle river cells separately from sea cells. The implementation stores river cell positions in sparse arrays and provides a get_water_cells() method that returns all water cells (both sea and rivers combined). Additionally, the PR includes several grammar corrections in existing docstrings and comments.
- Adds
MaskWithRiversclass with river cell tracking using sparse arrays - Implements
from_file()andfrom_file_pointer()class methods for loading from NetCDF files - Includes grammar/spelling corrections in existing documentation
- Adds test for loading and verifying
MaskWithRiversfunctionality
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
src/bitsea/commons/mask.py |
Adds MaskWithRivers class implementation with river cell handling; corrects grammar in comments ("flatten" → "flattened", "takes in account" → "takes into account", etc.) |
tests/commons/test_mask.py |
Adds test for MaskWithRivers.from_file() that verifies river cells are properly separated from sea cells and can be recombined |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def test_mask_with_rivers_from_file(test_data_dir): | ||
| mask_dir = test_data_dir / "masks" | ||
| mask_file = mask_dir / "mask_with_rivers.nc" | ||
|
|
||
| rivers_mask = MaskWithRivers.from_file(mask_file) | ||
| water_cells = rivers_mask.get_water_cells() | ||
|
|
||
| with netCDF4.Dataset(mask_file, "r") as ds: | ||
| original_mask = np.asarray(ds.variables["tmask"]) > 0.5 | ||
| rivers = np.asarray(ds.variables["rivers"]) | ||
|
|
||
| assert np.all(original_mask == water_cells) | ||
|
|
||
| # Check that in the current mask every cell assigned to a river is set | ||
| # to "False" | ||
| for i, j in zip(*np.where(rivers)): | ||
| assert not np.any(rivers_mask[:, i, j]) | ||
|
|
||
| # Assert that if a value inside water_cells is True, then it is also True | ||
| # inside rivers_mask (rivers_mask is contained inside water_cells) | ||
| assert np.all( | ||
| np.logical_or(np.logical_and(rivers_mask, water_cells), ~rivers_mask) | ||
| ) |
Copilot
AI
Dec 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no test coverage for the copy() method with MaskWithRivers instances. Since the base Mask class has a test_mask_copy test, and given that MaskWithRivers has additional state (_river_cells) that needs to be properly copied, a test should be added to verify that copying a MaskWithRivers instance works correctly and preserves the river information.
| if not found_cells: | ||
| break |
Copilot
AI
Dec 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The early break when no river cells are found at a depth level assumes that rivers don't exist at deeper levels if they don't exist at a shallower level. This assumption may not hold if different rivers have different depths or if there are gaps in the vertical distribution of river cells. Consider processing all depth levels instead of breaking early, or document this assumption clearly if it's intentional.
| if not found_cells: | |
| break |
| assert np.allclose(xarray_mask.depth, mask.zlevels) | ||
| assert np.allclose(xarray_mask.tmask, mask) | ||
|
|
||
|
|
Copilot
AI
Dec 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test function is missing the @pytest.mark.uses_test_data decorator. This decorator is needed for all tests that use the test_data_dir fixture (as shown in other tests like test_mask_from_file and test_mask_from_file_when_regular). Without this marker, the test will fail rather than be skipped when test data is unavailable.
| @pytest.mark.uses_test_data |
| return float(output.squeeze().item()) | ||
|
|
||
|
|
||
| class MaskWithRivers(Mask): |
Copilot
AI
Dec 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The MaskWithRivers class is missing a docstring. As a public class that extends Mask, it should have documentation explaining its purpose, parameters, and how it differs from the base Mask class (specifically how it handles river cells separately from sea cells).
| class MaskWithRivers(Mask): | |
| class MaskWithRivers(Mask): | |
| """ | |
| A subclass of Mask that distinguishes river cells from sea and land cells. | |
| This class extends the base Mask functionality by allowing certain cells to be | |
| marked as rivers, in addition to the standard sea and land classification. | |
| River cells are handled separately from sea cells, which is useful for models | |
| or analyses that require explicit treatment of riverine regions. | |
| Parameters | |
| ---------- | |
| grid : Grid | |
| The grid object describing the spatial layout of the mask. | |
| zlevels : ArrayLike | |
| The vertical levels of the grid. | |
| mask_array : ArrayLike | |
| The mask array indicating sea (typically 1) and land (typically 0) cells. | |
| river_positions : ArrayLike | |
| An array indicating the positions of river cells within the grid. | |
| allow_broadcast : bool, optional | |
| Whether to allow broadcasting of the mask array to match the grid shape. | |
| e3t : Optional[np.ndarray], optional | |
| Optional array of vertical cell thicknesses. | |
| Notes | |
| ----- | |
| Unlike the base Mask class, MaskWithRivers treats river cells as a separate | |
| category from both sea and land. This allows for specialized handling of | |
| riverine processes in downstream applications. | |
| """ |
| river_positions: ArrayLike, | ||
| allow_broadcast: bool = False, | ||
| e3t: Optional[np.ndarray] = None, | ||
| ): |
Copilot
AI
Dec 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The init method lacks a docstring documenting its parameters. Given the complexity of this constructor and the specific role of the river_positions parameter, documentation is essential for users to understand how to properly instantiate this class.
| ): | |
| ): | |
| """ | |
| Initialize a MaskWithRivers object. | |
| Parameters | |
| ---------- | |
| grid : Grid | |
| The grid object defining the spatial domain. | |
| zlevels : ArrayLike | |
| The vertical levels of the mask (e.g., depth layers). | |
| mask_array : ArrayLike | |
| A boolean array indicating the presence of water (True) or land (False) | |
| at each grid cell and depth level. Shape should be (len(zlevels), grid.shape[0], grid.shape[1]). | |
| river_positions : ArrayLike | |
| An array or sparse matrix indicating the positions of river cells in the grid. | |
| Typically, this should be a 2D array (grid.shape[0], grid.shape[1]) where nonzero entries | |
| indicate river cells and their values may represent river indices or IDs. | |
| allow_broadcast : bool, optional | |
| If True, allows broadcasting of mask_array to match the required shape. Default is False. | |
| e3t : Optional[np.ndarray], optional | |
| Optional array specifying the thickness of each vertical layer. | |
| Notes | |
| ----- | |
| River cells are removed from the mask and stored separately for further processing. | |
| """ |
| class MaskWithRivers(Mask): | ||
| def __init__( | ||
| self, | ||
| grid: Grid, | ||
| zlevels: ArrayLike, | ||
| mask_array: ArrayLike, | ||
| river_positions: ArrayLike, | ||
| allow_broadcast: bool = False, | ||
| e3t: Optional[np.ndarray] = None, | ||
| ): | ||
| rivers = np.asarray(river_positions) | ||
|
|
||
| mask_dim = len(zlevels), grid.shape[0], grid.shape[1] | ||
|
|
||
| try: | ||
| mask_data = np.asarray(mask_array, dtype=bool, copy=False) | ||
| input_copied = False | ||
| except TypeError: | ||
| # Old versions of Numpy do not support the "copy=False" syntax; | ||
| # In this case, we call the function without specifying the value | ||
| # of the "copy" argument, and we must assume that the data has not | ||
| # been copied (for safety) | ||
| mask_data = np.asarray(mask_array, dtype=bool) | ||
| input_copied = False | ||
| except ValueError: | ||
| # In this case, it has been impossible to avoid the copy. We run | ||
| # again the same command, but this time we force the copy | ||
| mask_data = np.asarray(mask_array, dtype=bool) | ||
| input_copied = True | ||
|
|
||
| # If mask_data is too small to cover all the mask, we try to broadcast | ||
| # it. This enforces a copy | ||
| if allow_broadcast and mask_data.shape != mask_dim: | ||
| mask_data = np.copy(np.broadcast_to(mask_array, mask_dim)) | ||
| input_copied = True | ||
|
|
||
| # Ensure that mask_array is writeable | ||
| if not input_copied: | ||
| mask_data = np.copy(mask_data) | ||
|
|
||
| # Here we save the cells that belong to each river into a list of | ||
| # sparse arrays. We must use a list of 2D array because there are no | ||
| # 3D sparse arrays implemented inside scipy. | ||
| self._river_cells = [] | ||
| rivers = dok_array(rivers) | ||
| for depth_index in range(mask_data.shape[0]): | ||
| found_cells = False | ||
| current_river_cells = dok_array(mask_data.shape[1:], dtype=int) | ||
| for (i, j), value in rivers.items(): | ||
| if not mask_data[depth_index, i, j]: | ||
| continue | ||
| current_river_cells[i, j] = value | ||
| found_cells = True | ||
|
|
||
| if not found_cells: | ||
| break | ||
| self._river_cells.append(current_river_cells) | ||
|
|
||
| # Remove rivers from mask_array | ||
| for i, j in zip(*rivers.nonzero()): | ||
| mask_data[:, i, j] = False | ||
|
|
||
| super().__init__( | ||
| grid=grid, | ||
| zlevels=zlevels, | ||
| mask_array=mask_data, | ||
| allow_broadcast=allow_broadcast, | ||
| e3t=e3t, | ||
| ) |
Copilot
AI
Dec 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The MaskWithRivers class should override the copy() method to properly handle the _river_cells attribute. The parent class's copy() method (which uses self.class) will fail because it doesn't pass the required river_positions parameter. This will cause a TypeError when attempting to copy a MaskWithRivers instance.
| def get_water_cells(self) -> np.ndarray: | ||
| output_data = np.copy(self[:]) | ||
| for depth_index, river_cells in enumerate(self._river_cells): | ||
| lat_indices, lon_indices = river_cells.nonzero() | ||
| output_data[depth_index, lat_indices, lon_indices] = True | ||
| return output_data |
Copilot
AI
Dec 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The get_water_cells method lacks a docstring. This is particularly important because this method's behavior differs from the parent class's implementation - it returns the combined mask of both sea and river cells, which is critical for users to understand.
| grid=grid, | ||
| zlevels=zlevels, | ||
| mask_array=mask_data, | ||
| allow_broadcast=allow_broadcast, |
Copilot
AI
Dec 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The allow_broadcast parameter is passed to the parent's init method, but at this point the mask_data has already been processed and potentially broadcasted. This could lead to inconsistent behavior because the parent class won't perform any broadcasting (since mask_data already has the correct shape). Consider setting allow_broadcast=False when calling super().init() to make the intent clearer.
| allow_broadcast=allow_broadcast, | |
| allow_broadcast=False, |
This introduces a new class, named MaskWithRivers, that behaves like a mask but also knows which cells belong to the rivers.