dataextractor update for numpy 2.0 #98

gbolzon · 2025-12-05T13:26:32Z

No description provided.

Copilot

Pull request overview

This PR aims to update the codebase for numpy 2.0 compatibility, focusing on the DataExtractor class and related code in the layer integral module. However, the changes introduce critical bugs that will break existing functionality.

Key changes:

Simplified fill value detection in DataExtractor using get_fill_value() method
Refactored LayerMap class to inherit from Layer class
Code formatting improvements and reorganization of imports in mapbuilder.py

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File	Description
src/bitsea/commons/dataextractor.py	Simplified fill value detection to use netCDF4's `get_fill_value()` method with fallback to `_FillValue` attribute
src/bitsea/commons/layer.py	Refactored LayerMap to inherit from Layer, attempting to reduce code duplication
src/bitsea/layer_integral/mapbuilder.py	Code formatting improvements, reorganized imports, updated type checks, and mixed usage of np.array/np.asarray for numpy 2.0 compatibility


Copilot · 2025-12-09T17:55:21Z

src/bitsea/commons/layer.py

+class LayerMap(Layer):
+    def __init__(self, mask, top: Real, bottom: Real):
+        super().__init__(top, bottom)
+        self.__mask = mask

    @property
    def mask(self):
        return self.__mask

    @property
    def dimension(self):
-        return self.__dim
+        return self.__mask.shape[1:]


The refactored LayerMap class now inherits from Layer, but this introduces a breaking change. The original LayerMap stored top and bottom as 2D numpy arrays (with shape matching mask.shape[1:]), while Layer expects scalar Real values.

The code in mapbuilder.py (lines 281, 284, 337, 343, etc.) accesses layer.top[jj, ii] and layer.bottom[jj, ii], expecting arrays. With the new implementation, layer.top and layer.bottom will be scalars (stored in Layer.__top and Layer.__bottom), causing this code to fail with indexing errors.

The original LayerMap.__init__ validated that top and bottom were arrays with the correct shape and ensured np.all(top <= bottom). The new implementation bypasses this validation since it calls super().__init__(top, bottom) which expects scalars.
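The failure mode described above can be reproduced with a minimal sketch. The `Layer` and `LayerMap` classes below are simplified stand-ins, not the bitsea implementations; the `top <= bottom` check inside `Layer.__init__` and the helper `can_index` are assumptions made purely for illustration.

```python
from numbers import Real

import numpy as np


class Layer:
    """Simplified stand-in: a layer bounded by two scalar depths."""

    def __init__(self, top: Real, bottom: Real):
        # Assumed validation; the real class may check this differently.
        if not top <= bottom:
            raise ValueError("top must not be below bottom")
        self._top = top
        self._bottom = bottom

    @property
    def top(self):
        return self._top

    @property
    def bottom(self):
        return self._bottom


class LayerMap(Layer):
    """Stand-in mirroring the shape of the refactored class in this PR."""

    def __init__(self, mask, top: Real, bottom: Real):
        super().__init__(top, bottom)
        self._mask = mask


def can_index(layer, jj, ii):
    """Return True if layer.top[jj, ii] works, as mapbuilder.py expects."""
    try:
        layer.top[jj, ii]
    except (TypeError, IndexError):
        return False
    return True


mask = np.ones((3, 2, 2), dtype=bool)

# Scalar bounds are accepted, but per-cell indexing no longer works.
scalar_layer = LayerMap(mask, 0.0, 10.0)
scalar_indexable = can_index(scalar_layer, 0, 0)

# Array bounds trip the scalar comparison in the stand-in Layer.__init__.
try:
    LayerMap(mask, np.zeros((2, 2)), np.full((2, 2), 10.0))
    arrays_accepted = True
except ValueError:
    arrays_accepted = False
```

Under these assumptions neither calling convention works: scalars break the `[jj, ii]` indexing, and arrays fail the scalar validation, so the array-aware validation would need to stay in `LayerMap.__init__`.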

Copilot · 2025-12-09T17:55:22Z

src/bitsea/layer_integral/mapbuilder.py

+        if not isinstance(layer, Layer):
            raise ValueError("layer must be a Layer object")


The type check has been changed from isinstance(layer, (Layer, LayerMap)) to isinstance(layer, Layer). Because LayerMap now inherits from Layer, the new check still accepts LayerMap instances. However, the Layer branch that follows (lines 384-426) is written for scalar top and bottom values, not for the array values a LayerMap carries.

The order of checks matters here: the code tests for LayerMap first (line 330) and only then for Layer (line 384), so a LayerMap instance is handled by its own branch even though it also satisfies isinstance(layer, Layer). The initial validation at line 323 is nevertheless misleading, because its error message suggests that only Layer (not LayerMap) is accepted, while the function actually handles both types.

Suggested change

-        if not isinstance(layer, Layer):
-            raise ValueError("layer must be a Layer object")
+        if not isinstance(layer, (Layer, LayerMap)):
+            raise ValueError("layer must be a Layer or LayerMap object")
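The ordering concern in this comment can be illustrated with a small sketch. The empty `Layer` and `LayerMap` classes and the two `describe` functions are hypothetical stand-ins, not code from mapbuilder.py; they only show how `isinstance` dispatch behaves once one class subclasses the other.

```python
class Layer:
    """Hypothetical stand-in for the scalar-bounds class."""


class LayerMap(Layer):
    """Hypothetical stand-in for the array-bounds subclass."""


def describe(layer):
    """Dispatch with the subclass tested first."""
    if not isinstance(layer, Layer):
        raise ValueError("layer must be a Layer or LayerMap object")
    if isinstance(layer, LayerMap):
        return "per-cell bounds"
    return "scalar bounds"


def describe_wrong_order(layer):
    """Dispatch with the base class tested first: LayerMap is shadowed."""
    if isinstance(layer, Layer):
        return "scalar bounds"
    if isinstance(layer, LayerMap):
        return "per-cell bounds"  # never reached for any instance
    raise ValueError("layer must be a Layer or LayerMap object")


right = describe(LayerMap())
wrong = describe_wrong_order(LayerMap())
```

With inheritance in place, the single `isinstance(layer, Layer)` guard is sufficient as a type check, so the remaining value of the suggested tuple form is mostly in the clearer error message.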

Copilot · 2025-12-09T17:55:22Z

src/bitsea/commons/dataextractor.py

+            try:
+                self.__dset_fillvalue = dset.variables[v].get_fill_value()
+            except AttributeError:
+                self.__dset_fillvalue = getattr(dset.variables[v], "_FillValue")


The fill value detection logic has been simplified to use get_fill_value() with a fallback to _FillValue attribute. However, this may not handle all the cases that the original code covered.

The original code checked multiple attribute names: "missing_value", "fillvalue", "fillValue", "FillValue". The new approach relies on get_fill_value() alone, which according to the netCDF4 documentation returns the fill value associated with the variable (its _FillValue, or the default fill value for its data type) rather than searching alternative attribute names, and falls back to the _FillValue attribute only when get_fill_value() is not available.

This could miss cases where the fill value is stored under alternative attribute names like "fillvalue" (lowercase) or "fillValue" (camelCase), which may exist in some NetCDF files for backwards compatibility.

Suggested change

-                self.__dset_fillvalue = getattr(dset.variables[v], "_FillValue")
+                # Check for all common fill value attribute names
+                for attr_name in [
+                    "_FillValue",
+                    "missing_value",
+                    "fillvalue",
+                    "fillValue",
+                    "FillValue",
+                ]:
+                    if hasattr(dset.variables[v], attr_name):
+                        self.__dset_fillvalue = getattr(dset.variables[v], attr_name)
+                        break
+                else:
+                    # If none found, fallback to np.nan
+                    self.__dset_fillvalue = np.nan
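The attribute search in the suggestion can also be written as a small standalone helper, which makes the lookup order easy to test. This is only a sketch: `find_fill_value` and `FILL_ATTRIBUTE_NAMES` are hypothetical names, and `SimpleNamespace` objects stand in for netCDF variables so the example runs without a dataset.

```python
from types import SimpleNamespace

import numpy as np

# Attribute names taken from the review comment, in lookup order.
FILL_ATTRIBUTE_NAMES = (
    "_FillValue",
    "missing_value",
    "fillvalue",
    "fillValue",
    "FillValue",
)


def find_fill_value(variable, default=np.nan):
    """Return the first fill-value attribute found on variable, else default."""
    for name in FILL_ATTRIBUTE_NAMES:
        if hasattr(variable, name):
            return getattr(variable, name)
    return default


# Stand-ins for netCDF variables with different attribute spellings.
with_standard = SimpleNamespace(_FillValue=1.0e20, missing_value=-999.0)
with_legacy = SimpleNamespace(fillValue=-999.0)
without_any = SimpleNamespace(units="m")

standard_fill = find_fill_value(with_standard)
legacy_fill = find_fill_value(with_legacy)
missing_fill = find_fill_value(without_any)
```

Returning NaN when nothing is found is the reviewer's proposed default; whether a silent default is preferable to raising an error depends on how DataExtractor uses the value afterwards.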

Copilot · 2025-12-09T17:55:22Z

src/bitsea/layer_integral/mapbuilder.py

+                dzm[j, :, :] = data_extractor._mask.dz[i]
+                j += 1
+            # Get the slice of the values
+            v = np.array(data_extractor.values[top_index:bottom_index, :, :])


Inconsistent use of np.array() vs np.asarray() within the same function. Line 369 uses np.asarray() for the LayerMap branch, while line 415 uses np.array() for the Layer branch when extracting values from the same source (data_extractor.values).

For consistency and to ensure proper numpy 2.0 compatibility, the same array conversion function should be used throughout. np.asarray() is generally preferred as it avoids unnecessary copies when the input is already an array.

Suggested change

-            v = np.array(data_extractor.values[top_index:bottom_index, :, :])
+            v = np.asarray(data_extractor.values[top_index:bottom_index, :, :])
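The difference between the two calls is more than stylistic, as the following sketch shows. It assumes the sliced values are a plain `numpy.ndarray`; the actual type of `data_extractor.values` is not shown in this PR, and the names `values`, `window`, `copied` and `aliased` are illustrative only.

```python
import numpy as np

values = np.arange(24, dtype=float).reshape(4, 3, 2)
window = values[1:3, :, :]  # basic slicing returns a view, not a copy

copied = np.array(window)     # np.array copies by default
aliased = np.asarray(window)  # no copy when the input is already an ndarray

copied_shares = np.shares_memory(copied, values)
aliased_shares = np.shares_memory(aliased, values)

# Writing to the np.asarray result writes through to the source array.
aliased[0, 0, 0] = -1.0
```

If the code after this line modifies `v` in place, switching to `np.asarray()` would also modify the extractor's data, so the copy made by `np.array()` may be intentional.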

Copilot · 2025-12-09T17:55:23Z

src/bitsea/layer_integral/mapbuilder.py

+            #        for f in self.__netcdffileslist:
+            #            for p in self.__plotlist:


This comment appears to contain commented-out code.

Suggested change

-            #        for f in self.__netcdffileslist:
-            #            for p in self.__plotlist:

Copilot · 2025-12-09T17:55:23Z

src/bitsea/commons/layer.py

-    @property
-    def bottom(self):
-        return self.__bottom
+class LayerMap(Layer):


The class 'LayerMap' does not override '__eq__', but adds the new attribute __mask.
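A short sketch of why this warning matters, assuming the base class compares only its bounds. All classes here are hypothetical stand-ins, and a string `mask_id` replaces the real mask array to keep the comparison simple (real mask arrays would need something like `np.array_equal`).

```python
class Layer:
    """Stand-in base class whose equality looks only at the bounds."""

    def __init__(self, top, bottom):
        self._top = top
        self._bottom = bottom

    def __eq__(self, other):
        if not isinstance(other, Layer):
            return NotImplemented
        return (self._top, self._bottom) == (other._top, other._bottom)

    def __hash__(self):
        return hash((self._top, self._bottom))


class LayerMap(Layer):
    """Adds an attribute but inherits __eq__, so the attribute is ignored."""

    def __init__(self, mask_id, top, bottom):
        super().__init__(top, bottom)
        self._mask_id = mask_id


class LayerMapWithEq(LayerMap):
    """Same data, with equality extended to cover the new attribute."""

    def __eq__(self, other):
        if not isinstance(other, LayerMapWithEq):
            return NotImplemented
        return super().__eq__(other) and self._mask_id == other._mask_id

    # Defining __eq__ disables the inherited __hash__, so restore it.
    def __hash__(self):
        return hash((self._top, self._bottom, self._mask_id))


inherited_equal = LayerMap("a", 0, 10) == LayerMap("b", 0, 10)
overridden_equal = LayerMapWithEq("a", 0, 10) == LayerMapWithEq("b", 0, 10)
```

With the inherited comparison, two layer maps with different masks but identical bounds compare equal, which is the situation the warning flags.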

spiani added 3 commits December 5, 2025 12:12

LayerMap is now a subclass of Layer

ef4c313

Find fill_value with dedicated method

e49f5af

Add fallback method for get_fill_value

643c7f4

gbolzon self-assigned this Dec 5, 2025

spiani requested a review from Copilot December 9, 2025 17:50

Copilot started reviewing on behalf of spiani December 9, 2025 17:51
