Commit f776190

max-sixty and Claude committed
feat: Preserve attributes by default in all operations
This change modifies xarray's default behavior to preserve attributes (`attrs`) across all operations, including computational, binary, and data manipulation functions. Previously, attributes were dropped by default unless `keep_attrs=True` was explicitly set. This new default aligns xarray with common scientific workflows where metadata preservation is crucial.

The `keep_attrs` option now defaults to `True` for most operations. For binary operations, attributes are preserved from the left-hand operand.

Users can revert to the previous behavior (dropping attributes) by:

- Setting `xr.set_options(keep_attrs=False)` globally.
- Passing `keep_attrs=False` to specific operations.
- Using `with xr.set_options(keep_attrs=False):` for a code block.
- Calling `.drop_attrs()` after an operation.

This is a breaking change for users who relied on attributes being dropped by default.

Co-authored-by: Claude <[email protected]>
1 parent 40c27d1 commit f776190

18 files changed: +341 -128 lines changed

doc/whats-new.rst

Lines changed: 84 additions & 0 deletions
@@ -17,6 +17,90 @@ New Features
 Breaking changes
 ~~~~~~~~~~~~~~~~

+- **All xarray operations now preserve attributes by default** (:issue:`3891`, :issue:`2582`, :issue:`7012`).
+  Previously, operations would drop attributes unless explicitly told to preserve them via ``keep_attrs=True``.
+  This aligns xarray with the common scientific workflow where metadata preservation is essential.
+
+  **What changed:**
+
+  .. code-block:: python
+
+      # Before (xarray <2025.09.1):
+      data = xr.DataArray([1, 2, 3], attrs={"units": "meters", "long_name": "height"})
+      result = data.mean()
+      result.attrs  # {} - Attributes lost!
+
+      # After (xarray ≥2025.09.1):
+      data = xr.DataArray([1, 2, 3], attrs={"units": "meters", "long_name": "height"})
+      result = data.mean()
+      result.attrs  # {"units": "meters", "long_name": "height"} - Attributes preserved!
+
+  **Affected operations include:**
+
+  *Computational operations:*
+
+  - Reductions: ``mean()``, ``sum()``, ``std()``, ``var()``, ``min()``, ``max()``, ``median()``, ``quantile()``, etc.
+  - Rolling windows: ``rolling().mean()``, ``rolling().sum()``, etc.
+  - Groupby: ``groupby().mean()``, ``groupby().sum()``, etc.
+  - Resampling: ``resample().mean()``, etc.
+  - Weighted: ``weighted().mean()``, ``weighted().sum()``, etc.
+  - ``apply_ufunc()`` and NumPy universal functions
+
+  *Binary operations:*
+
+  - Arithmetic: ``+``, ``-``, ``*``, ``/``, ``**``, ``//``, ``%`` (attributes from left operand)
+  - Comparisons: ``<``, ``>``, ``==``, ``!=``, ``<=``, ``>=`` (attributes from left operand)
+  - With scalars: ``data * 2``, ``10 - data`` (preserves data's attributes)
+
+  *Data manipulation:*
+
+  - Missing data: ``fillna()``, ``dropna()``, ``interpolate_na()``, ``ffill()``, ``bfill()``
+  - Indexing/selection: ``isel()``, ``sel()``, ``where()``, ``clip()``
+  - Alignment: ``interp()``, ``reindex()``, ``align()``
+  - Transformations: ``map()``, ``pipe()``, ``assign()``, ``assign_coords()``
+  - Shape operations: ``expand_dims()``, ``squeeze()``, ``transpose()``, ``stack()``, ``unstack()``
+
+  **Binary operations - attributes from left operand:**
+
+  .. code-block:: python
+
+      a = xr.DataArray([1, 2], attrs={"source": "sensor_a"})
+      b = xr.DataArray([3, 4], attrs={"source": "sensor_b"})
+      (a + b).attrs  # {"source": "sensor_a"} - Left operand wins
+      (b + a).attrs  # {"source": "sensor_b"} - Order matters!
+
+  **How to restore previous behavior:**
+
+  1. **Globally for your entire script:**
+
+     .. code-block:: python
+
+         import xarray as xr
+
+         xr.set_options(keep_attrs=False)  # Affects all subsequent operations
+
+  2. **For specific operations:**
+
+     .. code-block:: python
+
+         result = data.mean(dim="time", keep_attrs=False)
+
+  3. **For code blocks:**
+
+     .. code-block:: python
+
+         with xr.set_options(keep_attrs=False):
+             # All operations in this block drop attrs
+             result = data1 + data2
+
+  4. **Remove attributes after operations:**
+
+     .. code-block:: python
+
+         result = data.mean().drop_attrs()
+
+  By `Maximilian Roos <https://github.com/max-sixty>`_.
+
 - :py:meth:`Dataset.update` now returns ``None``, instead of the updated dataset. This
   completes the deprecation cycle started in version 0.17. The method still updates the
   dataset in-place. (:issue:`10167`)
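
The scoped override in item 3 of the changelog entry depends on the option being restored when the `with` block exits. The sketch below is a simplified, standard-library stand-in for that mechanism, not xarray's own code: `OPTIONS`, `set_keep_attrs`, and `resolve_keep_attrs` are hypothetical names chosen for illustration.

```python
from contextlib import contextmanager

# Hypothetical stand-in for a global options registry.
# "default" means "let each operation choose its own default".
OPTIONS = {"keep_attrs": "default"}


@contextmanager
def set_keep_attrs(value):
    """Temporarily override keep_attrs and restore the old value on exit."""
    previous = OPTIONS["keep_attrs"]
    OPTIONS["keep_attrs"] = value
    try:
        yield
    finally:
        # Runs even if the block raises, so the override never leaks out.
        OPTIONS["keep_attrs"] = previous


def resolve_keep_attrs(default):
    """Return the global choice, or the per-operation default if unset."""
    choice = OPTIONS["keep_attrs"]
    return default if choice == "default" else choice


outside_before = resolve_keep_attrs(default=True)
with set_keep_attrs(False):
    inside = resolve_keep_attrs(default=True)
outside_after = resolve_keep_attrs(default=True)
```

In this model the per-operation default applies outside the block, while inside the block the explicit `False` takes precedence.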

xarray/computation/apply_ufunc.py

Lines changed: 1 addition & 1 deletion
@@ -1214,7 +1214,7 @@ def apply_ufunc(
         func = functools.partial(func, **kwargs)

     if keep_attrs is None:
-        keep_attrs = _get_keep_attrs(default=False)
+        keep_attrs = _get_keep_attrs(default=True)

     if isinstance(keep_attrs, bool):
         keep_attrs = "override" if keep_attrs else "drop"
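
The hunk above resolves `keep_attrs` in two steps: an unset argument falls back to the global option or the per-operation default, and a boolean is then translated into a strategy string. A minimal sketch of that precedence follows, assuming the global option is passed in as a plain argument; `normalize_keep_attrs` and its `global_option` parameter are hypothetical and exist only for this illustration.

```python
def normalize_keep_attrs(keep_attrs=None, global_option="default", default=True):
    """Resolve keep_attrs to a strategy string (illustrative model only).

    Precedence: explicit argument, then global option, then the
    per-operation default (now True in this commit).
    """
    if keep_attrs is None:
        keep_attrs = default if global_option == "default" else global_option
    if isinstance(keep_attrs, bool):
        # Mirrors the bool-to-string mapping shown in the hunk above.
        keep_attrs = "override" if keep_attrs else "drop"
    return keep_attrs


new_default = normalize_keep_attrs()                    # default=True
old_default = normalize_keep_attrs(default=False)       # pre-commit default
explicit_drop = normalize_keep_attrs(keep_attrs=False)
global_drop = normalize_keep_attrs(global_option=False)
passthrough = normalize_keep_attrs("drop_conflicts")    # strings are untouched
```

Under this model, flipping the default changes only the case where neither the caller nor the global option has expressed a choice.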

xarray/computation/computation.py

Lines changed: 1 addition & 1 deletion
@@ -726,7 +726,7 @@ def where(cond, x, y, keep_attrs=None):
     from xarray.core.dataset import Dataset

     if keep_attrs is None:
-        keep_attrs = _get_keep_attrs(default=False)
+        keep_attrs = _get_keep_attrs(default=True)

     # alignment for three arguments is complicated, so don't support it yet
     from xarray.computation.apply_ufunc import apply_ufunc

xarray/computation/weighted.py

Lines changed: 0 additions & 2 deletions
@@ -448,7 +448,6 @@ def _weighted_quantile_1d(

         result = result.transpose("quantile", ...)
         result = result.assign_coords(quantile=q).squeeze()
-
         return result

     def _implementation(self, func, dim, **kwargs):
@@ -551,7 +550,6 @@ def _implementation(self, func, dim, **kwargs) -> DataArray:
 class DatasetWeighted(Weighted["Dataset"]):
     def _implementation(self, func, dim, **kwargs) -> Dataset:
         self._check_dim(dim)
-
         return self.obj.map(func, dim=dim, **kwargs)


xarray/core/common.py

Lines changed: 2 additions & 2 deletions
@@ -1314,7 +1314,7 @@ def isnull(self, keep_attrs: bool | None = None) -> Self:
         from xarray.computation.apply_ufunc import apply_ufunc

         if keep_attrs is None:
-            keep_attrs = _get_keep_attrs(default=False)
+            keep_attrs = _get_keep_attrs(default=True)

         return apply_ufunc(
             duck_array_ops.isnull,
@@ -1357,7 +1357,7 @@ def notnull(self, keep_attrs: bool | None = None) -> Self:
         from xarray.computation.apply_ufunc import apply_ufunc

         if keep_attrs is None:
-            keep_attrs = _get_keep_attrs(default=False)
+            keep_attrs = _get_keep_attrs(default=True)

         return apply_ufunc(
             duck_array_ops.notnull,

xarray/core/dataarray.py

Lines changed: 2 additions & 2 deletions
@@ -3889,8 +3889,8 @@ def reduce(
             supplied, then the reduction is calculated over the flattened array
             (by calling `f(x)` without an axis argument).
         keep_attrs : bool or None, optional
-            If True, the variable's attributes (`attrs`) will be copied from
-            the original object to the new one. If False (default), the new
+            If True (default), the variable's attributes (`attrs`) will be copied from
+            the original object to the new one. If False, the new
             object will be returned without attributes.
         keepdims : bool, default: False
             If True, the dimensions which are reduced are left in the result

xarray/core/dataset.py

Lines changed: 11 additions & 9 deletions
@@ -6773,8 +6773,8 @@ def reduce(
             Dimension(s) over which to apply `func`. By default `func` is
             applied over all dimensions.
         keep_attrs : bool or None, optional
-            If True, the dataset's attributes (`attrs`) will be copied from
-            the original object to the new one. If False (default), the new
+            If True (default), the dataset's attributes (`attrs`) will be copied from
+            the original object to the new one. If False, the new
             object will be returned without attributes.
         keepdims : bool, default: False
             If True, the dimensions which are reduced are left in the result
@@ -6832,7 +6832,7 @@ def reduce(
         dims = parse_dims_as_set(dim, set(self._dims.keys()))

         if keep_attrs is None:
-            keep_attrs = _get_keep_attrs(default=False)
+            keep_attrs = _get_keep_attrs(default=True)

         variables: dict[Hashable, Variable] = {}
         for name, var in self._variables.items():
@@ -6924,14 +6924,16 @@ def map(
             bar (x) float64 16B 1.0 2.0
         """
         if keep_attrs is None:
-            keep_attrs = _get_keep_attrs(default=False)
+            keep_attrs = _get_keep_attrs(default=True)
         variables = {
             k: maybe_wrap_array(v, func(v, *args, **kwargs))
             for k, v in self.data_vars.items()
         }
-        if keep_attrs:
-            for k, v in variables.items():
+        for k, v in variables.items():
+            if keep_attrs:
                 v._copy_attrs_from(self.data_vars[k])
+            else:
+                v.attrs = {}
         attrs = self.attrs if keep_attrs else None
         return type(self)(variables, attrs=attrs)

@@ -7658,7 +7660,7 @@ def _binary_op(self, other, f, reflexive=False, join=None) -> Dataset:
             self, other = align(self, other, join=align_type, copy=False)
         g = f if not reflexive else lambda x, y: f(y, x)
         ds = self._calculate_binary_op(g, other, join=align_type)
-        keep_attrs = _get_keep_attrs(default=False)
+        keep_attrs = _get_keep_attrs(default=True)
         if keep_attrs:
             ds.attrs = self.attrs
         return ds
@@ -8254,7 +8256,7 @@ def quantile(
         coord_names = {k for k in self.coords if k in variables}
         indexes = {k: v for k, v in self._indexes.items() if k in variables}
         if keep_attrs is None:
-            keep_attrs = _get_keep_attrs(default=False)
+            keep_attrs = _get_keep_attrs(default=True)
         attrs = self.attrs if keep_attrs else None
         new = self._replace_with_new_dims(
             variables, coord_names=coord_names, attrs=attrs, indexes=indexes
@@ -8316,7 +8318,7 @@ def rank(

         coord_names = set(self.coords)
         if keep_attrs is None:
-            keep_attrs = _get_keep_attrs(default=False)
+            keep_attrs = _get_keep_attrs(default=True)
         attrs = self.attrs if keep_attrs else None
         return self._replace(variables, coord_names, attrs=attrs)
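
The `map` hunk above does more than flip a default: with `keep_attrs=False` it now clears each result's attributes explicitly. One plausible reading, offered here as an assumption rather than the author's stated rationale, is that the mapped function may itself preserve attributes now, so simply skipping the copy step would no longer drop them. The sketch below models that with a hypothetical `Var` class and `map_vars` helper; neither is part of xarray.

```python
class Var:
    """Hypothetical minimal variable: a value plus an attrs dict."""

    def __init__(self, value, attrs=None):
        self.value = value
        self.attrs = dict(attrs or {})


def map_vars(data_vars, func, keep_attrs=True):
    """Apply func to every variable, then settle attrs in one place."""
    results = {name: func(var) for name, var in data_vars.items()}
    for name, result in results.items():
        if keep_attrs:
            # Copy from the original variable, whatever func returned.
            result.attrs = dict(data_vars[name].attrs)
        else:
            # Clear explicitly: func may have carried attrs through.
            result.attrs = {}
    return results


def double(var):
    # An attribute-preserving operation, like the new defaults.
    return Var(var.value * 2, var.attrs)


data_vars = {"height": Var(3, {"units": "meters"})}
kept = map_vars(data_vars, double)
dropped = map_vars(data_vars, double, keep_attrs=False)
```

Without the `else` branch, `dropped["height"]` would still carry `units` in this model, because `double` passes attributes along.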

xarray/core/datatree.py

Lines changed: 1 addition & 1 deletion
@@ -428,7 +428,7 @@ def map( # type: ignore[override]
         # Copied from xarray.Dataset so as not to call type(self), which causes problems (see https://github.com/xarray-contrib/datatree/issues/188).
         # TODO Refactor xarray upstream to avoid needing to overwrite this.
         if keep_attrs is None:
-            keep_attrs = _get_keep_attrs(default=False)
+            keep_attrs = _get_keep_attrs(default=True)
         variables = {
             k: maybe_wrap_array(v, func(v, *args, **kwargs))
             for k, v in self.data_vars.items()

xarray/core/variable.py

Lines changed: 10 additions & 8 deletions
@@ -1741,8 +1741,8 @@ def reduce( # type: ignore[override]
             the reduction is calculated over the flattened array (by calling
             `func(x)` without an axis argument).
         keep_attrs : bool, optional
-            If True, the variable's attributes (`attrs`) will be copied from
-            the original object to the new one. If False (default), the new
+            If True (default), the variable's attributes (`attrs`) will be copied from
+            the original object to the new one. If False, the new
             object will be returned without attributes.
         keepdims : bool, default: False
             If True, the dimensions which are reduced are left in the result
@@ -1757,7 +1757,7 @@ def reduce( # type: ignore[override]
             removed.
         """
         keep_attrs_ = (
-            _get_keep_attrs(default=False) if keep_attrs is None else keep_attrs
+            _get_keep_attrs(default=True) if keep_attrs is None else keep_attrs
         )

         # Note that the call order for Variable.mean is
@@ -2009,7 +2009,7 @@ def quantile(
             _quantile_func = duck_array_ops.quantile

         if keep_attrs is None:
-            keep_attrs = _get_keep_attrs(default=False)
+            keep_attrs = _get_keep_attrs(default=True)

         scalar = utils.is_scalar(q)
         q = np.atleast_1d(np.asarray(q, dtype=np.float64))
@@ -2350,7 +2350,7 @@ def isnull(self, keep_attrs: bool | None = None):
         from xarray.computation.apply_ufunc import apply_ufunc

         if keep_attrs is None:
-            keep_attrs = _get_keep_attrs(default=False)
+            keep_attrs = _get_keep_attrs(default=True)

         return apply_ufunc(
             duck_array_ops.isnull,
@@ -2384,7 +2384,7 @@ def notnull(self, keep_attrs: bool | None = None):
         from xarray.computation.apply_ufunc import apply_ufunc

         if keep_attrs is None:
-            keep_attrs = _get_keep_attrs(default=False)
+            keep_attrs = _get_keep_attrs(default=True)

         return apply_ufunc(
             duck_array_ops.notnull,
@@ -2435,7 +2435,7 @@ def _binary_op(self, other, f, reflexive=False):
             other_data, self_data, dims = _broadcast_compat_data(other, self)
         else:
             self_data, other_data, dims = _broadcast_compat_data(self, other)
-        keep_attrs = _get_keep_attrs(default=False)
+        keep_attrs = _get_keep_attrs(default=True)
         attrs = self._attrs if keep_attrs else None
         with np.errstate(all="ignore"):
             new_data = (
@@ -2526,7 +2526,9 @@ def _unravel_argminmax(
         }

         if keep_attrs is None:
-            keep_attrs = _get_keep_attrs(default=False)
+            keep_attrs = _get_keep_attrs(
+                default=True
+            )  # Default now keeps attrs for reduction operations
         if keep_attrs:
             for v in result.values():
                 v.attrs = self.attrs
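
In the `_binary_op` hunk above, attributes are taken from `self` whether or not the operation is reflexive, which is why `10 - data` keeps the attributes of `data`. The sketch below reproduces that rule with a hypothetical `Tagged` class and a module-level `KEEP_ATTRS` flag standing in for the resolved option; it is a simplified model, not xarray's `Variable`.

```python
import operator

# Hypothetical stand-in for the resolved keep_attrs option.
KEEP_ATTRS = True


class Tagged:
    """Minimal value-with-attrs wrapper used only for this illustration."""

    def __init__(self, value, attrs=None):
        self.value = value
        self.attrs = dict(attrs or {})

    def _binary_op(self, other, f, reflexive=False):
        other_value = other.value if isinstance(other, Tagged) else other
        # Attributes always come from self, never from other.
        attrs = self.attrs if KEEP_ATTRS else None
        if reflexive:
            value = f(other_value, self.value)
        else:
            value = f(self.value, other_value)
        return Tagged(value, attrs)

    def __add__(self, other):
        return self._binary_op(other, operator.add)

    def __rsub__(self, other):
        # Called for `scalar - tagged`; self is the right-hand operand here.
        return self._binary_op(other, operator.sub, reflexive=True)


a = Tagged(1, {"source": "sensor_a"})
b = Tagged(3, {"source": "sensor_b"})
left_wins = a + b
order_matters = b + a
scalar_first = 10 - a
```

For two wrapped operands, `self` is the left operand, so the left side's attributes win. For a scalar on the left, Python dispatches to the reflected method, so `self` is the wrapped operand.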

xarray/tests/test_computation.py

Lines changed: 7 additions & 9 deletions
@@ -694,10 +694,8 @@ def test_broadcast_compat_data_2d() -> None:

 def test_keep_attrs() -> None:
     def add(a, b, keep_attrs):
-        if keep_attrs:
-            return apply_ufunc(operator.add, a, b, keep_attrs=keep_attrs)
-        else:
-            return apply_ufunc(operator.add, a, b)
+        # Always explicitly pass keep_attrs to test the specific behavior
+        return apply_ufunc(operator.add, a, b, keep_attrs=keep_attrs)

     a = xr.DataArray([0, 1], [("x", [0, 1])])
     a.attrs["attr"] = "da"
@@ -733,7 +731,7 @@ def add(a, b, keep_attrs):
         pytest.param(
             None,
             [{"a": 1}, {"a": 2}, {"a": 3}],
-            {},
+            {"a": 1},  # apply_ufunc now keeps attrs by default
             False,
             id="default",
         ),
@@ -802,7 +800,7 @@ def test_keep_attrs_strategies_variable(strategy, attrs, expected, error) -> Non
         pytest.param(
             None,
             [{"a": 1}, {"a": 2}, {"a": 3}],
-            {},
+            {"a": 1},  # apply_ufunc now keeps attrs by default
             False,
             id="default",
         ),
@@ -872,7 +870,7 @@ def test_keep_attrs_strategies_dataarray(strategy, attrs, expected, error) -> No
         pytest.param(
             None,
             [{"a": 1}, {"a": 2}, {"a": 3}],
-            {},
+            {"a": 1},  # apply_ufunc now keeps attrs by default
             False,
             id="default",
         ),
@@ -967,7 +965,7 @@ def test_keep_attrs_strategies_dataarray_variables(
         pytest.param(
             None,
             [{"a": 1}, {"a": 2}, {"a": 3}],
-            {},
+            {"a": 1},  # apply_ufunc now keeps attrs by default
             False,
             id="default",
         ),
@@ -1037,7 +1035,7 @@ def test_keep_attrs_strategies_dataset(strategy, attrs, expected, error) -> None
         pytest.param(
             None,
             [{"a": 1}, {"a": 2}, {"a": 3}],
-            {},
+            {"a": 1},  # apply_ufunc now keeps attrs by default
             False,
             id="default",
         ),
