Change default netCDF engine to use h5netcdf and add netcdf_engine_order #10755

shoyer · 2025-09-16T02:43:12Z

The default engine when reading/writing netCDF files is now h5netcdf or scipy, which are typically faster than the prior default of netCDF4-python. You can control this default behavior explicitly via the new netcdf_engine_order parameter in set_options(), e.g., xr.set_options(netcdf_engine_order=['netcdf4', 'scipy', 'h5netcdf']) to restore the prior defaults.

I've also updated the documentation page which misled @lesserwhirls about Xarray supporting invalid netCDF files without invalid_netcdf=True.
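To illustrate what an ordered engine preference means in practice, here is a minimal sketch, not Xarray's actual implementation. It walks a preference list and returns the first engine whose backing package can be found. The `pick_engine` helper, the `ENGINE_MODULES` mapping, and the injectable `is_available` predicate are hypothetical names introduced only for this example.

```python
from importlib.util import find_spec
from typing import Callable, Sequence

# Hypothetical mapping from engine name to the top-level module providing it.
ENGINE_MODULES = {
    "netcdf4": "netCDF4",
    "h5netcdf": "h5netcdf",
    "scipy": "scipy",
}


def _module_installed(module_name: str) -> bool:
    """Return True if the module can be located (without importing it)."""
    return find_spec(module_name) is not None


def pick_engine(
    order: Sequence[str],
    is_available: Callable[[str], bool] = _module_installed,
) -> str:
    """Return the first engine in `order` whose module is available."""
    for engine in order:
        module_name = ENGINE_MODULES.get(engine)
        if module_name is None:
            raise ValueError(f"unknown engine: {engine!r}")
        if is_available(module_name):
            return engine
    raise ValueError(f"none of the engines {list(order)} are installed")
```

Calling `pick_engine(['netcdf4', 'scipy', 'h5netcdf'])` in this sketch corresponds to the preference order that the `set_options` call above restores.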

- Closes #10657 (Should Xarray prefer h5netcdf and scipy to netCDF4?)
- Tests added
- User visible changes (including notable bug fixes) are documented in whats-new.rst
- New functions/methods are listed in api.rst


shoyer · 2025-09-16T21:31:02Z

Looking at the test failures, it looks like we previously supported writing NCZarr with ds.to_netcdf(f"file://{filename}#mode=nczarr"). Now we require also passing engine='netcdf4' explicitly.

Should we try to auto-detect URLs like this and use netcdf4 as the backend? Or is it better to encourage users to make an explicit choice?

dcherian · 2025-09-16T22:42:24Z

in general I'm pro "explicit choice", but this would be a breaking change.

@malmans2 how common is nczarr use? I haven't really seen it.

shoyer · 2025-09-17T00:13:49Z

I went ahead and added automatic support for writing nczarr. This wasn't hard to check.
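As a rough illustration of what such a check might look like (a sketch under assumptions, not the code in this PR), the URL fragment can be parsed for a `mode` entry that includes `nczarr`. The helper names `looks_like_nczarr` and `choose_write_engine` are hypothetical, and treating the fragment as `&`-separated `key=value` pairs with comma-separated mode values is an assumption about the URL format.

```python
from urllib.parse import urlsplit


def looks_like_nczarr(path: str) -> bool:
    """Heuristically detect an NCZarr-style URL from its `#mode=` fragment."""
    fragment = urlsplit(path).fragment
    for item in fragment.split("&"):
        key, _, value = item.partition("=")
        if key.strip().lower() == "mode":
            # Assumption: mode values may be comma-separated, e.g. "nczarr,file".
            modes = {m.strip().lower() for m in value.split(",")}
            if "nczarr" in modes:
                return True
    return False


def choose_write_engine(path: str, default: str) -> str:
    """Use netcdf4 for NCZarr-looking URLs, otherwise keep the default engine."""
    return "netcdf4" if looks_like_nczarr(path) else default
```

For example, `choose_write_engine("file:///tmp/out.zarr#mode=nczarr", "h5netcdf")` picks `netcdf4` in this sketch, while a plain path keeps the default.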


malmans2 · 2025-09-17T06:45:53Z

> in general I'm pro "explicit choice", but this would be a breaking change.
>
> @malmans2 how common is nczarr use? I haven't really seen it.

I've never seen it actually used in Python applications either. From a quick search on GitHub, it looks like the few packages that write to nczarr directly use netcdf4-python rather than xarray.

shoyer · 2025-09-17T07:05:38Z

I added supports_groups to BackendEntrypoint. Otherwise, we have no way to check if a backend supports open_datatree() short of calling the open_datatree() method.

This turned up because scipy is now used in preference to netcdf4 when opening netcdf v3 files, but scipy doesn't support opening groups.

In principle we could add support for reading groups to the SciPy backend (netcdf3 files arguably contain a single group, at the root node), but in any case this will also come up for custom backends.
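To make the motivation concrete, here is a minimal sketch of how a capability flag lets engine selection skip backends that cannot open groups, without calling `open_datatree()` on each one. The `BackendInfo` class and `first_backend_with_groups` function are hypothetical stand-ins, not Xarray's `BackendEntrypoint` or its selection logic.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class BackendInfo:
    """Hypothetical stand-in for a backend entrypoint with a capability flag."""

    name: str
    supports_groups: bool = False


def first_backend_with_groups(
    order: Sequence[str],
    backends: Mapping[str, BackendInfo],
) -> str:
    """Return the first engine in `order` that declares group support."""
    for name in order:
        backend = backends.get(name)
        if backend is not None and backend.supports_groups:
            return name
    raise ValueError(f"no backend in {list(order)} supports groups")


# Illustrative registry: scipy lacks group support, as noted above.
BACKENDS = {
    "scipy": BackendInfo("scipy", supports_groups=False),
    "netcdf4": BackendInfo("netcdf4", supports_groups=True),
}
```

With `order=["scipy", "netcdf4"]`, the sketch skips scipy and returns `"netcdf4"`. A custom backend would take part in the same check by declaring its own flag.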

shoyer · 2025-09-23T21:25:25Z

I would love to get this in before the next release, to avoid needing repeated breaking changes.

kmuehlbauer

LGTM, Stephan. Nice to be able to parametrize this.

github-actions bot added the topic-backends, topic-DataTree, and io labels Sep 16, 2025

shoyer changed the title ~~Add option for netcdf_engine_order~~ Change default netCDF engine to use h5netcdf and add netcdf_engine_order Sep 16, 2025

shoyer mentioned this pull request Sep 16, 2025

Should Xarray prefer h5netcdf and scipy to netCDF4? #10657

Open

shoyer added 2 commits September 15, 2025 20:02

Merge branch 'main' into netcdf_engine_order

ea8ef94

Fix test failures

c6eb82d

Merge branch 'main' into netcdf_engine_order

6d425db

Automatically support NCZarr

18fe84f

shoyer added 6 commits September 16, 2025 17:14

Revert "Automatically support NCZarr"

4131449

This reverts commit 18fe84f.

Reapply "Automatically support NCZarr"

913cded

This reverts commit 4131449.

Fix mypy

a397f1f

spelling

6a86d3b

Improve typing for _normalize_path()

e48ab59

hard code engine="netcdf4" for test_encoding_enum__no_fill_value

fdc7efb

shoyer added 2 commits September 16, 2025 23:57

Fix reading netcdf3 files with open_datatree

db8ec22

Set engine in test_encoding_enum__multiple_variable_with_enum

68bc5f8

shoyer added 2 commits September 17, 2025 00:12

set yet another test to only use netcdf4

b5ac76d

Merge branch 'main' into netcdf_engine_order

12e1fcd

kmuehlbauer approved these changes Sep 24, 2025
