Add save_bboxes.to_netcdf() and load_bboxes.from_netcdf() for complete netCDF4 I/O support by manashk29 · Pull Request #153 · neuroinformatics-unit/ethology

manashk29 · 2026-04-09T19:59:58Z

What is this PR

Bug fix
Addition of a new feature
Other

Adds two new public functions to complete the netCDF4 I/O round-trip
for ethology bounding box annotation datasets.

## save_bboxes.to_netcdf()

What it does:
Wraps xarray.Dataset.to_netcdf() with automatic handling of all
attribute types that netCDF4 cannot store natively.

The problem it solves:
The ethology Dataset has three dict attributes
(map_category_to_str, map_image_id_to_filename,
map_image_id_to_original_coco_id), a list attribute
(annotation_files), and a None attribute (images_directories).
netCDF4 cannot store any of these types — calling ds.to_netcdf()
directly would either silently corrupt them or raise a TypeError.

How it works:

1. Makes a deep copy of the Dataset so the caller's attrs are never mutated
2. Iterates over all attrs and converts:
   - dict → JSON string via json.dumps()
   - list → JSON string (each element converted to str first)
   - Path → plain string
   - None → dropped entirely
   - str, int, float → kept as-is (already netCDF4-safe)
3. Calls xarray.Dataset.to_netcdf() on the cleaned copy
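
The conversion rules above can be sketched on a plain attrs dict. This is a minimal illustration, not the PR's actual code: the helper name `sanitise_attrs` is hypothetical, and the real function additionally copies the Dataset and calls xarray.Dataset.to_netcdf() on the copy.

```python
import json
from pathlib import Path


def sanitise_attrs(attrs: dict) -> dict:
    """Return a new dict holding only netCDF4-safe attribute values.

    Hypothetical helper for illustration; the input dict is not modified.
    """
    clean = {}
    for key, value in attrs.items():
        if value is None:
            # None has no netCDF4 representation, so the attribute is dropped.
            continue
        if isinstance(value, dict):
            # Note: json.dumps() turns non-string keys into strings.
            clean[key] = json.dumps(value)
        elif isinstance(value, (list, tuple)):
            # Convert each element to str first so Path objects serialise.
            clean[key] = json.dumps([str(item) for item in value])
        elif isinstance(value, Path):
            clean[key] = str(value)
        else:
            # str, int, float are kept unchanged.
            clean[key] = value
    return clean


example_attrs = {
    "map_category_to_str": {1: "Mallard", 3: "Goose"},
    "annotation_files": [Path("annotations.json")],
    "images_directories": None,
}
cleaned = sanitise_attrs(example_attrs)
```

Because the sketch builds a new dict rather than editing the input, the caller's attrs stay untouched, which mirrors the purpose of the deep copy in step 1.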

## load_bboxes.from_netcdf()

What it does:
Wraps xr.open_dataset() with automatic deserialisation of all
attributes back to their original Python types, then validates
the loaded Dataset.

The problem it solves:
After loading a netCDF4 file with raw xr.open_dataset(),
map_category_to_str comes back as the string
'{"1": "Mallard", "3": "Goose"}' instead of the dict
{1: "Mallard", 3: "Goose"}. The keys are also strings ("1", "3")
not integers (1, 3) because json.dumps() always serialises dict
keys as strings. Any downstream code calling
ds.attrs["map_category_to_str"][1] would fail with a KeyError.
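
The key-type change can be reproduced with the standard library alone. A small sketch using the example values from the paragraph above:

```python
import json

original = {1: "Mallard", 3: "Goose"}

# What ends up stored in the netCDF attribute: a JSON string.
stored = json.dumps(original)

# What a plain json.loads() gives back: same values, but string keys.
reloaded = json.loads(stored)

# Looking up the original integer key now fails.
try:
    reloaded[1]
except KeyError:
    lookup_failed = True
else:
    lookup_failed = False
```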

How it works:

1. Raises FileNotFoundError with a clear message if the file does not exist
2. Calls xr.open_dataset(file_path).load(). The .load() call reads all
   data eagerly into memory so that the file does not need to stay open
   afterwards, which is intended to prevent PermissionError on Windows
3. Iterates over the three dict attrs and for each:
   - Calls json.loads() to parse the JSON string back to a dict
   - Converts all numeric string keys ("1", "3") back to integers
     (1, 3) using int(k) if k.lstrip("-").isdigit() else k
4. Restores annotation_files from JSON string back to a list of
   Path objects
5. Uses the @_check_output(ValidBboxAnnotationsDataset) decorator,
   the same one as from_files(), so the returned Dataset is validated
   before it is returned
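
The attribute restoration steps can be sketched on a plain attrs dict. This is an illustration under assumptions, not the PR's code: `restore_key` and `restore_attrs` are hypothetical names, and file opening and validation are left out.

```python
import json
from pathlib import Path

# The three dict attributes named in the PR description.
DICT_ATTRS = (
    "map_category_to_str",
    "map_image_id_to_filename",
    "map_image_id_to_original_coco_id",
)


def restore_key(key: str):
    """Turn numeric string keys back into ints; leave other keys alone."""
    return int(key) if key.lstrip("-").isdigit() else key


def restore_attrs(attrs: dict) -> dict:
    """Return a new dict with JSON-encoded attrs decoded (hypothetical helper)."""
    restored = dict(attrs)
    for name in DICT_ATTRS:
        if isinstance(restored.get(name), str):
            parsed = json.loads(restored[name])
            restored[name] = {restore_key(k): v for k, v in parsed.items()}
    if isinstance(restored.get("annotation_files"), str):
        restored["annotation_files"] = [
            Path(p) for p in json.loads(restored["annotation_files"])
        ]
    return restored


stored_attrs = {
    "map_category_to_str": '{"1": "Mallard", "3": "Goose"}',
    "annotation_files": '["a.json", "b.json"]',
}
restored_attrs = restore_attrs(stored_attrs)
```

One edge case of the key check as written: a string such as "--1" passes lstrip("-").isdigit() but makes int() raise ValueError, so the sketch assumes keys are either well-formed integers or non-numeric strings.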

Tests

Added tests/test_unit/test_io_annotations/test_io_netcdf.py with 3 test classes

Checklist:

The code has been tested locally
Tests have been added to cover all new functionality
The documentation has been updated to reflect any changes
The code has been formatted with pre-commit

manashk29 and others added 5 commits April 10, 2026 00:20

Add to_netcdf() and from_netcdf() for complete netCDF4 I/O support

617b70d

Add to_netcdf() and from_netcdf() for complete netCDF4 I/O support

f3da97d

NetCDF IO implementation and fixes

f2ddcd8

Remove accidental less output file

7eb8c0f

[pre-commit.ci] auto fixes from pre-commit.com hooks

95d2fdd

for more information, see https://pre-commit.ci

manashk29 marked this pull request as ready for review April 9, 2026 20:00

manashk29 wants to merge 5 commits into neuroinformatics-unit:main from manashk29:feat/netcdf-io-wrappers
