Coordinate data is copied when initializing DataArray #9910

Open
5 tasks done
ckutlu opened this issue Dec 19, 2024 · 1 comment
Labels
bug needs triage Issue that has not been reviewed by xarray team member

ckutlu commented Dec 19, 2024

What happened?

Hi there! I am designing an internal library in which I want to provide utilities with guarantees on how many copies are made.

I noticed that if I do:

darr_coord = xr.DataArray(some_array, dims="x")
darr = xr.DataArray(another_array, coords={"x": darr_coord}, dims=("x"))

then darr["x"] does not share memory with some_array, which I checked with np.shares_memory(darr["x"].data, some_array). This is surprising because darr.data does share memory with another_array, and I would expect the coordinate to behave the same way. Moreover, if I instead assign the coordinate after construction:

darr.coords["x"] = ("x", some_array)

then the coordinate and some_array do share the underlying data.

This is not a major blocker for us, since the workaround is fairly straightforward, but I thought this might be unexpected behavior.
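For reference, the workaround can be sketched roughly as below: build the DataArray without coordinates, then attach the coordinate as a (dims, array) tuple. The helper name dataarray_with_shared_coord is hypothetical and not part of xarray. Whether the coordinate actually ends up sharing memory is only what was observed with the versions listed under Environment, so the sketch prints the result instead of assuming it.

```python
import numpy as np
import xarray as xr


def dataarray_with_shared_coord(data, coord_values, dim="x"):
    """Hypothetical helper (not an xarray API).

    Creates the DataArray first and assigns the coordinate afterwards,
    which is the path observed in this report to avoid copying the
    coordinate array.
    """
    darr = xr.DataArray(data, dims=(dim,))
    # Assign the coordinate as a (dims, array) tuple after construction.
    darr.coords[dim] = (dim, coord_values)
    return darr


arr = np.random.random(2)
arr_coord = np.arange(2)
darr = dataarray_with_shared_coord(arr, arr_coord)

# The outcome may depend on the installed xarray and pandas versions,
# so it is reported here and not asserted.
coord_is_shared = bool(np.shares_memory(darr["x"].data, arr_coord))
print("coordinate shares memory with input:", coord_is_shared)
```

A library that needs a hard guarantee could turn the final check into an assertion at construction time, so that a change in behavior between versions is caught early.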

What did you expect to happen?

I expect no copy to be made of either the data or the coordinates when I create a DataArray.

Minimal Complete Verifiable Example

import xarray as xr
import numpy as np


def test_data_array_coord_array_not_copied_when_assigning():
    arr = np.random.random(2)
    arr_coord = np.arange(2)

    darr = xr.DataArray(arr, dims=("x"))
    assert darr.data is arr  # Variable is the same as input array.

    darr.coords["x"] = ("x", arr_coord)
    assert np.shares_memory(darr["x"].data, arr_coord)


def test_data_array_coord_array_not_copied_when_providing_at_initialization():
    arr = np.random.random(2)
    arr_coord = np.arange(2)

    darr_coord = xr.DataArray(arr_coord, dims="x")
    assert darr_coord.data is arr_coord  # Variable is the same as input array.

    darr = xr.DataArray(arr, coords={"x": darr_coord}, dims=("x"))
    assert darr.data is arr  # Variable is the same as input array.
    assert np.shares_memory(darr["x"].data, arr_coord)

test_data_array_coord_array_not_copied_when_assigning()  # <- Passes
test_data_array_coord_array_not_copied_when_providing_at_initialization()  # <- Fails on the final assert
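To help narrow down which construction paths copy the coordinate, a small diagnostic along these lines might be useful. It is only a sketch: coord_sharing_report and the path labels are made-up names, and two of the paths (a bare ndarray and a (dims, ndarray) tuple passed in coords at initialization) are not covered by the tests above. It deliberately asserts nothing about the outcome, since the result may differ between xarray and pandas versions.

```python
import numpy as np
import xarray as xr


def coord_sharing_report(n=2):
    """Hypothetical diagnostic: map each construction path to whether
    the resulting 'x' coordinate shares memory with the input array."""
    results = {}

    def check(label, build):
        # Fresh arrays per path so the checks cannot influence each other.
        arr = np.random.random(n)
        arr_coord = np.arange(n)
        darr = build(arr, arr_coord)
        results[label] = bool(np.shares_memory(darr["x"].data, arr_coord))

    def assign_after_init(a, c):
        d = xr.DataArray(a, dims=("x",))
        d.coords["x"] = ("x", c)
        return d

    check("init: ndarray",
          lambda a, c: xr.DataArray(a, coords={"x": c}, dims=("x",)))
    check("init: (dims, ndarray)",
          lambda a, c: xr.DataArray(a, coords={"x": ("x", c)}, dims=("x",)))
    check("init: DataArray",
          lambda a, c: xr.DataArray(
              a, coords={"x": xr.DataArray(c, dims="x")}, dims=("x",)))
    check("assign after init", assign_after_init)
    return results


report = coord_sharing_report()
for path, shared in report.items():
    print(f"{path}: shares memory = {shared}")
```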

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.9.19 (main, Aug 14 2024, 05:11:09)
[Clang 18.1.8 ]
python-bits: 64
OS: Linux
OS-release: 6.8.0-51-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2024.7.0
pandas: 2.2.3
numpy: 2.0.2
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: None
pip: None
conda: None
pytest: 8.3.4
mypy: 1.13.0
IPython: None
sphinx: None
None

ckutlu added the bug and needs triage labels on Dec 19, 2024

welcome bot commented Dec 19, 2024

Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the Contributing Guide for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!
