Refactor Dataset internals to store data variables and coordinate variables as separate dicts #9203
Comments
In fact could the (idea from #9204 (comment)) |
I think |
I have been wondering about that for quite some time already, haha. Does this mean that coordinates can be handled in the parent class |
You mean have all of |
The refactor of Dataset internals may (should?) be an opportunity to also refactor coordinates and indexes proxies. The current situation is quite messy and fragile:
A cleaner model might be (from #7368 (comment)):
This would represent a much bigger refactor, though. One difficulty with this model is also that updating coordinates cannot be done totally independently of data variables (alignment). |
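To make the alignment difficulty concrete, here is a minimal sketch of the "separate dicts" idea from the issue title. It is a toy and not xarray's implementation: `ToyDataset`, `set_coord`, `set_data_var`, and the `(dim, values)` variable representation are all hypothetical names invented for illustration. It shows that even with two independent dicts, adding a coordinate still has to be checked against the dimension sizes implied by the data variables.

```python
from collections import ChainMap
from types import MappingProxyType


class ToyDataset:
    """Toy stand-in for a Dataset; NOT xarray's implementation.

    A "variable" here is a hypothetical (dim_name, values) pair for a 1-D array.
    """

    def __init__(self, data_vars=None, coords=None):
        self._data_vars = {}  # name -> (dim, values)
        self._coords = {}     # name -> (dim, values)
        for name, var in (coords or {}).items():
            self.set_coord(name, var)
        for name, var in (data_vars or {}).items():
            self.set_data_var(name, var)

    @property
    def variables(self):
        # Combined read-only view over both dicts; no third dict to keep in sync.
        return MappingProxyType(ChainMap(self._data_vars, self._coords))

    def _sizes(self, skip=None):
        # Dimension sizes implied by every stored variable except `skip`.
        sizes = {}
        for name, (dim, values) in self.variables.items():
            if name != skip:
                sizes[dim] = len(values)
        return sizes

    @property
    def sizes(self):
        return self._sizes()

    def _check(self, name, var, other):
        if name in other:
            raise ValueError(f"{name!r} already exists in the other dict")
        dim, values = var
        expected = self._sizes(skip=name).get(dim, len(values))
        if expected != len(values):
            raise ValueError(
                f"{name!r} has length {len(values)} along {dim!r}, expected {expected}"
            )

    def set_coord(self, name, var):
        # A coordinate update still consults the data variables (alignment).
        self._check(name, var, other=self._data_vars)
        self._coords[name] = var

    def set_data_var(self, name, var):
        self._check(name, var, other=self._coords)
        self._data_vars[name] = var


ds = ToyDataset(
    data_vars={"temperature": ("x", [1.0, 2.0, 3.0])},
    coords={"x": ("x", [10, 20, 30])},
)
print(sorted(ds.variables))
```

In this sketch the size check is the only coupling between the two dicts, which is roughly where a real refactor would need some shared alignment logic.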
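This sounds great, but bear in mind that that plan was conceived pre-datatree-in-xarray. The DataTree class also has these attributes, but to implement the "coordinate inheritance" feature they often actually use ChainMap to dynamically construct them from coordinates stored on parent tree nodes. Any refactor design should play nicely with this too. See this part of the code: xarray/xarray/core/datatree.py Line 564 in 4c2540d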
|
I am very much aligned with this revised data model!
Coordinates on DataTree will still need to use a chainmap, but hopefully
this is a simple variation of the base Coordinates class. Maybe even the
same object, just initialized with the chainmap mappings instead of dicts.
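As a rough illustration of the "same object, just initialized with the chainmap mappings" idea, the sketch below uses only the standard library. `ToyCoordinates` and `ToyNode` are hypothetical names and do not correspond to xarray's actual `Coordinates` or `DataTree` classes; the point is only that a container written against the generic mapping interface does not care whether it wraps a plain dict or a `ChainMap` over parent nodes.

```python
from collections import ChainMap
from collections.abc import Mapping


class ToyCoordinates(Mapping):
    """Hypothetical container that wraps any mapping of name -> values."""

    def __init__(self, mapping):
        self._mapping = mapping

    def __getitem__(self, name):
        return self._mapping[name]

    def __iter__(self):
        return iter(self._mapping)

    def __len__(self):
        return len(self._mapping)


class ToyNode:
    """Hypothetical tree node; each node stores only its own coordinates."""

    def __init__(self, coords, parent=None):
        self._local_coords = dict(coords)
        self.parent = parent

    @property
    def coords(self):
        # Collect this node's dict first, then each ancestor's, so that
        # local names shadow inherited ones in the ChainMap lookup order.
        maps = []
        node = self
        while node is not None:
            maps.append(node._local_coords)
            node = node.parent
        return ToyCoordinates(ChainMap(*maps))


root = ToyNode({"time": [0, 1, 2]})
child = ToyNode({"x": [10, 20]}, parent=root)

standalone = ToyCoordinates({"x": [10, 20]})  # backed by a plain dict
inherited = child.coords                      # backed by a ChainMap
print(sorted(standalone), sorted(inherited))
```

Because the `ChainMap` holds references to the parents' dicts rather than copies, a coordinate added to `root` later is visible through `child.coords` the next time it is looked up.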
On Tue, Mar 11, 2025 at 2:50 PM Tom Nicholas wrote:
A cleaner model might be (from #7368 (comment)):
- Have one true stand-alone Coordinates container that stores data
(coordinate variables, indexes and dimension sizes) and that is
encapsulated in Dataset and DataArray (instead of having _coord_names,
_coords, _indexes, etc. internal dictionaries)
This sounds great, but bear in mind that that plan was conceived
pre-datatree-in-xarray. The DataTree class also has these attributes, but
to implement the "coordinate inheritance" feature they often actually use
ChainMap to dynamically construct them from coordinates stored on parent
tree nodes. Any refactor design should play nicely with this too. See this
part of the code:
https://github.com/pydata/xarray/blob/4c2540de54af8d9da4456715d7dcac6b16f330d8/xarray/core/datatree.py#L564
|
There's a lot of other discussion in #9063, but I wanted to pull out this suggestion for independent discussion:
Originally posted by @shoyer in #9063 (comment)