This repository was archived by the owner on Oct 24, 2024. It is now read-only.

Commit 2cd32f4

TomNicholas, pre-commit-ci[bot], and keewis authored
Hierarchical data docs page (#179)
* why hierarchical data
* add hierarchical data page to index
* Simpsons family tree
* evolutionary tree
* WIP rearrangement of creating trees
* fixed examples in data structures page
* dict-like navigation
* filesystem-like paths explained
* split PR into parts
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update docs/source/data-structures.rst Co-authored-by: Justus Magin <[email protected]>
* black
* whatsnew
* get assign example working
* fix some links to methods
* relative_to example

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Magin <[email protected]>
1 parent d76f050 commit 2cd32f4

File tree

4 files changed, +359 -45 lines changed


docs/source/data-structures.rst

+24-45
Original file line number | Diff line number | Diff line change
@@ -71,7 +71,7 @@ Again these are not normally used unless explicitly accessed by the user.
7171
Creating a DataTree
7272
~~~~~~~~~~~~~~~~~~~
7373

74-
There are three ways to create a ``DataTree`` from scratch. The first is to create each node individually,
74+
One way to create a ``DataTree`` from scratch is to create each node individually,
7575
specifying the nodes' relationship to one another as you create each one.
7676

7777
The ``DataTree`` constructor takes:
@@ -81,73 +81,55 @@ The ``DataTree`` constructor takes:
8181
- ``children``: The various child nodes (if there are any), given as a mapping from string keys to ``DataTree`` objects.
8282
- ``name``: A string to use as the name of this node.
8383

84-
Let's make a datatree node without anything in it:
84+
Let's make a single datatree node with some example data in it:
8585

8686
.. ipython:: python
8787
8888
from datatree import DataTree
8989
90-
# create root node
91-
node1 = DataTree(name="Oak")
90+
ds1 = xr.Dataset({"foo": "orange"})
91+
dt = DataTree(name="root", data=ds1) # create root node
9292
93-
node1
93+
dt
9494
9595
At this point our node is also the root node, as every tree has a root node.
9696

9797
We can add a second node to this tree either by referring to the first node in the constructor of the second:
9898

9999
.. ipython:: python
100100
101+
ds2 = xr.Dataset({"bar": 0}, coords={"y": ("y", [0, 1, 2])})
101102
# add a child by referring to the parent node
102-
node2 = DataTree(name="Bonsai", parent=node1)
103+
node2 = DataTree(name="a", parent=dt, data=ds2)
103104
104105
or by dynamically updating the attributes of one node to refer to another:
105106

106107
.. ipython:: python
107108
108-
# add a grandparent by updating the .parent property of an existing node
109-
node0 = DataTree(name="General Sherman")
110-
node1.parent = node0
109+
# add a second child by first creating a new node ...
110+
ds3 = xr.Dataset({"zed": np.nan})
111+
node3 = DataTree(name="b", data=ds3)
112+
# ... then updating its .parent property
113+
node3.parent = dt
111114
112-
Our tree now has three nodes within it, and one of the two new nodes has become the new root:
115+
Our tree now has three nodes within it:
113116

114117
.. ipython:: python
115118
116-
node0
119+
dt
117120
118-
Is is at tree construction time that consistency checks are enforced. For instance, if we try to create a `cycle` the constructor will raise an error:
121+
Consistency checks are enforced at tree construction time. For instance, if we try to create a `cycle`, here by making a child node the parent of its own parent, an error will be raised:
119122

120123
.. ipython:: python
121124
:okexcept:
122125
123-
node0.parent = node2
124-
125-
The second way is to build the tree from a dictionary of filesystem-like paths and corresponding ``xarray.Dataset`` objects.
126-
127-
This relies on a syntax inspired by unix-like filesystems, where the "path" to a node is specified by the keys of each intermediate node in sequence,
128-
separated by forward slashes. The root node is referred to by ``"/"``, so the path from our current root node to its grand-child would be ``"/Oak/Bonsai"``.
129-
A path specified from the root (as opposed to being specified relative to an arbitrary node in the tree) is sometimes also referred to as a
130-
`"fully qualified name" <https://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf-zarr-data-model-specification#nczarr_fqn>`_.
131-
132-
If we have a dictionary where each key is a valid path, and each value is either valid data or ``None``,
133-
we can construct a complex tree quickly using the alternative constructor ``:py:func::DataTree.from_dict``:
126+
dt.parent = node3
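The kind of check involved can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the datatree implementation: the ``Node`` class and its attributes are hypothetical, and the error type is chosen only for the example. The idea is that before accepting a new parent, the setter walks upwards from that parent and refuses the assignment if it meets the node itself.

```python
class Node:
    """Hypothetical minimal tree node (an assumption, not the datatree API)."""

    def __init__(self, name, parent=None):
        self.name = name
        self._parent = None
        self.children = {}
        if parent is not None:
            self.parent = parent

    @property
    def parent(self):
        return self._parent

    @parent.setter
    def parent(self, new_parent):
        # Walk upwards from the prospective parent; meeting `self` means a cycle.
        ancestor = new_parent
        while ancestor is not None:
            if ancestor is self:
                raise ValueError(
                    f"making {new_parent.name!r} the parent of "
                    f"{self.name!r} would create a cycle"
                )
            ancestor = ancestor._parent
        # Detach from the old parent, then attach to the new one.
        if self._parent is not None:
            del self._parent.children[self.name]
        self._parent = new_parent
        if new_parent is not None:
            new_parent.children[self.name] = self


root = Node("root")
a = Node("a", parent=root)  # attach via the constructor
b = Node("b")
b.parent = root  # attach by setting .parent afterwards

try:
    root.parent = b  # b is already a child of root
    cycle_rejected = False
except ValueError:
    cycle_rejected = True
```

Checking at assignment time means an invalid tree can never be observed, at the cost of a walk up the ancestor chain on every re-parenting.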
134127
135-
.. ipython:: python
136-
137-
d = {
138-
"/": xr.Dataset({"foo": "orange"}),
139-
"/a": xr.Dataset({"bar": 0}, coords={"y": ("y", [0, 1, 2])}),
140-
"/a/b": xr.Dataset({"zed": np.NaN}),
141-
"a/c/d": None,
142-
}
143-
dt = DataTree.from_dict(d)
144-
dt
128+
Alternatively, you can create a ``DataTree`` object from:
145129

146-
Notice that this method will also create any intermediate empty node necessary to reach the end of the specified path
147-
(i.e. the node labelled `"c"` in this case.)
148-
149-
Finally the third way is from a file. if you have a file containing data on disk (such as a netCDF file or a Zarr Store), you can also create a datatree by opening the
150-
file using ``:py:func::~datatree.open_datatree``. See the page on :ref:`reading and writing files <io>` for more details.
130+
- An ``xarray.Dataset`` using ``Dataset.to_node()`` (not yet implemented),
131+
- A dictionary mapping directory-like paths to either ``DataTree`` nodes or data, using ``DataTree.from_dict()``,
132+
- A netCDF or Zarr file on disk with ``open_datatree()``. See :ref:`reading and writing files <io>`.
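To illustrate the idea behind building a tree from directory-like paths, here is a rough sketch in plain Python. It is not how ``DataTree.from_dict()`` is implemented; the ``build_tree`` helper and the nested-dict node layout are assumptions made for this example. It shows how each path can be split into parts and how any missing intermediate node is created empty along the way.

```python
from pathlib import PurePosixPath


def build_tree(path_to_data):
    """Expand {path: data} into nested nodes (hypothetical helper)."""
    root = {"data": None, "children": {}}
    for path, data in path_to_data.items():
        node = root
        for part in PurePosixPath(path).parts:
            if part == "/":
                # A leading "/" only marks the root; it is not a child name.
                continue
            # Create the intermediate node if it does not exist yet.
            node = node["children"].setdefault(
                part, {"data": None, "children": {}}
            )
        node["data"] = data
    return root


tree = build_tree(
    {
        "/": "root data",
        "/a": "a data",
        "/a/b": "b data",
        "a/c/d": None,  # "c" is never listed on its own
    }
)
```

Here the node ``"c"`` is created implicitly, with no data, because it lies on the path to ``"d"``.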
151133

152134

153135
DataTree Contents
@@ -187,20 +169,17 @@ Like with ``Dataset``, you can access the data and coordinate variables of a nod
187169
Dictionary-like methods
188170
~~~~~~~~~~~~~~~~~~~~~~~
189171

190-
We can update the contents of the tree in-place using a dictionary-like syntax.
191-
192172
We can update a datatree in-place using Python's standard dictionary syntax, similar to how we can for Dataset objects.
193173
For example, to create this example datatree from scratch, we could have written:
194174

195175
# TODO update this example using ``.coords`` and ``.data_vars`` as setters,
196176

197177
.. ipython:: python
198178
199-
dt = DataTree()
179+
dt = DataTree(name="root")
200180
dt["foo"] = "orange"
201181
dt["a"] = DataTree(data=xr.Dataset({"bar": 0}, coords={"y": ("y", [0, 1, 2])}))
202182
dt["a/b/zed"] = np.nan
203-
dt["a/c/d"] = DataTree()
204183
dt
205184
206185
To change the variables in a node of a ``DataTree``, you can use all the standard dictionary
@@ -209,6 +188,6 @@ methods, including ``values``, ``items``, ``__delitem__``, ``get`` and
209188
Note that assigning a ``DataArray`` object to a ``DataTree`` variable using ``__setitem__`` or ``update`` will
210189
:ref:`automatically align<update>` the array(s) to the original node's indexes.
211190

212-
If you copy a ``DataTree`` using the ``:py:func::copy`` function or the :py:meth:`~xarray.DataTree.copy` it will copy the entire tree,
213-
including all parents and children.
214-
Like for ``Dataset``, this copy is shallow by default, but you can copy all the data by calling ``dt.copy(deep=True)``.
191+
If you copy a ``DataTree`` using the :py:func:`copy` function or the :py:meth:`~xarray.DataTree.copy` method, it will copy the subtree,
192+
meaning that node and the children below it, but no parents above it.
193+
As with ``Dataset``, this copy is shallow by default, but you can copy all the underlying data arrays by calling ``dt.copy(deep=True)``.
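The subtree-copy semantics can be sketched with a small stand-in class. This is an illustration under stated assumptions rather than the library's code: the ``Node`` class, ``add_child``, and the use of plain lists as "data" are all hypothetical. The copy starts at the node it is called on, recurses into the children, and never follows the link to the parent.

```python
import copy


class Node:
    """Hypothetical stand-in for a tree node (not the datatree API)."""

    def __init__(self, name, data=None):
        self.name = name
        self.data = data
        self.parent = None
        self.children = {}

    def add_child(self, child):
        child.parent = self
        self.children[child.name] = child
        return child

    def copy(self, deep=False):
        """Copy this node and its descendants; the copy has no parent."""
        data = copy.deepcopy(self.data) if deep else self.data
        new = Node(self.name, data)
        for child in self.children.values():
            new.add_child(child.copy(deep=deep))
        return new


root = Node("root", data=[1])
a = root.add_child(Node("a", data=[2]))
b = a.add_child(Node("b", data=[3]))

shallow = a.copy()  # new nodes, same underlying data objects
deep = a.copy(deep=True)  # new nodes and new data objects
```

In this sketch a shallow copy creates new node objects that still share the original data, while ``deep=True`` duplicates the data as well.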
