This repository was archived by the owner on Oct 24, 2024. It is now read-only.
* why hierarchical data
* add hierarchical data page to index
* Simpsons family tree
* evolutionary tree
* WIP rearrangement of creating trees
* fixed examples in data structures page
* dict-like navigation
* filesystem-like paths explained
* split PR into parts
* plan
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix ipython bug
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* filter simpsons family tree by age
* use new filter method
* test about filter
* simple example of mapping over a subtree
* ideas for docs on iterating over trees
* add section on iterating over subtree
* text to accompany Simpsons family aging example
* add voltage dataset
* RMS as example of mapping custom computation
* isomorphism
* P=IV example of binary multiplication
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* remove unfinished sections
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* whatsnew
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
docs/source/hierarchical-data.rst (+260 -1)
@@ -175,7 +175,7 @@ Let's use a different example of a tree to discuss more complex relationships be
175
175
]
176
176
177
177
We have used the :py:meth:`~DataTree.from_dict` constructor method as an alternate way to quickly create a whole tree,
178
-
and :ref:`filesystem-like syntax <filesystem paths>`_ (to be explained shortly) to select two nodes of interest.
178
+
and :ref:`filesystempaths` (to be explained shortly) to select two nodes of interest.
179
179
180
180
.. ipython:: python
181
181
@@ -339,3 +339,262 @@ we can construct a complex tree quickly using the alternative constructor :py:me
339
339
Notice that using the path-like syntax will also create any intermediate empty nodes necessary to reach the end of the specified path
340
340
(i.e. the node labelled `"c"` in this case.)
341
341
This is to help avoid lots of redundant entries when creating deeply-nested trees using :py:meth:`DataTree.from_dict`.
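The creation of intermediate nodes can be pictured with a small plain-Python sketch. It does not use ``DataTree`` at all: the ``build_tree`` helper, the nested-dict node layout and the example paths are all hypothetical, chosen only to show how walking a path segment by segment naturally creates any missing empty nodes.

```python
# Hypothetical helper (not part of datatree): build a nested-dict tree
# from path-like keys, creating empty intermediate nodes on the way.
def build_tree(path_to_data):
    root = {"data": None, "children": {}}
    for path, data in path_to_data.items():
        node = root
        for part in path.strip("/").split("/"):
            if not part:
                continue  # the root path "/" has no segments
            # setdefault creates an empty node only if it is missing
            node = node["children"].setdefault(
                part, {"data": None, "children": {}}
            )
        node["data"] = data
    return root


tree = build_tree({"/a/b": 1, "/a/c/d": 2})
c_node = tree["children"]["a"]["children"]["c"]
print(sorted(tree["children"]["a"]["children"]))  # ['b', 'c']
print(c_node["data"])  # None: "c" exists only as an empty intermediate node
```

Only two paths were given, yet the node ``c`` exists in the result, which is why deeply-nested trees do not need an explicit entry for every level.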
342
+
343
+
.. _iterating over trees:
344
+
345
+
Iterating over trees
346
+
~~~~~~~~~~~~~~~~~~~~
347
+
348
+
You can iterate over every node in a tree using the :py:class:`~DataTree.subtree` property.
349
+
This returns an iterable of nodes, which yields them in depth-first order.
350
+
351
+
.. ipython:: python
352
+
353
+
    for node in vertebrates.subtree:
        print(node.path)
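To make "depth-first order" concrete, here is a minimal plain-Python sketch. It is an illustration rather than the ``subtree`` implementation: the ``iter_subtree`` generator and the ``children_of`` mapping are hypothetical, and visiting a parent before its children is an assumption about which depth-first order is meant.

```python
def iter_subtree(path, children_of):
    """Yield `path`, then every descendant path, depth-first."""
    yield path
    for child in children_of.get(path, []):
        # Finish the whole branch under `child` before moving to its sibling
        yield from iter_subtree(child, children_of)


# Hypothetical tree layout: each path maps to the paths of its children
children_of = {
    "/": ["/a", "/b"],
    "/a": ["/a/x", "/a/y"],
}

order = list(iter_subtree("/", children_of))
print(order)  # ['/', '/a', '/a/x', '/a/y', '/b']
```

Every node under ``/a`` is visited before ``/b``, which is the defining feature of a depth-first traversal.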
355
+
356
+
A very useful pattern is to use :py:class:`~DataTree.subtree` in conjunction with the :py:class:`~DataTree.path` property to manipulate the nodes however you wish,
357
+
then rebuild a new tree using :py:meth:`DataTree.from_dict()`.
358
+
359
+
For example, we could keep only the nodes containing data by looping over all nodes,
360
+
checking if they contain any data using :py:class:`~DataTree.has_data`,
361
+
then rebuilding a new tree using only the paths of those nodes:
362
+
363
+
.. ipython:: python
364
+
365
+
    non_empty_nodes = {node.path: node.ds for node in dt.subtree if node.has_data}
    DataTree.from_dict(non_empty_nodes)
367
+
368
+
You can see this tree is similar to the ``dt`` object above, except that it is missing the empty nodes ``a/c`` and ``a/c/d``.
369
+
370
+
(If you want to keep the name of the root node, you will need to add the ``name`` kwarg to :py:meth:`DataTree.from_dict`, i.e. ``DataTree.from_dict(non_empty_nodes, name=dt.root.name)``.)
371
+
372
+
.. _manipulating trees:
373
+
374
+
Manipulating Trees
375
+
------------------
376
+
377
+
Subsetting Tree Nodes
378
+
~~~~~~~~~~~~~~~~~~~~~
379
+
380
+
We can subset our tree to select only nodes of interest in various ways.
381
+
382
+
The :py:meth:`DataTree.filter` method can be used to retain only the nodes of a tree that meet a certain condition.
383
+
For example, we could recreate the Simpsons family tree with the ages of each individual, then filter for only the adults.
384
+
First let's recreate the tree but with an `age` data variable in every node:
385
+
386
+
.. ipython:: python
387
+
388
+
    simpsons = DataTree.from_dict(
        d={
            "/": xr.Dataset({"age": 83}),
            "/Herbert": xr.Dataset({"age": 40}),
            "/Homer": xr.Dataset({"age": 39}),
            "/Homer/Bart": xr.Dataset({"age": 10}),
            "/Homer/Lisa": xr.Dataset({"age": 8}),
            "/Homer/Maggie": xr.Dataset({"age": 1}),
        },
        name="Abe",
    )
    simpsons
400
+
401
+
Now let's filter out the minors:
402
+
403
+
.. ipython:: python
404
+
405
+
    simpsons.filter(lambda node: node["age"] > 18)
406
+
407
+
The result is a new tree, containing only the nodes matching the condition.
408
+
409
+
(Yes, under the hood :py:meth:`~DataTree.filter` is just syntactic sugar for the pattern we showed you in :ref:`iterating over trees`!)
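The "loop, test, rebuild" pattern behind such a filter can be sketched in plain Python. The version below is an illustration and not the ``DataTree.filter`` implementation: ``filter_nodes`` is a hypothetical helper, and a flat dict of paths to ages stands in for the tree.

```python
def filter_nodes(nodes, predicate):
    """Keep only the path -> data entries whose data satisfies `predicate`."""
    return {path: data for path, data in nodes.items() if predicate(data)}


# Plain-dict stand-in for the Simpsons tree: path -> age
ages = {
    "/": 83,
    "/Herbert": 40,
    "/Homer": 39,
    "/Homer/Bart": 10,
    "/Homer/Lisa": 8,
    "/Homer/Maggie": 1,
}

adults = filter_nodes(ages, lambda age: age > 18)
print(sorted(adults))  # ['/', '/Herbert', '/Homer']
```

The surviving paths are exactly what would be passed on to a constructor such as ``DataTree.from_dict`` to rebuild the smaller tree.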
410
+
411
+
.. _tree computation:
412
+
413
+
Computation
414
+
-----------
415
+
416
+
`DataTree` objects are also useful for performing computations, not just for organizing data.
417
+
418
+
Operations and Methods on Trees
419
+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
420
+
421
+
To show how applying operations across a whole tree at once can be useful,
422
+
let's first create an example scientific dataset.
423
+
424
+
.. ipython:: python
425
+
426
+
    def time_stamps(n_samples, T):
        """Create an array of evenly-spaced time stamps"""
Most xarray computation methods also exist as methods on datatree objects,
474
+
so you can for example take the mean value of these two timeseries at once:
475
+
476
+
.. ipython:: python
477
+
478
+
    voltages.mean(dim="time")
479
+
480
+
This works by mapping the standard :py:meth:`xarray.Dataset.mean()` method over the dataset stored in each node of the
481
+
tree one-by-one.
482
+
483
+
The arguments passed to the method are used for every node, so the values of the arguments you pass might be valid for one node and invalid for another:
484
+
485
+
.. ipython:: python
486
+
    :okexcept:

    voltages.isel(time=12)
489
+
490
+
Notice that the error raised helpfully indicates which node of the tree the operation failed on.
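The idea of applying one operation to every node and reporting the failing node can be sketched without xarray. This is a plain-Python illustration under stated assumptions, not how datatree does it: ``map_over_nodes``, the error message wording and the two example signals are all hypothetical.

```python
def map_over_nodes(nodes, func):
    """Apply `func` to each node's data, naming the node if it fails."""
    results = {}
    for path, data in nodes.items():
        try:
            results[path] = func(data)
        except Exception as err:
            # Re-raise with the node path so the caller can locate the problem
            raise RuntimeError(f"operation failed on node {path!r}") from err
    return results


# Two hypothetical signals of different lengths
signals = {"/short": [0.1, 0.2], "/long": [0.1, 0.2, 0.3, 0.4]}

lengths = map_over_nodes(signals, len)
print(lengths)  # {'/short': 2, '/long': 4}

message = None
try:
    # Index 3 is valid for "/long" but out of range for "/short"
    map_over_nodes(signals, lambda s: s[3])
except RuntimeError as err:
    message = str(err)
print(message)  # operation failed on node '/short'
```

The same argument is valid for one node and invalid for another, and the wrapped error says which node was responsible.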
491
+
492
+
Arithmetic Methods on Trees
493
+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
494
+
495
+
Arithmetic methods are also implemented, so you can e.g. add a scalar to every dataset in the tree at once.
496
+
For example, we can advance the timeline of the Simpsons by a decade just by adding 10 to the whole tree:
497
+
498
+
.. ipython:: python
499
+
500
+
    simpsons + 10
501
+
502
+
See that the same change (fast-forwarding by adding 10 years to the age of each character) has been applied to every node.
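One way to picture tree-wide arithmetic is an object whose ``+`` operator forwards the scalar to every node. The class below is a hypothetical plain-Python stand-in (``AgeTree`` is not part of datatree) that stores one number per path instead of a dataset.

```python
class AgeTree:
    """Hypothetical stand-in for a tree holding one age per node path."""

    def __init__(self, ages):
        self.ages = dict(ages)

    def __add__(self, years):
        # Return a new tree; the original is left untouched
        return AgeTree({path: age + years for path, age in self.ages.items()})


family = AgeTree({"/": 83, "/Homer": 39, "/Homer/Maggie": 1})
older = family + 10
print(older.ages)  # {'/': 93, '/Homer': 49, '/Homer/Maggie': 11}
```

The operation produces a new object and applies the same change to every node, matching the behaviour described above.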
503
+
504
+
Mapping Custom Functions Over Trees
505
+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
506
+
507
+
You can map custom computation over each node in a tree using :py:func:`map_over_subtree`.
508
+
You can map any function, so long as it takes `xarray.Dataset` objects as one (or more) of the input arguments,
509
+
and returns one (or more) xarray datasets.
510
+
511
+
.. note::
512
+
513
+
    Functions passed to :py:func:`map_over_subtree` cannot alter nodes in-place.
    Instead they must return new `xarray.Dataset` objects.
515
+
516
+
For example, we can define a function to calculate the Root Mean Square of a timeseries:
517
+
518
+
.. ipython:: python
519
+
520
+
    def rms(signal):
        return np.sqrt(np.mean(signal**2))
522
+
523
+
Then calculate the RMS value of these signals:
524
+
525
+
.. ipython:: python
526
+
527
+
    voltages.map_over_subtree(rms)
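As a quick sanity check of the RMS definition itself, here is a standard-library sketch with hand-checkable numbers. It is independent of xarray and datatree; the signal names and values are made up for illustration, and a flat dict stands in for the tree.

```python
import math


def rms(signal):
    """Root mean square: square each sample, average, then take the root."""
    return math.sqrt(sum(x**2 for x in signal) / len(signal))


# Hypothetical signals, one per node path
signals = {
    "/square": [1.0, -1.0, 1.0, -1.0],  # mean of squares = 1.0
    "/pulse": [3.0, 4.0, 0.0, 0.0],  # mean of squares = 25 / 4 = 6.25
}

# Map the function over every node, returning new values per path
rms_values = {path: rms(signal) for path, signal in signals.items()}
print(rms_values)  # {'/square': 1.0, '/pulse': 2.5}
```

The function returns a new value for each node rather than modifying the input, in line with the note above.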
528
+
529
+
.. _multiple trees:
530
+
531
+
Operating on Multiple Trees
532
+
---------------------------
533
+
534
+
The examples so far have involved mapping functions or methods over the nodes of a single tree,
535
+
but we can generalize this to mapping functions over multiple trees at once.
536
+
537
+
Comparing Trees for Isomorphism
538
+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
539
+
540
+
For it to make sense to map a single non-unary function over the nodes of multiple trees at once,
541
+
each tree needs to have the same structure. Specifically, two trees can only be considered similar, or "isomorphic",
542
+
if they have the same number of nodes, and each corresponding node has the same number of children.
543
+
We can check if any two trees are isomorphic using the :py:meth:`DataTree.isomorphic` method.
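The structural condition can be sketched in plain Python. This is an illustration of the definition, not the ``DataTree.isomorphic`` implementation: trees are represented as hypothetical nested dicts of children, and pairing children by insertion order is an assumption of this sketch.

```python
def count_nodes(tree):
    """Count this node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in tree.values())


def is_isomorphic(a, b):
    """True if corresponding nodes have the same number of children."""
    if len(a) != len(b):
        return False
    # Assumption: children correspond in insertion order; names are ignored
    return all(is_isomorphic(ca, cb) for ca, cb in zip(a.values(), b.values()))


# Each node is a dict mapping child name -> child node
simpsons = {"Herbert": {}, "Homer": {"Bart": {}, "Lisa": {}, "Maggie": {}}}
renamed = {"x": {}, "y": {"p": {}, "q": {}, "r": {}}}
smaller = {"x": {}, "y": {"p": {}}}

print(is_isomorphic(simpsons, renamed))  # True
print(is_isomorphic(simpsons, smaller))  # False
print(count_nodes(simpsons), count_nodes(renamed))  # 6 6
```

Node names play no role in this check: ``renamed`` uses entirely different labels yet has the same shape, while ``smaller`` fails because one node has fewer children.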