Commit e73712c

Quick Overview docs page xarray-contrib/datatree#62
* wrote quick overview page
* extremely basic installation instructions
* version 0.0.1
* updated with .from_dict constructor change
* linting
1 parent bc59b45 commit e73712c

5 files changed, +115 -9 lines changed


docs/source/api.rst

+1-1
@@ -11,7 +11,6 @@ DataTree
    :toctree: generated/

    DataTree
-   DataNode

 Attributes
 ----------
@@ -51,6 +50,7 @@ Methods
 .. autosummary::
    :toctree: generated/

+   DataTree.from_dict
    DataTree.load
    DataTree.compute
    DataTree.persist

docs/source/conf.py

+4-2
@@ -45,6 +45,8 @@
     "sphinx.ext.intersphinx",
     "sphinx.ext.extlinks",
     "sphinx.ext.napoleon",
+    "IPython.sphinxext.ipython_console_highlighting",
+    "IPython.sphinxext.ipython_directive",
 ]

 extlinks = {
@@ -76,9 +78,9 @@
 # built documents.
 #
 # The short X.Y version.
-version = "0.0.0" # datatree.__version__
+version = "0.0.1" # datatree.__version__
 # The full version, including alpha/beta/rc tags.
-release = "0.0.0" # datatree.__version__
+release = "0.0.1" # datatree.__version__

 # The language for content autogenerated by Sphinx. Refer to documentation
 # for a list of supported languages.

docs/source/index.rst

+9-5
@@ -1,17 +1,21 @@
 Datatree
 ========

-**Datatree is a WIP implementation of a tree-like hierarchical data structure for xarray.**
+**Datatree is a prototype implementation of a tree-like hierarchical data structure for xarray.**


 .. toctree::
    :maxdepth: 2
    :caption: Documentation Contents

-   installation
-   tutorial
-   api
-   contributing
+   Installation <installation>
+   Quick Overview <quick-overview>
+   Tutorial <tutorial>
+   API Reference <api>
+   How do I ... <howdoi>
+   Contributing Guide <contributing>
+   Development Roadmap <roadmap>
+   GitHub repository <https://github.com/TomNicholas/datatree>

 Feedback
 --------

docs/source/installation.rst

+18-1
@@ -2,4 +2,21 @@
 Installation
 ============

-Coming soon!
+Datatree is not yet available on PyPI or via conda, so for now you will have to install it from source.
+
+``git clone https://github.com/TomNicholas/datatree.git``
+
+``pip install -e ./datatree/``
+
+The main branch will be kept up to date, so if you clone main and the test suite (``pytest datatree``) runs with no failures,
+then you have a working copy of the most up-to-date version.
+
+You will need xarray and `anytree <https://github.com/c0fec0de/anytree>`_
+as dependencies, with netcdf4, zarr, and h5netcdf as optional dependencies to allow file I/O.
+
+.. note::
+
+    Datatree is still in the early stages of development. Some functions may be present but not yet implemented internally,
+    and the API may change significantly in future.
+    That said, if you try it out and find some behaviour that looks like a bug to you, please report it on the
+    `issue tracker <https://github.com/TomNicholas/datatree/issues>`_!
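
The dependency list in ``installation.rst`` above can be checked before installing from source. The following is only a sketch, not part of the commit: the helper name ``check_dependencies`` is hypothetical, and the import names (in particular ``netCDF4`` for the netcdf4 package) are assumptions about how each package is imported.

```python
from importlib.util import find_spec

# Import names for the dependencies named in installation.rst.
# Assumption: these are the top-level module names; an install name and an
# import name can differ (e.g. the netcdf4 package is imported as netCDF4).
REQUIRED = ("xarray", "anytree")
OPTIONAL = ("netCDF4", "zarr", "h5netcdf")


def check_dependencies(names):
    """Map each top-level module name to True if it can be located, else False."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
        for name, found in check_dependencies(names).items():
            status = "found" if found else "missing"
            print(f"{label:>8}  {name:<10} {status}")
```

Only missing required modules should block installation; missing optional ones just mean the corresponding file I/O backends will be unavailable.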

docs/source/quick-overview.rst

+83
@@ -0,0 +1,83 @@
+##############
+Quick overview
+##############
+
+DataTrees
+---------
+
+:py:class:`DataTree` is a tree-like container of ``DataArray`` objects, organised into multiple mutually alignable groups.
+You can think of it like a (recursive) ``dict`` of ``Dataset`` objects.
+
+Let's first make some example xarray datasets (following on from xarray's
+`quick overview <https://docs.xarray.dev/en/stable/getting-started-guide/quick-overview.html>`_ page):
+
+.. ipython:: python
+
+    import numpy as np
+    import xarray as xr
+
+    data = xr.DataArray(np.random.randn(2, 3), dims=("x", "y"), coords={"x": [10, 20]})
+    ds = xr.Dataset(dict(foo=data, bar=("x", [1, 2]), baz=np.pi))
+    ds
+
+    ds2 = ds.interp(coords={"x": [10, 12, 14, 16, 18, 20]})
+    ds2
+
+    ds3 = xr.Dataset(
+        dict(people=["alice", "bob"], heights=("people", [1.57, 1.82])),
+        coords={"species": "human"},
+    )
+    ds3
+
+Now we'll put this data into a multi-group tree:
+
+.. ipython:: python
+
+    from datatree import DataTree
+
+    dt = DataTree.from_dict(
+        {"root/simulation/coarse": ds, "root/simulation/fine": ds2, "root": ds3}
+    )
+    print(dt)
+
+This creates a datatree with various groups. We have one root group (named ``root``), containing information about individual people.
+The root group then has one subgroup ``simulation``, which contains no data itself but does contain another two subgroups,
+named ``fine`` and ``coarse``.
+
+The (sub-)sub-groups ``fine`` and ``coarse`` contain two very similar datasets.
+They both have an ``"x"`` dimension, but its length differs between the two groups, so the data in one group cannot be aligned with the data in the other.
+In the root group (``root``) we placed some completely unrelated information, showing how we can use a tree to store heterogeneous data.
+
+The constraints on each group are therefore the same as the constraints on dataarrays within a single dataset: the variables within a group must be mutually alignable.
+
+We created the sub-groups using a filesystem-like syntax, and accessing groups works the same way.
+We can access individual dataarrays in a similar fashion:
+
+.. ipython:: python
+
+    dt["simulation/coarse/foo"]
+
+and we can also pull out the data in a particular group as a ``Dataset`` object using ``.ds``:
+
+.. ipython:: python
+
+    dt["simulation/coarse"].ds
+
+Operations map over subtrees, so we can take a mean over the ``x`` dimension of both the ``fine`` and ``coarse`` groups just by calling ``.mean`` on the ``simulation`` node:
+
+.. ipython:: python
+
+    avg = dt["simulation"].mean(dim="x")
+    print(avg)
+
+Here the ``"x"`` dimension used is always the one local to that sub-group.
+
+You can do almost everything you can do with ``Dataset`` objects with ``DataTree`` objects
+(including indexing and arithmetic), as operations will be mapped over every sub-group in the tree.
+This allows you to work with multiple groups of non-alignable variables at once.
+
+.. note::
+
+    If all of your variables are mutually alignable
+    (i.e. they live on the same grid, such that every common dimension name maps to the same length),
+    then you probably don't need :py:class:`DataTree`, and should consider just sticking with ``xarray.Dataset``.
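
The ideas in ``quick-overview.rst`` above (path-keyed construction, filesystem-like access, and operations mapped over every group in a subtree) can be illustrated without xarray. The sketch below is a plain-Python toy and not datatree's implementation: the ``Node`` class, its ``get`` and ``map`` methods, and the module-level ``from_dict`` are hypothetical stand-ins, and lists of floats stand in for datasets.

```python
# Toy illustration only: none of these names come from datatree itself.


class Node:
    """A tree node holding optional data plus named child nodes."""

    def __init__(self, name, data=None):
        self.name = name
        self.data = data
        self.children = {}

    def get(self, path):
        """Follow a '/'-separated path downwards from this node."""
        node = self
        for part in path.split("/"):
            node = node.children[part]
        return node

    def map(self, func):
        """Return a new subtree with func applied to every node that holds data."""
        new = Node(self.name, None if self.data is None else func(self.data))
        for name, child in self.children.items():
            new.children[name] = child.map(func)
        return new


def from_dict(mapping):
    """Build a tree from {"root/a/b": data}; the first path part names the root."""
    root = None
    for path, data in mapping.items():
        first, *rest = path.split("/")
        if root is None:
            root = Node(first)
        elif first != root.name:
            raise ValueError(f"all paths must start at {root.name!r}, got {path!r}")
        node = root
        for part in rest:
            # Create intermediate groups (such as "simulation") on demand.
            node = node.children.setdefault(part, Node(part))
        node.data = data
    return root


def mean(values):
    return sum(values) / len(values)


tree = from_dict(
    {
        "root/simulation/coarse": [1.0, 3.0],
        "root/simulation/fine": [1.0, 2.0, 3.0, 6.0],
        "root": ["alice", "bob"],
    }
)

# The two groups have different lengths, yet one call handles both,
# because the function is applied to each group's own data.
averaged = tree.get("simulation").map(mean)
print(averaged.get("coarse").data, averaged.get("fine").data)
```

As in the overview, the intermediate ``simulation`` node carries no data of its own, and the mapped function only ever sees the data local to each group.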
