cldfviz.tree

The cldfviz.tree command wraps functionality of the Python package toytree to plot (phylogenetic) trees given in Newick format as SVG (which is amenable to post-processing with tools like Inkscape).

Tree specification

In the simplest case, the tree is available as a Newick string. We can use the command's --ascii-art flag for a quick check of the tree topology on the command line:

cldfbench cldfviz.tree --tree "(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;" --ascii-art
    ┌─A
──F─┼─B
    │   ┌─C
    └─E─┤
        └─D

The SVG tree will display proper branch lengths, of course:

cldfbench cldfviz.tree --tree "(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;" --output tree.svg --open
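
To make explicit what the Newick string above encodes, here is a minimal sketch in plain Python that lists each node label with its branch length. It relies on a simple regular expression that is adequate only for this example (unquoted labels starting with a letter). It is not a general Newick parser, and it is not how cldfviz or toytree read trees; the helper name labels_and_lengths is made up for illustration.

```python
import re

newick = "(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;"

# A label (starting with a letter or underscore), optionally followed by
# ":<branch length>". Good enough for this example only.
NODE = re.compile(r"([A-Za-z_]\w*)(?::([0-9.]+))?")


def labels_and_lengths(newick_string):
    """Return {label: branch length or None} for a simple Newick string."""
    return {
        label: float(length) if length else None
        for label, length in NODE.findall(newick_string)
    }


nodes = labels_and_lengths(newick)
for label, length in nodes.items():
    print(label, length)
```

The root F carries no branch length in the string, which is why it maps to None here.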

CLDF datasets may also contain language trees. As an example, we'll plot the classification of Glottolog's Ta-Ne-Omotic family. This classification tree is available in glottolog-cldf v4.7 (which we assume was downloaded and unzipped).

cldfbench cldfviz.tree --tree-dataset glottolog-cldf-4.7/ --tree-id gong1255 \
--output tree.svg --open --width 600

Note that Glottolog's classification trees do not contain meaningful branch lengths.

Tree styling

We tap into toytree's styling options by accepting a Python dict that provides keyword arguments for toytree's draw method. Passing Python code as an argument on the command line is somewhat fragile, so we can instead specify the path to a file containing the code.

With styles defined in a local file styles.py with the following content

dict(
    width=800,
    node_labels='name',
    node_labels_style={"font-size": "10px"},
    node_markers="r10x1.25",
    node_sizes=12,
    node_style={
        "fill": "lightgrey",
        "stroke": "black",
        "stroke-width": 0.75,
    }
)

we can run

cldfbench cldfviz.tree --tree-dataset glottolog-cldf-4.7/ --tree-id gong1255 --output tree.svg \
--styles styles.py --name-as-label --open
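
Since the styles file holds Python code, a quick check before running the command can catch syntax errors early. The sketch below assumes that the file content is a single expression evaluating to a dict, as in the example above. The helper load_styles is hypothetical and not part of cldfviz, and the way cldfviz itself reads the file may differ.

```python
def load_styles(source):
    """Evaluate the text of a styles file and return the resulting dict."""
    # Expose only the dict constructor; no other builtins are available.
    styles = eval(source.strip(), {"__builtins__": {}, "dict": dict})
    if not isinstance(styles, dict):
        raise TypeError("styles file must evaluate to a dict")
    return styles


# Shortened copy of the styles.py content shown above.
example = """
dict(
    width=800,
    node_labels='name',
    node_sizes=12,
    node_style={"fill": "lightgrey", "stroke": "black", "stroke-width": 0.75},
)
"""

styles = load_styles(example)
print(sorted(styles))
```

For a file on disk, the same check could be run as load_styles(pathlib.Path("styles.py").read_text()).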

If the level of styling provided by toytree isn't sufficient, the SVG can easily be post-processed, e.g. to change the aspect ratio, remove the scale bar or colorize individual nodes.
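
Besides a graphical editor, such post-processing can also be scripted. The following is a minimal sketch using Python's standard library that swaps one fill color for another. The sample SVG is a hand-written stand-in, not actual cldfviz output, and the assumption that colors are stored in a fill attribute may not hold for toytree's SVG (it may use style attributes instead), so inspect the real file first. The helper recolor is made up for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Keep SVG as the default namespace when serializing (avoids "ns0:" prefixes).
ET.register_namespace("", SVG_NS)


def recolor(svg_text, old_fill, new_fill):
    """Replace one fill color with another; return (new SVG text, count)."""
    root = ET.fromstring(svg_text)
    changed = 0
    for element in root.iter():
        if element.get("fill") == old_fill:
            element.set("fill", new_fill)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hand-written stand-in for a plot, not actual cldfviz output.
sample = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="600" height="400">'
    '<circle cx="10" cy="10" r="5" fill="lightgrey"/>'
    '<circle cx="30" cy="10" r="5" fill="black"/>'
    "</svg>"
)

new_svg, count = recolor(sample, "lightgrey", "orange")
print(count)
```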

Plotting data on a tree

Plotting language data (such as typological features) on a map is a common way to visualize the potential influence of geography on data. Similarly, plotting data on a genealogical tree can shed light on the influence of genealogy.

Now, there isn't too much variation in Ta-Ne-Omotic for WALS features, so we'll switch to investigating vowel nasalization for Indo-European languages:

cldfbench cldfviz.tree --tree-dataset glottolog-cldf-4.7/ --tree-id indo1319 \
--data-dataset wals-2020.3/ --parameters 10A --output tree.svg \
--tree-label-property Glottocode --name-as-label --open

Notes:

  • We specified a second CLDF dataset to look up the data with --data-dataset.
  • To make sure that Glottocodes are used to match WALS languages to tree labels, we specified --tree-label-property Glottocode.
  • The --name-as-label flag will now use language names from the "data dataset".
  • The tree was pruned to just the languages for which WALS has data on the selected feature.
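
The matching and pruning described in these notes can be pictured with a small conceptual sketch. It assumes that tree labels and data rows are both keyed by Glottocode; the Glottocodes and values below are invented placeholders, and the function prune_to_data is hypothetical rather than cldfviz's actual implementation.

```python
def prune_to_data(tree_labels, values_by_glottocode):
    """Keep only the tree labels for which the data dataset has a value."""
    return [label for label in tree_labels if label in values_by_glottocode]


# Invented placeholder Glottocodes and values, for illustration only.
tree_labels = ["aaaa1111", "bbbb2222", "cccc3333"]
values_by_glottocode = {
    "aaaa1111": "value 1",
    "cccc3333": "value 2",
}

kept = prune_to_data(tree_labels, values_by_glottocode)
print(kept)
```

Here the label without a matching value is dropped, in the same way that languages without data for the selected feature do not appear in the plotted tree.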

As with cldfviz.map, you can also choose multiple parameters:

cldfbench cldfviz.tree --tree-dataset glottolog-cldf-4.7/ --tree-id indo1319 \
--data-dataset wals-2020.3/ --parameters 10A,11A --colormaps tol,boynton \
--output tree.svg --tree-label-property Glottocode --name-as-label --open

Other options

We can also use cldfviz.tree to compare the Glottolog classification with WALS, where Ta-Ne-Omotic is a genus within Afro-Asiatic. WALS' Afro-Asiatic genealogy is available in WALS Online's CLDF dataset, v2020.3. We use the --language-filters option to prune the large Afro-Asiatic tree to the relevant part for comparison (and re-use the styles from above for consistent layout):

cldfbench cldfviz.tree --tree-dataset wals-2020.3/ --tree-id family-afroasiatic --output tree.svg \
--styles styles.py --name-as-label --language-filters '{"Subfamily":"Omotic"}' --open
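
Because the filter is passed as JSON on the command line, shell quoting is easy to get wrong. One way around this, sketched here under the assumption that the command is assembled from a Python script, is to let json.dumps and shlex produce the argument. Only flags shown above are used, and --open is left out since a script usually should not launch a viewer.

```python
import json
import shlex

filters = {"Subfamily": "Omotic"}

argv = [
    "cldfbench", "cldfviz.tree",
    "--tree-dataset", "wals-2020.3/",
    "--tree-id", "family-afroasiatic",
    "--output", "tree.svg",
    "--name-as-label",
    "--language-filters", json.dumps(filters),
]

# shlex.join quotes each argument for a POSIX shell.
command_line = shlex.join(argv)
print(command_line)
```

To run the command directly instead of printing it, the list can be handed to subprocess.run(argv, check=True), which bypasses shell quoting altogether.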