-
Notifications
You must be signed in to change notification settings - Fork 82
/
Copy pathUngulates.Rmd
160 lines (122 loc) · 5.25 KB
/
Ungulates.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
---
title: "Ungulate traits and ecogeography"
author: "Rutger Vos (@rvosa)"
date: "7-12-2018"
output:
word_document: default
html_document:
df_print: paged
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Trait and geospatial data
We have a [tab-separated table](ungulates-continuous.tsv) containing continuous-valued trait data
(including latitude and longitude values) from the
[PanTHERIA](http://esapubs.org/archive/ecol/E090/184/PanTHERIA_1-0_WR05_Aug2008.txt) data base,
filtered on the Ungulates, i.e. the
[Artiodactyla](http://www.departments.bucknell.edu/biology/resources/msw3/browse.asp?id=14200001) and the
[Perissodactyla](http://www.departments.bucknell.edu/biology/resources/msw3/browse.asp?id=14100001).
Here we read that table into a data frame:
```{r readtsv}
# the table has a header, and the 1st column contains the observation names: species
df <- read.table("ungulates-continuous.tsv", sep = "\t", header = T, row.names = 1)
```
We can plot the mid points of the latitudinal and longitudinal ranges of the species, on
an `rworldmap`. Here we load the library, which emits a welcome message:
```{r loadworldmap}
library(rworldmap, quietly = T)
```
Now we plot the map:
```{r plotmap}
newmap <- getMap(resolution = "low")
plot(newmap, asp = 1)
# notice the two columns in the data frame we read
points(df$X26.7_GR_MidRangeLong_dd, df$X26.4_GR_MidRangeLat_dd, col = "red", cex = .6)
```
## Phylogeny
We also have a [phylogenetic tree](ungulates.nwk) in
[Newick](https://github.com/naturalis/mebioda/blob/master/doc/week1/w1d5/lecture1.md#the-newick--new-hampshire-format)
format. It is the topology of a [supertree of the Mammals](http://doi.org/10.1038/nature05634),
but pruned to the samen taxon set as the trait table. To read that tree, we need to load the `ape` package:
```{r readnwk}
library(ape, quietly = T)
tree <- read.tree(file = "ungulates.nwk")
```
The tree is a `phylo` object, a pseudo-standard in the R community that a number of packages build on. For example,
we can plot the tree against a geological timescale using the `strap` package:
```{r plotphylo}
library(strap, quietly = T)
# set the age of the root to absolute time
tree$root.time <- 88.5
geoscalePhylo(tree, units = "Epoch", cex.age = 1, cex.ts = 0.5, boxes = T, show.tip.label = F)
```
## Combining traits and trees
We can combine the trait data and the tree and plot trait values onto the tree. For example,
here is the range size of the species, for which we need `phytools`:
```{r plotmass}
library(phytools, quietly = T)
# make a vector of the area column, keep records in tree, remove records with NA values
log_area <- log(as.vector(df$X26.1_GR_Area_km2))
names(log_area) <- row.names(df)
log_area <- log_area[tree$tip.label]
log_area <- log_area[!is.na(log_area)]
# drop the tips that had NA records for area, remove records not in tree
tips_to_prune <- setdiff(tree$tip.label,names(log_area))
pruned <- drop.tip(tree,tips_to_prune)
# do a continuous trait mapping, of log-transformed mass
contMap(pruned,log_area,standardize=TRUE,length=10,outline=F,lwd=3,ftype="off")
```
## Interaction between traits
We can plot to see how latitude relates to range size:
```{r make_vectors}
# make a vector of absolute latitude, give it names, keep records in tree, remove NA values
abs_lat <- abs(df$X26.4_GR_MidRangeLat_dd)
names(abs_lat) <- row.names(df)
abs_lat <- abs_lat[tree$tip.label]
abs_lat <- abs_lat[!is.na(abs_lat)]
# keep the intersection for each of the two variables
abs_lat <- abs_lat[intersect(names(abs_lat),names(log_area))]
log_area <- log_area[intersect(names(abs_lat),names(log_area))]
```
The INCORRECT approach would be to just plot the two variables against
each other even though there is autocorrelation. Here is what that
would look like:
```{r plotwrong}
fitwrong <- lm(log_area ~ abs_lat)
plot(abs_lat,log_area)
abline(fitwrong)
```
We can connect the data points by their relationships in the phylogeny:
```{r phylomorphospace}
# drop the tips that had NA records for body mass, randomly resolve
tips_to_prune <- setdiff(tree$tip.label,names(log_area))
pruned <- multi2di(drop.tip(tree,tips_to_prune))
phylomorphospace(pruned,data.frame(abs_lat,log_area),label = "off")
```
It looks like the placement of the points in morphospace may be shaped
by phylogeny. One way to deal with such data is to compute
[independent contrasts](https://en.wikipedia.org/wiki/Phylogenetic_comparative_methods#Phylogenetically_independent_contrasts)
and do subsequent linear modeling on those:
```{r ic}
# compute independent contrasts
abs_lat_contrasts <-pic(abs_lat,pruned)
log_area_contrasts <- pic(log_area,pruned)
# fit linear model
fit <- lm(log_area_contrasts ~ abs_lat_contrasts + 0)
plot(abs_lat_contrasts,log_area_contrasts)
abline(fit)
```
There appears to be correlation:
```{r fitsumm}
summary(fit)
```
## Scaling rules
The result of our analysis shows support for
[Rapoport's rule](https://en.wikipedia.org/wiki/Rapoport%27s_rule): an ecogeographical
rule that states that latitudinal ranges of plants and animals are generally smaller at
lower latitudes than at higher latitudes. In addition to this correlation, our data set
surely shows other correlations, such as trivial scalings:
- Body mass ~ body length
- Longevity ~ body mass
- Neonate body mass ~ gestation length