Merge pull request #10 from tomwallis/master

added extreme value analysis
philippberens · Oct 16, 2016 · 6ce30ae · 6ce30ae
2 parents 94baef5 + f5098e1
commit 6ce30ae
Show file tree

Hide file tree

Showing 10 changed files with 626 additions and 2 deletions.
diff --git a/README.md b/README.md
@@ -3,9 +3,11 @@ Re-analysis of data from Dong et al.: [Evidence for a limit to human lifespan] (
 
 [Philipp Berens](http://www.berenslab.org) and [Tom Wallis](http://tomwallis.info)
 
-In their paper, Dong et al. claim to provide statistical evidence for an absolute limit to human lifespan. Here we re-analyze the data from their central figure using model comparison. The data from the central figure of the paper provides no support for or against their model of saturation / decline relative to a simple linear increase, but supplementary data recovered from [the Gerontological Research Group](http://www.grg.org/Adams/A.HTM) *does* provide evidence for their model. Nevertheless, it would be ultimately more appropriate to model these data using Extreme Value Theory. In our opinion, such statistical comparisons should have been part of the Dong et al. paper.
+In their paper, Dong et al. claim to provide statistical evidence for an absolute limit to human lifespan. Here we re-analyze the data from their central figure using model comparison. The data from the central figure of the paper provides no support for or against their model of saturation / decline relative to a simple linear increase, but supplementary data recovered from [the Gerontological Research Group](http://www.grg.org/Adams/A.HTM) *does* provide evidence for their model (see our [R report](analysis.md)). 
 
-For a detailed analysis, see our [R report](analysis.md).
+It would be ultimately more appropriate to model these data using Extreme Value Theory. We have done so [here](extreme_value_analysis.md). While the EVA models clearly fit the data better then the linear models, they did not change the core conclusion above: the GRG data provide evidence for the authors' trend break model over a linear model. The data from Dong et al Fig 2a conversely provide support for the linear model.
+
+In our opinion, such statistical comparisons should have been part of the Dong et al. paper.
 
 [![DOI](https://zenodo.org/badge/70421471.svg)](https://zenodo.org/badge/latestdoi/70421471)
 

diff --git a/extreme_value_analysis.Rmd b/extreme_value_analysis.Rmd
@@ -0,0 +1,250 @@
+---
+title: "Extreme value analysis of “Evidence for a limit to Human Lifespan"
+author: "Tom Wallis and Philipp Berens"
+date: "15 October 2016"
+output: github_document
+---
+
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(echo = TRUE)
+library(ggplot2)
+library(dplyr)
+library(extRemes)
+
+```
+
+
+```{r load_data, echo=FALSE}
+# load data from figure 2a:
+mrad_idl <- read.csv('lifeexpectancy.csv')
+mrad_idl <- round(mrad_idl)
+mrad_idl$Group <- factor(mrad_idl$Year>=1995, levels = c("FALSE", "TRUE"), labels = c("<1995", ">=1995"))
+
+load("GRG_data.RData")
+
+mrad_grg <- grg %>% 
+  select(-deathDatePX) %>% 
+  rename(Year = deathYear) %>% 
+  filter(Year >= 1950) %>% 
+  group_by(Year) %>% 
+  summarise(Age = max(Age)) %>% 
+  mutate(Group = factor(Year>=1995, 
+                        levels = c("FALSE", "TRUE"), 
+                        labels = c("<1995", ">=1995")))
+
+
+# the full grg data for all years:
+mrad_grg_full <- grg %>% 
+  select(-deathDatePX) %>% 
+  rename(Year = deathYear) %>% 
+  group_by(Year) %>% 
+  summarise(Age = max(Age)) %>% 
+  mutate(Group = factor(Year>=1995, 
+                        levels = c("FALSE", "TRUE"), 
+                        labels = c("<1995", ">=1995")))
+
+
+```
+
+
+
+In our [earlier analysis](https://github.com/philippberens/lifespan/blob/master/analysis.md), we compared the "trend break" model of Dong et al to a simple linear model. 
+We found that the dataset reported by the authors in their Fig 2a (from the [IDL Database](http://www.supercentenarians.org/)) offered equivocal support for their model relative to a simple linear model. 
+However, compiling a more complete version of the age-of-death of "verified supercentenarians" (from the Gerontology Research Group [website](http://www.grg.org/Adams/A.HTM)) *did* provide support for their model relative to a simple linear one.
+These data, under these models, support the authors' claim that human maximum lifespan may be saturating or decreasing.
+
+```{r gaussian models}
+idl_linear_model <- lm(Age ~ Year, mrad_idl)
+idl_author_model <- lm(Age ~ Year * Group, mrad_idl)
+
+# prepare data:
+tmp1 <- mrad_idl
+tmp2 <- mrad_idl
+tmp1$model <- "Trend break"
+tmp1$yhat <- predict(idl_author_model)
+tmp2$model <- "Linear"
+tmp2$yhat <- predict(idl_linear_model)
+
+combined_data <- rbind(tmp1, tmp2)
+
+plt <- ggplot(combined_data, aes(x = Year, y = Age, colour = Group)) + 
+  facet_grid(~ model) + 
+  geom_point() +
+  geom_line(aes(y = yhat))
+
+# appearance:
+plt <- plt + 
+  scale_y_continuous(name = "Yearly maximum reported age at death (years)",
+                     breaks = seq(110, 126, by = 2)) + 
+  ylab("Yearly maximum reported age at death (years)") +
+  theme_minimal(base_size = 8) + 
+  theme(panel.grid.major = element_line(colour = "grey90", size = 0.5)) + 
+  theme(panel.margin = unit(2, "lines")) + 
+  scale_color_manual(values = c("#4A87CB", "#E76826"), name = "") + 
+  theme(legend.position = "bottom")
+
+plt
+
+```
+
+However, the data under consideration are the yearly *maxima* of a much larger distribution (age-at-death for the human population).
+The models considered above assume that the data will be subject to noise that is Gaussian distributed.
+A plot of the residuals of the model fit shows that this assumption isn't met: there are three datapoints that lie well outside the range expected by the error model.
+
+```{r author_model_assumptions}
+plot(idl_author_model, which = 1)
+
+```
+
+This is very likely because of the meaning of the underlying data: they are *extreme values* drawn from a larger distribution.
+Thus, the models assumed above are not appropriate for modelling these data.
+
+# Extreme value analysis
+
+The appropriate approach to statistical inference based on these data is to use extreme value analysis (e.g. Coles, 2001), which seeks to model data with a low probability of occurrence (for example, a person living over the age of 110).
+Specifically, the data at hand are an example of *block maxima*, in which the maximum is computed for each "block" (a year in this case) of a distribution (here, the distribution of age-at-death).
+The distribution of block maxima are known to follow a Generalised Extreme Value (GEV) distribution as the number of blocks approaches infinity (see Coles, 2001 or Gilleland & Katz, 2016).
+The GEV has three parameters: a location parameter that defines the centre of the distribution, a scale that defines its spread, and a shape parameter that defines the weight of the tails.
+Specifically, if shape = 0, the GEV is the Gumbel distribution; if shape > 0 the distribution is Frechet and has heavier tails; if shape < 0 the distribution is reverse Weibull and has a bounded upper tail.
+
+# IDL dataset from Dong et al Fig 2a
+
+Fitting a GEV model to the MRAD data from the IDL set (using the `extRemes` package in R; Gilleland & Katz (2016)) produces:
+
+```{r idl_stationary, fig.width=8, fig.height=7}
+
+idl_fit1 <- fevd(mrad_idl$Age, mrad_idl, units = "MRAD")
+summary(idl_fit1)
+
+```
+
+We can see that the location parameter is `r signif(distill(idl_fit1)["location"], 2)`, which is plausible given the range of the data.
+The shape parameter is 0.10 (with standard error 0.13), indicating that the distribution of values here is approximately Gumbel.
+
+<!-- We can also see that the AIC of this model is 146 and the BIC is 150.6. -->
+<!-- Compared to this, the authors' model assuming Gaussian noise is better according to AIC (`r signif(AIC(idl_author_model), 2)`) and about the same according to BIC (`r signif(BIC(idl_author_model), 2)`). -->
+
+
+```{r plot_stationary, fig.width=8, fig.height=7}
+plot(idl_fit1)
+```
+
+The diagnostic plots above show that the quantiles of the model correspond reasonably to the empirical quantiles.
+
+## Linear model
+
+The model above assumes that the distribution of maximum age is stationary (that is, does not change as a function of year). 
+The `extRemes` package allows the specification of linear models for each parameter of the GEV. That is, we can test whether the location parameter (the centre of the distribution of maximum ages) has some relationship with Year.
+
+Dong et al claimed that maximum age saturates or decreases after 1995.
+A simple comparison model is of a linear increase in maximum age. 
+While the predictions of this model for large extrapolations are debatable, we think it serves as a good starting point.
+
+
+```{r linear_model, fig.width=8, fig.height=7}
+idl_fit2 <- fevd(mrad_idl$Age, mrad_idl, location.fun = ~ Year)  # linear increase
+summary(idl_fit2)
+
+```
+
+We can see that this model shows a positive slope for the location parameter of the GEV (the centre of the distribution of maximum ages). 
+This model substantially beats the stationary model in terms of AIC and BIC. Because these are nested (the stationary model has just an intercept term on the location parameter), we can also perform a likelihood ratio test of the two models:
+
+```{r lm_stationary_lr_test}
+lr.test(idl_fit1, idl_fit2)
+
+```
+
+This agrees with the AIC and BIC, indicating strong support for the linear model.
+
+## Trend break model
+
+What about the Dong et al model, with a "trend break" at 1995?
+
+```{r trend_break_model, fig.width=8, fig.height=7}
+idl_fit3 <- fevd(mrad_idl$Age, mrad_idl, location.fun = ~ Year * Group)  # trend break
+summary(idl_fit3)
+
+```
+
+The AIC for this model is larger than the linear model by `r signif(137.8595 - 134.7466, 2)`, indicating that while the linear model is preferred, the trend break model still has some support (Burnham & Anderson, 2001).
+The BIC of the trend break model is also larger than the linear model, by `r signif(146.8385 - 140.7326, 2)`, indicating "positive" evidence for the linear model.
+Finally, the likelihood ratio test is:
+
+```{r trend_break_linear_lr_test}
+lr.test(idl_fit2, idl_fit3)
+```
+
+Because the p-value is larger than the customary .05 cutoff, we fail to reject the null model (the simpler linear model).
+Thus, under the more appropriate error assumptions of the GEV, the dataset plotted by Dong et al in their Figure 2a offers, if anything, positive support for a simpler linear model over the "trend break" model.
+
+# GRG data
+
+Next, we consider the evidence from the Gerontology Research Group.
+In our [earlier analysis](https://github.com/philippberens/lifespan/blob/master/analysis.md), this data had offered strong positive support for the authors' model over a linear one.
+In collaboration with [Adam Lenart](http://www.sdu.dk/staff/alenart), we recovered the full dataset from the Gerontology Research Group [website](http://www.grg.org/Adams/A.HTM), assuming that this was the source of the data the authors used.
+This is a dataset of "verified supercentenarians" as of January 1, 2014.
+
+```{r plot_mrad_grg}
+plt <- ggplot(mrad_grg, aes(x = Year, y = Age)) +
+  geom_point(alpha = 0.5) + 
+  stat_smooth(method = "loess") + 
+  ylab("Yearly maximum reported age at death (years)")
+  
+plt
+```
+
+The smoother in the above plot is a local polynomial regression.
+We have discarded years below 1950 in keeping with the authors' analysis (note their figure caption stating a lower bound year of 1972 is wrong given the data in the plot).
+
+Below, we fit and report the statistics for the stationary, linear and trend break models, as above.
+
+```{r stationary_grg}
+
+grg_fit1 <- fevd(mrad_grg$Age, mrad_grg)
+summary(grg_fit1)
+
+```
+
+As for the IDL dataset, the shape parameter is not distinguishable from zero, indicating a Gumbel distribution is approximately appropriate.
+
+```{r linear_model_grg}
+grg_fit2 <- fevd(mrad_grg$Age, mrad_grg, location.fun = ~ Year)  # linear increase
+summary(grg_fit2)
+lr.test(grg_fit1, grg_fit2)
+```
+
+Here, the linear model is a much better fit to the data (by AIC, BIC and LR test) than the stationary model.
+
+```{r trendbreak_model_grg}
+grg_fit3 <- fevd(mrad_grg$Age, mrad_grg, 
+                 location.fun = ~ Year * Group)  # trend break
+summary(grg_fit3)
+lr.test(grg_fit2, grg_fit3)
+```
+
+Here, the trend break model is a much better fit to the data than the linear model (by AIC, BIC and LR test). 
+
+```{r plot_trendbreak_grg, fig.width=8, fig.height=7}
+plot(grg_fit3)
+```
+
+Diagnostic plots of the model show very good fits.
+
+
+# Conclusion
+
+Even considering a model of the data using the more appropriate extreme value analysis (block maxima as a Generalised Extreme Value distribution), the GRG dataset supports the authors' trend break model over a simple linear increase in maximum age as a function of year.
+However, the data plotted in the authors' Fig 2a if anything support the linear model.
+
+
+
+# References
+
+* Burnham, K. P., & Anderson, D. R. (2002): Model selection and multimodel inference a practical information-theoretic approach. New York: Springer. 
+
+* Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. London: Springer London. 
+
+* Gilleland, E., & Katz, R. W. (2016). extRemes 2.0: An Extreme Value Analysis Package in R. Journal of Statistical Software, 72(8). 
+
+* Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 111–163.