Commit 8501763: replace <- with = in stan code to meet new restrictions

nicholasjclark committed Jul 14, 2022
1 parent c1ff95d commit 8501763
Showing 24 changed files with 291 additions and 274 deletions.
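Background for this change: recent Stan releases deprecate the old assignment operator `<-` in favour of `=`, so generated Stan code such as `eta <- X * b;` must now be written as `eta = X * b;`. The updates below regenerate the package's case studies and model files under this convention (the JAGS model files keep `<-`, which JAGS still requires for deterministic assignment).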
Binary file removed .DS_Store
Binary file removed NEON_manuscript/.DS_Store
Binary file removed NEON_manuscript/Case studies/.DS_Store
13 changes: 9 additions & 4 deletions NEON_manuscript/Case studies/mvgam_case_study1.Rmd
@@ -21,7 +21,7 @@ Generalised Additive Models (GAMs) are incredibly flexible tools that have found
We will work in the `mvgam` package, which fits dynamic GAMs (DGAMs) using MCMC sampling (note that either `JAGS` or `Stan` is required; installation links are found [here](https://sourceforge.net/projects/mcmc-jags/files/) and [here](https://mc-stan.org/users/interfaces/rstan)). Briefly, assume $\tilde{\boldsymbol{y}}_{t}$ is the conditional expectation of a discrete response variable $\boldsymbol{y}$ at time $\boldsymbol{t}$. Assuming $\boldsymbol{y}$ is drawn from an exponential family distribution (such as Poisson or Negative Binomial) with a log link function, the linear predictor for a dynamic GAM is written as:

$$log(\tilde{\boldsymbol{y}}_{t})={\boldsymbol{B}}_0+\sum\limits_{i=1}^I\boldsymbol{s}_{i,t}\boldsymbol{x}_{i,t}+\boldsymbol{z}_{t}\,,$$
-Here $\boldsymbol{B}_{0}$ is the unknown intercept, the $\boldsymbol{s}$'s are unknown smooth functions of the covariates ($\boldsymbol{x}$'s) and $\boldsymbol{z}$ is a dynamic latent trend component. Each smooth function $\boldsymbol{s}_{i}$ is composed of spline like basis expansions whose coefficients, which must be estimated, control the shape of the functional relationship between $\boldsymbol{x}_{i}$ and $log(\tilde{\boldsymbol{y}})$. The size of the basis expansion limits the smooth’s potential complexity, with a larger set of basis functions allowing greater flexibility. Several advantages of GAMs are that they can model a diversity of response families, including discrete distributions (i.e. Poisson, Negative Binomial, Tweedie-Poisson) that accommodate common ecological features such as zero-inflation or overdispersion, and that they can be formulated to include hierarchical smoothing for multivariate responses. For the dynamic component, in its most basic form we assume a random walk with drift:
+Here $\boldsymbol{B}_{0}$ is the unknown intercept, the $\boldsymbol{s}$'s are unknown smooth functions of the covariates ($\boldsymbol{x}$'s) and $\boldsymbol{z}$ is a dynamic latent trend component. Each smooth function $\boldsymbol{s}_{i}$ is composed of basis expansions whose coefficients, which must be estimated, control the shape of the functional relationship between $\boldsymbol{x}_{i}$ and $log(\tilde{\boldsymbol{y}})$. The size of the basis expansion limits the smooth’s potential complexity, with a larger set of basis functions allowing greater flexibility. Several advantages of GAMs are that they can model a diversity of response families, including discrete distributions (i.e. Poisson, Negative Binomial, Tweedie-Poisson) that accommodate common ecological features such as zero-inflation or overdispersion, and that they can be formulated to include hierarchical smoothing for multivariate responses. For the dynamic component, in its most basic form we assume a random walk with drift:

$$\boldsymbol{z}_{t}=\phi+\boldsymbol{z}_{t-1}+\boldsymbol{e}_{t}\,,$$
where $\phi$ is an optional drift parameter (if the latent trend is assumed not to be stationary) and $\boldsymbol{e}$ is drawn from a zero-centred Gaussian distribution. This model is easily modified to include autoregressive terms, which `mvgam` accommodates up to `order = 3`.
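To make the generative model concrete, the following minimal R sketch (illustrative only; the series length, drift value and seasonal shape are assumptions, not package defaults) simulates a Poisson series whose log-mean combines a smooth seasonal effect with a random walk with drift:
```{r}
set.seed(42)
n <- 100                             # number of timepoints (illustrative)
season <- sin(2 * pi * (1:n) / 12)   # stand-in for a smooth function s(season)
phi <- 0.05                          # drift parameter
z <- numeric(n)                      # latent random walk with drift
for (t in 2:n) {
  z[t] <- phi + z[t - 1] + rnorm(1, sd = 0.25)
}
# discrete observations via the log link
y <- rpois(n, lambda = exp(0.5 + season + z))
plot(y, type = 'l', xlab = 'Time', ylab = 'Simulated counts')
```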
@@ -33,7 +33,7 @@ Dynamic GAMs are useful when we wish to predict future values from time series t
# Fit a model to the mcycle dataset in the MASS package, fixing the smoothing parameter at an abnormally large value
par(mfrow = c(2, 2),
mgp = c(2.5, 1, 0),
-mai = c(0.6, 0.6, 0.2, 0.2))
+mai=c(c(0.7, 0.7, 0.2, 0.2)))
library(mgcv)
data(mcycle, package = 'MASS')
@@ -113,7 +113,7 @@ We can view the summary of the `mgcv` model to see that the posterior estimates
summary(m_mgcv)
```

-Forecasting from this model requires us to write a custom function that can recursively update the design matrix at each successive one-step ahead prediction. Here we define a function to iteratively propagate a forecast forward according to the AR3 observation model. Note that we have used an argument to allow the lags to be fed into the model on the log scale, this will make more sense in a moment as we inspect the forecasts of the model
+Forecasting from this model requires us to write a custom function that can recursively update the design matrix (by filling in the values for the dynamic `lag(y)` covariates) at each successive one-step ahead prediction. Here we define a function to iteratively propagate a forecast forward according to the AR3 observation model. Note that we have used an argument to allow the lags to be fed into the model on the log scale, this will make more sense in a moment as we inspect the forecasts of the model
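The general idea can be sketched as follows (a hypothetical, simplified stand-in, not the actual `recurse_ar3` definition given below; the `gam_model`, `last_obs` and `h` names are purely illustrative):
```{r}
# Hypothetical sketch: recursively forecast h steps ahead from a GAM whose
# covariates are the three most recent (log-scale) lags of the response
recurse_sketch <- function(gam_model, last_obs, h = 10) {
  lags <- log(tail(last_obs, 3) + 0.01)   # three most recent observations
  preds <- numeric(h)
  for (i in seq_len(h)) {
    newdata <- data.frame(lag1 = lags[3], lag2 = lags[2], lag3 = lags[1])
    preds[i] <- predict(gam_model, newdata = newdata, type = 'response')
    # update the design: drop the oldest lag and append the new prediction
    lags <- c(lags[-1], log(preds[i] + 0.01))
  }
  preds
}
```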

<details>
<summary>Click for definition of `recurse_ar3`</summary>
@@ -194,7 +194,7 @@ summary(m_mvgam)
```


-The model file can also be viewed, which can be handy for making modifications to the model so that extra stochastic elements can be added to a user's bespoke model. Notice how the summary at the top of the model file describes the data elements (and their dimensions) that are needed to condition the model, again the hope is tht this simplifies the task of modifying the model for any additional user-specific requirements
+The model file can also be viewed, which can be handy for making modifications to the model so that extra stochastic elements can be added to a user's bespoke model. Notice how the summary at the top of the model file describes the data elements (and their dimensions) that are needed to condition the model, again the hope is that this simplifies the task of modifying the model for any additional user-specific requirements
```{r}
m_mvgam$model_file
```
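For example, one convenient pattern (an illustrative suggestion, not a package requirement) is to write the generated code to disk so it can be edited in a text editor and run manually through `rjags` or `rstan`:
```{r}
# dump the generated model code to a text file for manual editing
writeLines(m_mvgam$model_file, 'my_custom_model.txt')
```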
@@ -1114,3 +1114,8 @@ Both components contribute to forecast uncertainty, with the trend component con
```{r, fig.width=6, fig.height=6, fig.align='center'}
plot_mvgam_resids(lynx_mvgam, n_bins = 25)
```

+This same plot can also be produced for out-of-sample residuals if a set of test data is supplied (note that the function assumes the test data follow immediately after the training data; there are no formal checks of this, so it is up to the user to ensure it holds)
+```{r, fig.width=6, fig.height=6, fig.align='center'}
+plot_mvgam_resids(lynx_mvgam, n_bins = 25, data_test = lynx_test)
+```
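Because no alignment check is performed internally, it is worth verifying this yourself first; a minimal sanity check (assuming the usual `mvgam` data format with a `time` column, and that the training data live in an object like `lynx_train`) might be:
```{r}
# confirm the test set starts exactly one timestep after the training set ends
stopifnot(min(lynx_test$time) == max(lynx_train$time) + 1)
```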
191 changes: 100 additions & 91 deletions NEON_manuscript/Case studies/mvgam_case_study1.html (large diff not rendered)

Binary file removed NEON_manuscript/Case studies/rsconnect/.DS_Store
@@ -7,5 +7,5 @@ hostUrl: rpubs.com
appId: https://api.rpubs.com/api/v1/document/856585/540d5299dc584aff9095b986b978429b
bundleId: https://api.rpubs.com/api/v1/document/856585/540d5299dc584aff9095b986b978429b
url: http://rpubs.com/NickClark47/856585
-when: 1657508060.90589
-lastSyncTime: 1657508060.90589
+when: 1657751344.84995
+lastSyncTime: 1657751344.84995
Binary file removed NEON_manuscript/Figures/.DS_Store
Binary file removed NEON_manuscript/Manuscript/.DS_Store
Binary file modified NEON_manuscript/Manuscript/Appendix_S1.pdf
314 changes: 157 additions & 157 deletions NEON_manuscript/Manuscript/Appendix_S5_IxodesJAGS_Hyp3.txt
@@ -1,157 +1,157 @@
JAGS model code generated by package mvgam

GAM formula:
y ~ s(siteID, bs = re) + s(cum_gdd, k = 5) + s(cum_gdd, siteID, k = 5, m = 1, bs = fs) + s(season, m = 2, bs = cc, k = 12) + s(season, series, m = 1, k = 4, bs = fs)

Trend model:
RW

Required data:
integer n; number of timepoints per series
integer n_series; number of series
matrix y; time-ordered observations of dimension n x n_series (missing values allowed)
matrix ytimes; time-ordered n x n_series matrix (which row in X belongs to each [time, series] observation?)
matrix X; mgcv GAM design matrix of dimension (n x n_series) x basis dimension
matrix S2; mgcv smooth penalty matrix S2
matrix S4; mgcv smooth penalty matrix S4
vector zero; prior basis coefficient locations vector of length ncol(X)
vector p_coefs; vector (length = 1) of prior Gaussian means for parametric effects
vector p_taus; vector (length = 1) of prior Gaussian precisions for parametric effects


#### Begin model ####
model {

  ## GAM linear predictor
  eta <- X %*% b

  ## mean expectations
  for (i in 1:n) {
    for (s in 1:n_series) {
      mu[i, s] <- exp(eta[ytimes[i, s]] + trend[i, s])
    }
  }

  ## latent factors evolve as time series with penalised precisions;
  ## the penalty terms force any un-needed factors to evolve as flat lines
  for (j in 1:n_lv) {
    LV[1, j] ~ dnorm(0, penalty[j])
  }

  for (i in 2:n) {
    for (j in 1:n_lv) {
      LV[i, j] ~ dnorm(LV[i - 1, j], penalty[j])
    }
  }

  ## shrinkage penalties for each factor's precision act to squeeze
  ## the entire factor toward a flat white noise process if supported by
  ## the data. The prior for individual factor penalties allows each factor to possibly
  ## have a relatively large penalty, which shrinks the prior for that factor's variance
  ## substantially. Penalties increase exponentially with the number of factors following
  ## Welty, Leah J., et al. Bayesian distributed lag models: estimating effects of particulate
  ## matter air pollution on daily mortality. Biometrics 65.1 (2009): 282-291.
  pi ~ dunif(0, n_lv)
  X2 ~ dnorm(0, 1)T(0, )

  # eta1 controls the baseline penalty
  eta1 ~ dunif(-1, 1)

  # eta2 controls how quickly the penalties exponentially increase
  eta2 ~ dunif(-1, 1)

  for (t in 1:n_lv) {
    X1[t] ~ dnorm(0, 1)T(0, )
    l.dist[t] <- max(t, pi[])
    l.weight[t] <- exp(eta2[] * l.dist[t])
    l.var[t] <- exp(eta1[] * l.dist[t] / 2) * 1
    theta.prime[t] <- l.weight[t] * X1[t] + (1 - l.weight[t]) * X2[]
    penalty[t] <- max(0.0001, theta.prime[t] * l.var[t])
  }

  ## latent factor loadings: standard normal with identifiability constraints
  ## upper triangle of loading matrix set to zero
  for (j in 1:(n_lv - 1)) {
    for (j2 in (j + 1):n_lv) {
      lv_coefs[j, j2] <- 0
    }
  }

  ## positive constraints on loading diagonals
  for (j in 1:n_lv) {
    lv_coefs[j, j] ~ dnorm(0, 1)T(0, 1);
  }

  ## lower diagonal free
  for (j in 2:n_lv) {
    for (j2 in 1:(j - 1)) {
      lv_coefs[j, j2] ~ dnorm(0, 1)T(-1, 1);
    }
  }

  ## other elements also free
  for (j in (n_lv + 1):n_series) {
    for (j2 in 1:n_lv) {
      lv_coefs[j, j2] ~ dnorm(0, 1)T(-1, 1);
    }
  }

  ## trend evolution depends on latent factors
  for (i in 1:n) {
    for (s in 1:n_series) {
      trend[i, s] <- inprod(lv_coefs[s, ], LV[i, ])
    }
  }

  ## likelihood functions
  for (i in 1:n) {
    for (s in 1:n_series) {
      y[i, s] ~ dnegbin(rate[i, s], r[s])
      rate[i, s] <- ifelse((r[s] / (r[s] + mu[i, s])) < min_eps, min_eps,
                           (r[s] / (r[s] + mu[i, s])))
    }
  }

  ## complexity penalising prior for the overdispersion parameter;
  ## where the likelihood reduces to a 'base' model (Poisson) unless
  ## the data support overdispersion
  for (s in 1:n_series) {
    r[s] <- 1 / r_raw[s]
    r_raw[s] ~ dnorm(0, 0.1)T(0, )
  }

  ## posterior predictions
  for (i in 1:n) {
    for (s in 1:n_series) {
      ypred[i, s] ~ dnegbin(rate[i, s], r[s])
    }
  }

  ## GAM-specific priors
  ## parametric effect priors (regularised for identifiability)
  for (i in 1:1) { b[i] ~ dnorm(p_coefs[i], p_taus[i]) }
  ## prior (non-centred) for s(siteID)...
  for (i in 2:4) {
    b_raw[i] ~ dnorm(0, 1)
    b[i] <- mu_raw1 + b_raw[i] * sigma_raw1
  }
  sigma_raw1 ~ dexp(1)
  mu_raw1 ~ dnorm(0, 1)
  ## prior for s(cum_gdd)...
  K2 <- S2[1:4, 1:4] * lambda[2] + S2[1:4, 5:8] * lambda[3]
  b[5:8] ~ dmnorm(zero[5:8], K2)
  ## prior for s(cum_gdd,siteID)...
  for (i in c(9:12, 14:17, 19:22)) { b[i] ~ dnorm(0, lambda[4]) }
  for (i in c(13, 18, 23)) { b[i] ~ dnorm(0, lambda[5]) }
  ## prior for s(season)...
  K4 <- S4[1:10, 1:10] * lambda[6]
  b[24:33] ~ dmnorm(zero[24:33], K4)
  ## prior for s(season,series)...
  for (i in c(34:36, 38:40, 42:44, 46:48, 50:52, 54:56, 58:60, 62:64)) { b[i] ~ dnorm(0, lambda[7]) }
  for (i in c(37, 41, 45, 49, 53, 57, 61, 65)) { b[i] ~ dnorm(0, lambda[8]) }
  ## smoothing parameter priors...
  for (i in 1:8) {
    lambda[i] ~ dexp(0.05)T(0.0001, )
    rho[i] <- log(lambda[i])
  }
}
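To see how this exponentially increasing penalty behaves, here is a small R sketch (illustrative only; the factor count and the particular random draws are assumptions) that mimics the prior construction above:
```{r}
set.seed(1)
n_lv <- 4                   # number of latent factors (illustrative)
pi_ <- runif(1, 0, n_lv)    # changepoint: factors beyond this get heavier penalties
X1 <- abs(rnorm(n_lv)); X2 <- abs(rnorm(1))
eta1 <- runif(1, -1, 1)     # controls the baseline penalty
eta2 <- runif(1, -1, 1)     # controls how quickly penalties increase
l.dist <- pmax(1:n_lv, pi_)
l.weight <- exp(eta2 * l.dist)
l.var <- exp(eta1 * l.dist / 2)
theta.prime <- l.weight * X1 + (1 - l.weight) * X2
penalty <- pmax(0.0001, theta.prime * l.var)
penalty                     # precision penalty applied to each factor
```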
7 changes: 5 additions & 2 deletions NEON_manuscript/next_todo.R
@@ -8,11 +8,14 @@ mod1 <- mvgam(formula = y ~ s(season) + s(series, bs = 're'),
run_model = FALSE)
mod1$model_file

-mod2 <- mvgam(formula = y~s(season),
+mod2 <- mvgam(formula = y~year,
data_train = dat$data_train,
trend_model = 'RW',
family = 'poisson',
-run_model = TRUE)
+run_model = TRUE,
+use_stan = TRUE)
mod2$model_file

plot(mod2, 'smooths', residuals = TRUE, derivatives = TRUE)
compare_mvgams(model1 = mod1, model2 = mod2, fc_horizon = 6,
n_evaluations = 30, n_cores = 3)