Commit 8501763: replace <- with = in stan code to meet new restrictions

nicholasjclark committed Jul 14, 2022
1 parent c1ff95d commit 8501763
Showing 24 changed files with 291 additions and 274 deletions.
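Background for this change: recent Stan releases deprecate the old assignment operator `<-` in favour of `=`, so generated Stan code such as `eta <- X * b;` must now be written as `eta = X * b;`. The updates below regenerate the package's case studies and model files under this convention (the JAGS model files keep `<-`, which JAGS still requires for deterministic assignment).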
Binary file removed .DS_Store
Binary file removed NEON_manuscript/.DS_Store
Binary file removed NEON_manuscript/Case studies/.DS_Store
13 changes: 9 additions & 4 deletions NEON_manuscript/Case studies/mvgam_case_study1.Rmd
@@ -21,7 +21,7 @@ Generalised Additive Models (GAMs) are incredibly flexible tools that have found
We will work in the `mvgam` package, which fits dynamic GAMs (DGAMs) using MCMC sampling (note that either `JAGS` or `Stan` is required; installation links are found [here](https://sourceforge.net/projects/mcmc-jags/files/) and [here](https://mc-stan.org/users/interfaces/rstan)). Briefly, assume $\tilde{\boldsymbol{y}}_{t}$ is the conditional expectation of a discrete response variable $\boldsymbol{y}$ at time $\boldsymbol{t}$. Assuming $\boldsymbol{y}$ is drawn from an exponential family distribution (such as Poisson or Negative Binomial) with a log link function, the linear predictor for a dynamic GAM is written as:

$$log(\tilde{\boldsymbol{y}}_{t})={\boldsymbol{B}}_0+\sum\limits_{i=1}^I\boldsymbol{s}_{i,t}\boldsymbol{x}_{i,t}+\boldsymbol{z}_{t}\,,$$
-Here $\boldsymbol{B}_{0}$ is the unknown intercept, the $\boldsymbol{s}$'s are unknown smooth functions of the covariates ($\boldsymbol{x}$'s) and $\boldsymbol{z}$ is a dynamic latent trend component. Each smooth function $\boldsymbol{s}_{i}$ is composed of spline like basis expansions whose coefficients, which must be estimated, control the shape of the functional relationship between $\boldsymbol{x}_{i}$ and $log(\tilde{\boldsymbol{y}})$. The size of the basis expansion limits the smooth’s potential complexity, with a larger set of basis functions allowing greater flexibility. Several advantages of GAMs are that they can model a diversity of response families, including discrete distributions (i.e. Poisson, Negative Binomial, Tweedie-Poisson) that accommodate common ecological features such as zero-inflation or overdispersion, and that they can be formulated to include hierarchical smoothing for multivariate responses. For the dynamic component, in its most basic form we assume a random walk with drift:
+Here $\boldsymbol{B}_{0}$ is the unknown intercept, the $\boldsymbol{s}$'s are unknown smooth functions of the covariates ($\boldsymbol{x}$'s) and $\boldsymbol{z}$ is a dynamic latent trend component. Each smooth function $\boldsymbol{s}_{i}$ is composed of basis expansions whose coefficients, which must be estimated, control the shape of the functional relationship between $\boldsymbol{x}_{i}$ and $log(\tilde{\boldsymbol{y}})$. The size of the basis expansion limits the smooth’s potential complexity, with a larger set of basis functions allowing greater flexibility. Several advantages of GAMs are that they can model a diversity of response families, including discrete distributions (i.e. Poisson, Negative Binomial, Tweedie-Poisson) that accommodate common ecological features such as zero-inflation or overdispersion, and that they can be formulated to include hierarchical smoothing for multivariate responses. For the dynamic component, in its most basic form we assume a random walk with drift:

$$\boldsymbol{z}_{t}=\phi+\boldsymbol{z}_{t-1}+\boldsymbol{e}_{t}\,,$$
where $\phi$ is an optional drift parameter (if the latent trend is assumed not to be stationary) and $\boldsymbol{e}$ is drawn from a zero-centred Gaussian distribution. This model is easily modified to include autoregressive terms, which `mvgam` accommodates up to `order = 3`.
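To make the generative model concrete, the following minimal R sketch (illustrative only; the series length, drift value and seasonal shape are assumptions, not package defaults) simulates a Poisson series whose log-mean combines a smooth seasonal effect with a random walk with drift:
```{r}
set.seed(42)
n <- 100                             # number of timepoints (illustrative)
season <- sin(2 * pi * (1:n) / 12)   # stand-in for a smooth function s(season)
phi <- 0.05                          # drift parameter
z <- numeric(n)                      # latent random walk with drift
for (t in 2:n) {
  z[t] <- phi + z[t - 1] + rnorm(1, sd = 0.25)
}
# discrete observations via the log link
y <- rpois(n, lambda = exp(0.5 + season + z))
plot(y, type = 'l', xlab = 'Time', ylab = 'Simulated counts')
```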
@@ -33,7 +33,7 @@ Dynamic GAMs are useful when we wish to predict future values from time series t
# Fit a model to the mcycle dataset in the MASS package, fixing the smoothing parameter at an abnormally large value
par(mfrow = c(2, 2),
mgp = c(2.5, 1, 0),
-mai = c(0.6, 0.6, 0.2, 0.2))
+mai=c(c(0.7, 0.7, 0.2, 0.2)))
library(mgcv)
data(mcycle, package = 'MASS')
@@ -113,7 +113,7 @@ We can view the summary of the `mgcv` model to see that the posterior estimates
summary(m_mgcv)
```

-Forecasting from this model requires us to write a custom function that can recursively update the design matrix at each successive one-step ahead prediction. Here we define a function to iteratively propagate a forecast forward according to the AR3 observation model. Note that we have used an argument to allow the lags to be fed into the model on the log scale, this will make more sense in a moment as we inspect the forecasts of the model
+Forecasting from this model requires us to write a custom function that can recursively update the design matrix (by filling in the values for the dynamic `lag(y)` covariates) at each successive one-step ahead prediction. Here we define a function to iteratively propagate a forecast forward according to the AR3 observation model. Note that we have used an argument to allow the lags to be fed into the model on the log scale, this will make more sense in a moment as we inspect the forecasts of the model
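The general idea can be sketched as follows (a hypothetical, simplified stand-in, not the actual `recurse_ar3` definition given below; the `gam_model`, `last_obs` and `h` names are purely illustrative):
```{r}
# Hypothetical sketch: recursively forecast h steps ahead from a GAM whose
# covariates are the three most recent (log-scale) lags of the response
recurse_sketch <- function(gam_model, last_obs, h = 10) {
  lags <- log(tail(last_obs, 3) + 0.01)   # three most recent observations
  preds <- numeric(h)
  for (i in seq_len(h)) {
    newdata <- data.frame(lag1 = lags[3], lag2 = lags[2], lag3 = lags[1])
    preds[i] <- predict(gam_model, newdata = newdata, type = 'response')
    # update the design: drop the oldest lag and append the new prediction
    lags <- c(lags[-1], log(preds[i] + 0.01))
  }
  preds
}
```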

<details>
<summary>Click for definition of `recurse_ar3`</summary>
@@ -194,7 +194,7 @@ summary(m_mvgam)
```


-The model file can also be viewed, which can be handy for making modifications to the model so that extra stochastic elements can be added to a user's bespoke model. Notice how the summary at the top of the model file describes the data elements (and their dimensions) that are needed to condition the model, again the hope is tht this simplifies the task of modifying the model for any additional user-specific requirements
+The model file can also be viewed, which can be handy for making modifications to the model so that extra stochastic elements can be added to a user's bespoke model. Notice how the summary at the top of the model file describes the data elements (and their dimensions) that are needed to condition the model, again the hope is that this simplifies the task of modifying the model for any additional user-specific requirements
```{r}
m_mvgam$model_file
```
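For example, one convenient pattern (an illustrative suggestion, not a package requirement) is to write the generated code to disk so it can be edited in a text editor and run manually through `rjags` or `rstan`:
```{r}
# dump the generated model code to a text file for manual editing
writeLines(m_mvgam$model_file, 'my_custom_model.txt')
```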
@@ -1114,3 +1114,8 @@ Both components contribute to forecast uncertainty, with the trend component con
```{r, fig.width=6, fig.height=6, fig.align='center'}
plot_mvgam_resids(lynx_mvgam, n_bins = 25)
```

+This same plot can also be produced for out-of-sample residuals if a set of test data is supplied (note that the function assumes the test data follow immediately after the training data; there are no formal checks of this, so it is up to the user to ensure it holds)
+```{r, fig.width=6, fig.height=6, fig.align='center'}
+plot_mvgam_resids(lynx_mvgam, n_bins = 25, data_test = lynx_test)
+```
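Because no alignment check is performed internally, it is worth verifying this yourself first; a minimal sanity check (assuming the usual `mvgam` data format with a `time` column, and that the training data live in an object like `lynx_train`) might be:
```{r}
# confirm the test set starts exactly one timestep after the training set ends
stopifnot(min(lynx_test$time) == max(lynx_train$time) + 1)
```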
191 changes: 100 additions & 91 deletions NEON_manuscript/Case studies/mvgam_case_study1.html (large diff not rendered)

Binary file removed NEON_manuscript/Case studies/rsconnect/.DS_Store
@@ -7,5 +7,5 @@ hostUrl: rpubs.com
appId: https://api.rpubs.com/api/v1/document/856585/540d5299dc584aff9095b986b978429b
bundleId: https://api.rpubs.com/api/v1/document/856585/540d5299dc584aff9095b986b978429b
url: http://rpubs.com/NickClark47/856585
-when: 1657508060.90589
-lastSyncTime: 1657508060.90589
+when: 1657751344.84995
+lastSyncTime: 1657751344.84995
Binary file removed NEON_manuscript/Figures/.DS_Store
Binary file removed NEON_manuscript/Manuscript/.DS_Store
Binary file modified NEON_manuscript/Manuscript/Appendix_S1.pdf
314 changes: 157 additions & 157 deletions NEON_manuscript/Manuscript/Appendix_S5_IxodesJAGS_Hyp3.txt
@@ -1,157 +1,157 @@
JAGS model code generated by package mvgam

GAM formula:
y ~ s(siteID, bs = re) + s(cum_gdd, k = 5) + s(cum_gdd, siteID, k = 5, m = 1, bs = fs) + s(season, m = 2, bs = cc, k = 12) + s(season, series, m = 1, k = 4, bs = fs)

Trend model:
RW

Required data:
integer n; number of timepoints per series
integer n_series; number of series
matrix y; time-ordered observations of dimension n x n_series (missing values allowed)
matrix ytimes; time-ordered n x n_series matrix (which row in X belongs to each [time, series] observation?)
matrix X; mgcv GAM design matrix of dimension (n x n_series) x basis dimension
matrix S2; mgcv smooth penalty matrix S2
matrix S4; mgcv smooth penalty matrix S4
vector zero; prior basis coefficient locations vector of length ncol(X)
vector p_coefs; vector (length = 1) of prior Gaussian means for parametric effects
vector p_taus; vector (length = 1) of prior Gaussian precisions for parametric effects


#### Begin model ####
model {

  ## GAM linear predictor
  eta <- X %*% b

  ## mean expectations
  for (i in 1:n) {
    for (s in 1:n_series) {
      mu[i, s] <- exp(eta[ytimes[i, s]] + trend[i, s])
    }
  }

  ## latent factors evolve as time series with penalised precisions;
  ## the penalty terms force any un-needed factors to evolve as flat lines
  for (j in 1:n_lv) {
    LV[1, j] ~ dnorm(0, penalty[j])
  }

  for (i in 2:n) {
    for (j in 1:n_lv) {
      LV[i, j] ~ dnorm(LV[i - 1, j], penalty[j])
    }
  }

  ## shrinkage penalties for each factor's precision act to squeeze
  ## the entire factor toward a flat white noise process if supported by
  ## the data. The prior for individual factor penalties allows each factor to possibly
  ## have a relatively large penalty, which shrinks the prior for that factor's variance
  ## substantially. Penalties increase exponentially with the number of factors following
  ## Welty, Leah J., et al. Bayesian distributed lag models: estimating effects of particulate
  ## matter air pollution on daily mortality. Biometrics 65.1 (2009): 282-291.
  pi ~ dunif(0, n_lv)
  X2 ~ dnorm(0, 1)T(0, )

  # eta1 controls the baseline penalty
  eta1 ~ dunif(-1, 1)

  # eta2 controls how quickly the penalties exponentially increase
  eta2 ~ dunif(-1, 1)

  for (t in 1:n_lv) {
    X1[t] ~ dnorm(0, 1)T(0, )
    l.dist[t] <- max(t, pi[])
    l.weight[t] <- exp(eta2[] * l.dist[t])
    l.var[t] <- exp(eta1[] * l.dist[t] / 2) * 1
    theta.prime[t] <- l.weight[t] * X1[t] + (1 - l.weight[t]) * X2[]
    penalty[t] <- max(0.0001, theta.prime[t] * l.var[t])
  }

  ## latent factor loadings: standard normal with identifiability constraints
  ## upper triangle of loading matrix set to zero
  for (j in 1:(n_lv - 1)) {
    for (j2 in (j + 1):n_lv) {
      lv_coefs[j, j2] <- 0
    }
  }

  ## positive constraints on loading diagonals
  for (j in 1:n_lv) {
    lv_coefs[j, j] ~ dnorm(0, 1)T(0, 1);
  }

  ## lower diagonal free
  for (j in 2:n_lv) {
    for (j2 in 1:(j - 1)) {
      lv_coefs[j, j2] ~ dnorm(0, 1)T(-1, 1);
    }
  }

  ## other elements also free
  for (j in (n_lv + 1):n_series) {
    for (j2 in 1:n_lv) {
      lv_coefs[j, j2] ~ dnorm(0, 1)T(-1, 1);
    }
  }

  ## trend evolution depends on latent factors
  for (i in 1:n) {
    for (s in 1:n_series) {
      trend[i, s] <- inprod(lv_coefs[s, ], LV[i, ])
    }
  }

  ## likelihood functions
  for (i in 1:n) {
    for (s in 1:n_series) {
      y[i, s] ~ dnegbin(rate[i, s], r[s])
      rate[i, s] <- ifelse((r[s] / (r[s] + mu[i, s])) < min_eps, min_eps,
                           (r[s] / (r[s] + mu[i, s])))
    }
  }

  ## complexity penalising prior for the overdispersion parameter;
  ## where the likelihood reduces to a 'base' model (Poisson) unless
  ## the data support overdispersion
  for (s in 1:n_series) {
    r[s] <- 1 / r_raw[s]
    r_raw[s] ~ dnorm(0, 0.1)T(0, )
  }

  ## posterior predictions
  for (i in 1:n) {
    for (s in 1:n_series) {
      ypred[i, s] ~ dnegbin(rate[i, s], r[s])
    }
  }

  ## GAM-specific priors
  ## parametric effect priors (regularised for identifiability)
  for (i in 1:1) { b[i] ~ dnorm(p_coefs[i], p_taus[i]) }
  ## prior (non-centred) for s(siteID)...
  for (i in 2:4) {
    b_raw[i] ~ dnorm(0, 1)
    b[i] <- mu_raw1 + b_raw[i] * sigma_raw1
  }
  sigma_raw1 ~ dexp(1)
  mu_raw1 ~ dnorm(0, 1)
  ## prior for s(cum_gdd)...
  K2 <- S2[1:4, 1:4] * lambda[2] + S2[1:4, 5:8] * lambda[3]
  b[5:8] ~ dmnorm(zero[5:8], K2)
  ## prior for s(cum_gdd,siteID)...
  for (i in c(9:12, 14:17, 19:22)) { b[i] ~ dnorm(0, lambda[4]) }
  for (i in c(13, 18, 23)) { b[i] ~ dnorm(0, lambda[5]) }
  ## prior for s(season)...
  K4 <- S4[1:10, 1:10] * lambda[6]
  b[24:33] ~ dmnorm(zero[24:33], K4)
  ## prior for s(season,series)...
  for (i in c(34:36, 38:40, 42:44, 46:48, 50:52, 54:56, 58:60, 62:64)) { b[i] ~ dnorm(0, lambda[7]) }
  for (i in c(37, 41, 45, 49, 53, 57, 61, 65)) { b[i] ~ dnorm(0, lambda[8]) }
  ## smoothing parameter priors...
  for (i in 1:8) {
    lambda[i] ~ dexp(0.05)T(0.0001, )
    rho[i] <- log(lambda[i])
  }
}
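To see how this exponentially increasing penalty behaves, here is a small R sketch (illustrative only; the factor count and the particular random draws are assumptions) that mimics the prior construction above:
```{r}
set.seed(1)
n_lv <- 4                   # number of latent factors (illustrative)
pi_ <- runif(1, 0, n_lv)    # changepoint: factors beyond this get heavier penalties
X1 <- abs(rnorm(n_lv)); X2 <- abs(rnorm(1))
eta1 <- runif(1, -1, 1)     # controls the baseline penalty
eta2 <- runif(1, -1, 1)     # controls how quickly penalties increase
l.dist <- pmax(1:n_lv, pi_)
l.weight <- exp(eta2 * l.dist)
l.var <- exp(eta1 * l.dist / 2)
theta.prime <- l.weight * X1 + (1 - l.weight) * X2
penalty <- pmax(0.0001, theta.prime * l.var)
penalty                     # precision penalty applied to each factor
```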
7 changes: 5 additions & 2 deletions NEON_manuscript/next_todo.R
@@ -8,11 +8,14 @@ mod1 <- mvgam(formula = y ~ s(season) + s(series, bs = 're'),
run_model = FALSE)
mod1$model_file

-mod2 <- mvgam(formula = y~s(season),
+mod2 <- mvgam(formula = y~year,
data_train = dat$data_train,
trend_model = 'RW',
family = 'poisson',
-run_model = TRUE)
+run_model = TRUE,
+use_stan = TRUE)
mod2$model_file

plot(mod2, 'smooths', residuals = TRUE, derivatives = TRUE)
compare_mvgams(model1 = mod1, model2 = mod2, fc_horizon = 6,
n_evaluations = 30, n_cores = 3)