zi Gaussian lfMsAbund

biodiverse · Mar 29, 2024 · c7841aa · c7841aa
1 parent 11873c7
commit c7841aa
Show file tree

Hide file tree

Showing 5 changed files with 32 additions and 28 deletions.
diff --git a/NEWS.md b/NEWS.md
@@ -2,6 +2,7 @@
 
 + Added in the `independent.betas` tag into the `priors` list for certain multi-species model types to allow for specifying an independent prior on the species-specific random effects as opposed to treating species-specific effects as random effects. This can be useful under certain circumstances when the distribution of effects across species may not be adequately represented by a Gaussian distribution. This tag is available for the following functions: `lfMsAbund` (Gaussian only), `sfMsAbund` (Gaussian only), and `svcMsAbund`.
 + Fixed a bug in zero-inflated Gaussian latent factor abundance models (`lfMsAbund`). 
++ Fixed a bug in `waicAbund()` that prevented it from working with `svcMsAbund` models.
 
 # spAbundance 0.1.1
 

diff --git a/R/waicAbund.R b/R/waicAbund.R
@@ -19,11 +19,11 @@ waicAbund <- function(object, N.max, by.species = FALSE, ...) {
 			     'msAbund', 'lfMsAbund', 
 			     'sfMsAbund', 'msNMix', 
 			     'lfMsNMix', 'sfMsNMix', 'DS', 'spDS', 
-			     'msDS', 'lfMsDS', 'sfMsDS', 'svcAbund'))) {
-    stop("error: object must be one of the following classes: abund, spAbund, NMix, spNMix, msAbund, lfMsAbund, sfMsAbund, msNMix, lfMsNMix, sfMsNMix, DS, spDS, msDS, lfMsDS, sfMsDS, svcAbund\n")
+			     'msDS', 'lfMsDS', 'sfMsDS', 'svcAbund', 'svcMsAbund'))) {
+    stop("error: object must be one of the following classes: abund, spAbund, NMix, spNMix, msAbund, lfMsAbund, sfMsAbund, msNMix, lfMsNMix, sfMsNMix, DS, spDS, msDS, lfMsDS, sfMsDS, svcAbund, svcMsAbund\n")
   }
 
-  if (!(class(object) %in% c('abund', 'spAbund', 'msAbund', 'lfMsAbund', 'sfMsAbund', 'svcAbund'))) {
+  if (!(class(object) %in% c('abund', 'spAbund', 'msAbund', 'lfMsAbund', 'sfMsAbund', 'svcAbund', 'svcMsAbund'))) {
     if (missing(N.max)) {
       message("N.max not specified. Setting upper index of integration of N to 10 plus\nthe largest estimated abundance value at each site in object$N.samples")
       if (class(object) %in% c('msNMix', 'lfMsNMix', 'sfMsNMix', 
@@ -86,7 +86,7 @@ waicAbund <- function(object, N.max, by.species = FALSE, ...) {
   }
 
   # Multi-species abundance GLMs ------------------------------------------
-  if (class(object) %in% c('msAbund', 'lfMsAbund', 'sfMsAbund')) {
+  if (class(object) %in% c('msAbund', 'lfMsAbund', 'sfMsAbund', 'svcMsAbund')) {
     if (object$dist %in% c('Gaussian', 'zi-Gaussian')) {
       if (object$dist == 'zi-Gaussian') {
         message("Calculated WAIC is only for stage 2 of the hurdle model\n")

diff --git a/src/lfMsAbundGaussian.cpp b/src/lfMsAbundGaussian.cpp
@@ -573,6 +573,8 @@ extern "C" {
            *mu
            *****************************/
           // zStar - X %*% beta
+	  zeros(tmp_J, J); 
+	  zeros(tmp_Jq, Jq);
           for (j = 0; j < J; j++) {
             if (z[j * N + i] == 1.0) {
               tmp_J[j] = y[j * N + i] - F77_NAME(ddot)(&p, &X[j], &J, &beta[i], &N) -
@@ -586,10 +588,10 @@ extern "C" {
 
           // S_beta %*% W' = tmp_Jq
           // aka multiply W[j, ] by omegaOcc[j] of the current species you're on. 
-          for (j = 0, l = 0; j < J; j++) {
+          for (j = 0; j < J; j++) {
             if (z[j * N + i] == 1.0) {
-              for (ll = 0; ll < q; ll++, l++) {
-                tmp_Jq[l] = w[j * q + ll] / tauSq[i];  
+              for (ll = 0; ll < q; ll++) {
+                tmp_Jq[j * q + ll] = w[j * q + ll] / tauSq[i];  
               }
 	    }
           }

diff --git a/vignettes/distanceSampling.Rmd b/vignettes/distanceSampling.Rmd
@@ -642,7 +642,7 @@ Now consider the case where distance sampling data, $\bm{y}_{i, j}$, are collect
      \text{log}(\mu_{i, j}) = \bm{x}_j^\top\bm{\beta}_i,
 \end{equation}
 
-where $\bm{\beta}_i$ are the species-specific effects of covariates $\bm{x}_j$ (including an intercept) . When $N_i(\bm{s}_j)$ is modeled using a negative binomial distribution, we estimate a separate dispersion parameter $\kappa_i$ for each species. We model $\bm{\beta}_i$ as random effects arising from a common, community-level normal distribution, which leads to increased precision of species-specific effects compared to single-species models [@sollmann2016hierarchical]. For example, the species-specific abundance intercept $\beta_{0, i}$ is modeled according to
+where $\bm{\beta}_i$ are the species-specific effects of covariates $\bm{x}_j$ (including an intercept). When $N_i(\bm{s}_j)$ is modeled using a negative binomial distribution, we estimate a separate dispersion parameter $\kappa_i$ for each species. We model $\bm{\beta}_i$ as random effects arising from a common, community-level normal distribution, which leads to increased precision of species-specific effects compared to single-species models [@sollmann2016hierarchical]. For example, the species-specific abundance intercept $\beta_{0, i}$ is modeled according to
 
 \begin{equation}\label{betaComm}
      \beta_{0, i} \sim \text{Normal}(\mu_{\beta_0}, \tau^2_{\beta_0}),
@@ -672,14 +672,14 @@ Notice these are the exact same arguments we saw with `DS()`. We will now model
 str(neonDWP)
 ```
 
-For multi-species models, the multi-species detection-nondetection data `y` is now a three-dimensional array with dimensions corresponding to species, sites, and replicates. This is how the data are provided in the `neonDWP` object, so we don't need to do any additional preparations. For guidance on preparing raw data into such a three-dimensional array format, please see [this vignette on the `spOccupancy` website](https://www.jeffdoser.com/files/spoccupancy-web/articles/dataformatting), which while providing guidance for fitting multi-species models in `spOccupancy`, the format for `spAbundance` is exactly the same and all the same guidance applies here. We will model abundance and detection for all species using the same covariates as before.
+For multi-species models, the multi-species detection-nondetection data `y` is now a three-dimensional array with dimensions corresponding to species, sites, and replicates. This is how the data are provided in the `neonDWP` object, so we don't need to do any additional preparations. For guidance on preparing raw data into such a three-dimensional array format, please see [this vignette on the `spOccupancy` website](https://www.jeffdoser.com/files/spoccupancy-web/articles/dataformatting). The vignette provides guidance for fitting multi-species models in `spOccupancy`, but the format for `spAbundance` is exactly the same and all the same guidance applies here. We will model abundance and detection for all species using the same covariates as before.
 
 ```{r}
 abund.formula <- ~ scale(forest) + scale(grass)
 det.formula <- ~ scale(wind)
 ```
 
-Next we specify the initial values in `inits`. For multi-species HDS models, we supply initial values for community-level and species-level parameters. In `msDS()`, we will supply initial values for the following parameters: `alpha.comm` (community-level detection coefficients), `beta.comm` (community-level abundance coefficients), `alpha` (species-level detection coefficients), `beta` (species-level abundance coefficients), `tau.sq.beta` (community-level abundance variance parameters), `tau.sq.alpha` (community-level detection variance parameters), `N` (latent abundance values for all species), `kappa` (species-level negative binomial overdispersion parameters), `sigma.sq.mu` (random effect variances for abundance), and `sigma.sq.p` (random effect variances for detection). These are all specified in a single list. Initial values for community-level parameters (including the random effect variances) are either vectors of length corresponding to the number of community-level detection or occurrence parameters in the model (including the intercepts) or a single value if all parameters are assigned the same initial values. Initial values for species level parameters are either matrices with the number of rows indicating the number of species, and each column corresponding to a different regression parameter, or a single value if the same initial value is used for all species and parameters. Initial values for `kappa` is similarly either a single value that is assumed to be the same for all species, or a vector with a specific initial value for each species. The initial values for the latent abundance matrix are specified as a matrix with $I$ rows corresponding to the number of species and $J$ columns corresponding to the number of sites.
+Next we specify the initial values in `inits`. For multi-species HDS models, we supply initial values for community-level and species-level parameters. In `msDS()`, we will supply initial values for the following parameters: `alpha.comm` (community-level detection coefficients), `beta.comm` (community-level abundance coefficients), `alpha` (species-level detection coefficients), `beta` (species-level abundance coefficients), `tau.sq.beta` (community-level abundance variance parameters), `tau.sq.alpha` (community-level detection variance parameters), `N` (latent abundance values for all species), `kappa` (species-level negative binomial overdispersion parameters), `sigma.sq.mu` (random effect variances for abundance), and `sigma.sq.p` (random effect variances for detection). These are all specified in a single list. Initial values for community-level parameters (including the random effect variances) are either vectors of length corresponding to the number of community-level detection or abundance parameters in the model (including the intercepts) or a single value if all parameters are assigned the same initial values. Initial values for species level parameters are either matrices with the number of rows indicating the number of species, and each column corresponding to a different regression parameter, or a single value if the same initial value is used for all species and parameters. Initial values for `kappa` is similarly either a single value that is assumed to be the same for all species, or a vector with a specific initial value for each species. The initial values for the latent abundance matrix are specified as a matrix with $I$ rows corresponding to the number of species and $J$ columns corresponding to the number of sites.
 
 ```{r}
 # Number of species