Added capabiltiy for various multiple testing corrections

Added `p_correct` arugment to spatial_power() and jitter_power() Calls a new, internal function pval_correct() that calculates three types of corrections (FDR, Sidak, Bonferroni) spatial_power() and jitter_power() now report the (un)corrected critical p-value of each iteration Updated README with new example that uses a Bonferroni correction. Add one plot image. Added pval_correct() to the functions table
machiela-lab · Jan 21, 2021 · b85e417 · b85e417
1 parent aac2016
commit b85e417
Show file tree

Hide file tree

Showing 11 changed files with 246 additions and 20 deletions.
diff --git a/NAMESPACE b/NAMESPACE
@@ -1,6 +1,7 @@
 # Generated by roxygen2: do not edit by hand
 
 export(jitter_power)
+export(pval_correct)
 export(spatial_data)
 export(spatial_plots)
 export(spatial_power)

diff --git a/NEWS.md b/NEWS.md
@@ -9,6 +9,7 @@
 * Changes to `spatial_data()`
   * Used `spatstat.core::as.solist()` to specify appropriate class to lists
 * Changes to `spatial_power()`
+  * Added `p_correct` arugment to apply a multiple testing correction
   * Fixed typos in documentation
   * Moved `cascon` argument from `spatial_power()` to `spatial_plots()`
   * Estimates case-only power (lower tail) and case/control (lower and upper tail) and captures both outputs. Case-only is now a one-tailed hypothesis test by default. The arguments 'lower_tail' and 'upper_tail' have been replaced with argument 'alpha'
@@ -19,16 +20,18 @@
   * Removed main panel title in plots
   * Fixed default colors and removed mislabled 'midpoint' color
   * Removed annotation from `plot.ppp()` in plot #3 and plot #2 (if points are not plotted)
-  * Added `scale` argument to scale text for higher resolution plots
+  * Added `scale` arguement to scale text for higher resolution plots
   * Added `horizontal` argument to toggle the display of the color key to be on the right (vertical) or bottom (horizontal) of plots
   * Added `plot_axes` argument to toggle the display of axes in plots
   * Added `plot_square` argument to toggle the margins of plots
   * Changed the value of `cex.axis` and `cex` in plots
 * Changes to `jitter_power()`
+  * Added `p_correct` arugment to apply a multiple testing correction
   * Specify all packages for functions
   * Moved `cascon` argument from `jitter_power()` to `spatial_plots()`
   * Estimates case-only power (lower tail) and case/control (lower and upper tail) and captures both outputs. Case-only is now a one-tailed hypothesis test by default. The arguments 'lower_tail' and 'upper_tail' have been replaced with argument 'alpha'
   * Used `spatstat.core::as.solist()` to specify appropriate class to lists
+  * Updated example to reflect new updates
 * Changes to vignette
   * Set global chunk options
   * Named code chunks

diff --git a/R/jitter_power.R b/R/jitter_power.R
@@ -7,6 +7,7 @@
 #' @param samp_control Character string specifying whether to randomize the control locations uniformly (\code{samp_control="uniform"}), with complete spatial randomness (\code{samp_control="CSR"}), or multivariate normal (\code{samp_control="MVN"}).
 #' @param s_control Optional. Numeric value for the standard deviation of the multivariate normal distribution in the units of the \code{obs_data}.  The default value (1) assumes a unit square window. Ignored if Ignored if \code{samp_control="uniform"} or \code{samp_control="CSR"}.
 #' @param alpha Optional. Numeric value of the critical p-value (default=0.05).
+#' @param p_correct Optional. Character string specifying whether to apply a correction for multiple comparisons including a False Discovery Rate \code{p_correct = "FDR"}, a Sidak correction \code{p_correct = "uncorrelated Sidak"}, and a Bonferroni correction \code{p_correct = "uncorrelated Bonferroni"}. If \code{p_correct = "none"} (the default), then no correction is applied. 
 #' @param parallel Logical. If TRUE, will execute the function in parallel. If FALSE (the default), will not execute the function in parallel.
 #' @param n_core Optional. Integer specifying the number of CPU cores on current host to use for parallelization (the default is 2 cores).
 #' @param verbose Logical. If TRUE (the default), will print function progress during execution. If FALSE, will not print.
@@ -23,6 +24,8 @@
 #' If \code{samp_control = "MVN"} the control locations are randomly generated assuming a multivariate normal distribution \emph{centered at each observed location}. The optional argument \code{s_control} specifies the standard deviation of the multivariate normal distribution (1 by default) in the units of the \code{obs_data}. 
 #' 
 #' The function computes a one-sided hypothesis test for case clustering (\code{alpha = 0.05} by default). The function also computes a two-sided hypothesis test for case clustering and control clustering (lower tail = 0.025 and upper tail = 0.975).
+#' 
+#' The function has functionality for a correction for multiple testing. If \code{p_correct = "FDR"}, calculates a False Discovery Rate by Benjamini and Hochberg. If \code{p_correct = "Sidak"}, calculates a Sidak correction. If \code{p_correct = "Bonferroni"}, calculates a Bonferroni correction. If \code{p_correct = "none"} (the default), then the function does not account for multiple testing and uses the uncorrected \code{alpha} level. See the internal \code{pval_correct} function documentation for more details.
 #'
 #' @return An object of class "list". This is a named list with the following components:
 #' 
@@ -41,6 +44,7 @@
 #' \item{\code{bandw}}{Vector of length \code{sim_total} of the bandwidth (of numerator) used in each iteration.}
 #' \item{\code{s_obs}}{Vector of length \code{sim_total} of the global s statistic.}
 #' \item{\code{t_obs}}{Vector of length \code{sim_total} of the global t statistic.}
+#' \item{\code{alpha}}{Vector of length \code{sim_total} of the (un)corrected critical p-values.}
 #' }
 #' 
 #' @importFrom doParallel registerDoParallel
@@ -59,16 +63,16 @@
 #' # Using the \code{\link[spatstat.data]{chorley}} dataset
 #' data(chorley)
 #'  f1 <- jitter_power(obs_data = unique(chorley),
-#'                     sim_total = 2,
-#'                     samp_control = "MVN",
-#'                     s_control = 0.01,
+#'                     sim_total = 10,
+#'                     samp_control = "CSR",
 #'                     verbose = FALSE)
 #' 
 jitter_power <- function(obs_data,
                          sim_total,
                          samp_control = c("uniform", "CSR", "MVN"),
                          s_control = 1,
                          alpha = 0.05,
+                         p_correct = c("none", "FDR", "Sidak", "Bonferroni"),
                          parallel = FALSE,
                          n_core = 2,
                          verbose = TRUE,
@@ -79,6 +83,9 @@ jitter_power <- function(obs_data,
     stop("Argument 'obs_data' must be of class 'ppp'")
   }
 
+  match.arg(p_correct, choices = c("none", "FDR", "Sidak", "Bonferroni"))
+  if (is.null(p_correct)) { p_correct <- "none"}
+
   # marked uniform ppp for controls
   rcluster_control <- function(n, l, win, s, types = "control", ...) {
     if (samp_control == "uniform") {
@@ -125,6 +132,7 @@ jitter_power <- function(obs_data,
                               .multicombine = TRUE, 
                               .packages = c("sparr", "spatstat.core", "utils"),
                               .init = list(list(), list(), list(),
+                                           list(), list(), list(),
                                            list(), list(), list(), 
                                            list(), list(), list(),
                                            list(), list())) %fun% {
@@ -164,6 +172,23 @@ jitter_power <- function(obs_data,
     sim_risk <- as.vector(t(obs_lrr$rr$v))
     sim_pval <- as.vector(t(obs_lrr$P$v))
 
+    if (p_correct != "none") {
+      alpha_correct <- pval_correct(input = obs_lrr, type = p_correct, alpha = alpha)
+
+      #### Case and Control (lower and upper tail)
+      lower_tail <- alpha_correct/2
+      upper_tail <- 1 - lower_tail
+      pval_sig_cascon <- sapply(sim_pval, function(x) ifelse(x < lower_tail | x > upper_tail,
+                                                             TRUE,
+                                                             FALSE))
+      #### Case only (lower tail only)
+      pval_sig_cas <- sapply(sim_pval, function(x) ifelse(x < alpha_correct, TRUE, FALSE))
+    } else {
+      alpha_correct <- alpha
+      pval_sig_cascon <- "Uncorrected"
+      pval_sig_cas <- "Uncorrected"
+    }
+
     ### Estimated global test statistics
     #### Global maximum relative risk: H0 = 1
     s_obs <- max(exp(obs_lrr$rr$v[!is.na(obs_lrr$rr$v)]))
@@ -194,7 +219,10 @@ jitter_power <- function(obs_data,
                         "n_con" = con$n,
                         "bandw" = obs_lrr$f$h0,
                         "s_obs" = s_obs,
-                        "t_obs" = t_obs)
+                        "t_obs" = t_obs,
+                        "alpha_correct" = alpha_correct,
+                        "pval_sig_cascon" = pval_sig_cascon,
+                        "pval_sig_cas" = pval_sig_cas)
     return(par_results)
   }
 
@@ -218,19 +246,27 @@ jitter_power <- function(obs_data,
   rr_sd <- apply(sim_rr_dat, 1, sd, na.rm = TRUE) # standard deviation log relative risk
 
   ## Calculate proportion of tests were significant
-  ### Significance level is user-specified
+  ### Correction for multiple testing
+  if (p_correct != "none") {
+    pval_sig_cascon <- out_par[[13]]
+    pval_sig_cas <- out_par[[14]]
+  } else {
+  ### Uncorrected for multiple testing
   #### Case and Control (lower and upper tail)
   lower_tail <- alpha/2
   upper_tail <- 1 - lower_tail
   pval_sig_cascon <- rapply(sim_pval, function(x) ifelse(x < lower_tail | x > upper_tail,
                                                          TRUE,
                                                          FALSE),
                             how = "replace")
-  pval_count_cascon <- rowSums(do.call(cbind, pval_sig_cascon), na.rm = TRUE)
-  pval_prop_wNA_cascon <- sapply(pval_count_cascon, FUN = function(x, y = sim_total) (x / y))
   #### Case only (lower tail only)
   pval_sig_cas <- rapply(sim_pval, function(x) ifelse(x < alpha, TRUE, FALSE),
                          how = "replace")
+  }
+
+  pval_count_cascon <- rowSums(do.call(cbind, pval_sig_cascon), na.rm = TRUE)
+  pval_prop_wNA_cascon <- sapply(pval_count_cascon, FUN = function(x, y = sim_total) (x / y))
+
   pval_count_cas <- rowSums(do.call(cbind, pval_sig_cas), na.rm = TRUE)
   pval_prop_wNA_cas <- sapply(pval_count_cas, FUN = function(x, y = sim_total) (x / y))
 
@@ -258,5 +294,6 @@ jitter_power <- function(obs_data,
                   "n_con" = unlist(out_par[[8]]),
                   "bandw" = unlist(out_par[[9]]),
                   "s_obs" = unlist(out_par[[10]]),
-                  "t_obs" = unlist(out_par[[11]]))
+                  "t_obs" = unlist(out_par[[11]]),
+                  "alpha" = unlist(out_par[[12]]))
 }
diff --git a/R/pval_correct.R b/R/pval_correct.R
@@ -0,0 +1,54 @@
+#' Calculate p-value corrections
+#'
+#' Internal function to calculate various p-value corrections for use within the \code{\link{spatial_power}} and \code{\link{jitter_power}} functions.
+#'
+#' @param input An object of class 'rrs' from the \code{\link{spatial_power}} or \code{\link{jitter_power}} function.
+#' @param type Character string specifying which correction for multiple comparisons. Options include a False Discovery Rate \code{p_correct = "FDR"}, an independent Sidak correction \code{p_correct = "uncorrelated Sidak"}, and an independent Bonferroni correction \code{p_correct = "uncorrelated Bonferroni"}.
+#' @param alpha Numeric. The alpha level for significance threshold (default in \code{\link{spatial_power}} and \code{\link{jitter_power}} functions is 0.05).
+#' 
+#' @details This function provides functionality for multiple testing correction in five ways:
+#' 
+#' \enumerate{
+#' \item Computes a False Discovery Rate by Benjamini and Hochberg \doi{10.1111/j.2517-6161.1995.tb02031.x} (\code{p_correct = "FDR"}) by: 1) sorting the p-values (p_i) of each knot in ascending order (p_1 <= p_2 <= ... <= p_m), 2) starting from p_m find the first p_i for which p_i <= (i/m) * alpha.
+#' \item Computes an independent Sidak correction \doi{10.2307/2283989} (\code{p_correct = "uncorrelated Sidak"}) by 1 - (1 - \code{alpha}) ^ (1 / total number of gridded knots across the estimated surface). The default in the \code{\link[sparr]{risk}} function is a resolution of 128 x 128 or n = 16,384 knots and a custom resolution can be specified using the \code{resolution} argument within the \code{\link[sparr]{risk}} function.
+#' \item Computes an independent Bonferroni correction (\code{p_correct = "uncorrelated Bonferroni"}) by \code{alpha} / total number of gridded knots across the estimated surface. The default in the \code{\link[sparr]{risk}} function is a resolution of 128 x 128 or n = 16,384 knots and a custom resolution can be specified using the \code{resolution} argument within the \code{\link[sparr]{risk}} function.
+#' }
+#' 
+#' @return An object of class 'numeric' with the corrected alpha level.
+#'
+#' @export
+#'
+#' @keywords internal
+#' 
+pval_correct <- function(input,
+                         type = c("FDR", "Sidak", "Bonferroni"),
+                         alpha = 0.05,
+                         nbc = NULL) {
+
+# False Discovery Rate
+  if (type == "FDR") {
+  sort_pvals <- sort(as.vector(input$P$v), decreasing = TRUE)
+
+  fdr <- function(pvals, alpha) {
+    m <- length(pvals)
+    for (i in 1:length(pvals)) {
+      if (pvals[i] <= (i/m) * alpha) { return(pvals[i]) } 
+    }
+  }
+
+  out_alpha <- fdr(sort_pvals, alpha)
+  return(out_alpha)
+  }
+
+  # Uncorrelated Bonferroni correction
+  if (type == "Sidak") {
+    out_alpha <- 1 - (1 - alpha) ^ (1 / prod(input$P$dim) )
+    return(out_alpha)
+  }
+
+  # Uncorrelated Bonferroni correction
+  if (type == "Bonferroni") {
+    out_alpha <- alpha / prod(input$P$dim)
+    return(out_alpha)
+  }
+}