Skip to content

Commit

Permalink
Added capabiltiy for various multiple testing corrections
Browse files Browse the repository at this point in the history
Added `p_correct` arugment to spatial_power() and jitter_power()

Calls a new, internal function pval_correct() that calculates three types of corrections (FDR, Sidak, Bonferroni)

spatial_power() and jitter_power() now report the (un)corrected critical p-value of each iteration

Updated README with new example that uses a Bonferroni correction. Add one plot image. Added pval_correct() to the functions table
  • Loading branch information
Ian Buller, PhD, MA committed Jan 21, 2021
1 parent aac2016 commit b85e417
Show file tree
Hide file tree
Showing 11 changed files with 246 additions and 20 deletions.
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
# Generated by roxygen2: do not edit by hand

export(jitter_power)
export(pval_correct)
export(spatial_data)
export(spatial_plots)
export(spatial_power)
Expand Down
5 changes: 4 additions & 1 deletion NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
* Changes to `spatial_data()`
* Used `spatstat.core::as.solist()` to specify appropriate class to lists
* Changes to `spatial_power()`
* Added `p_correct` arugment to apply a multiple testing correction
* Fixed typos in documentation
* Moved `cascon` argument from `spatial_power()` to `spatial_plots()`
* Estimates case-only power (lower tail) and case/control (lower and upper tail) and captures both outputs. Case-only is now a one-tailed hypothesis test by default. The arguments 'lower_tail' and 'upper_tail' have been replaced with argument 'alpha'
Expand All @@ -19,16 +20,18 @@
* Removed main panel title in plots
* Fixed default colors and removed mislabled 'midpoint' color
* Removed annotation from `plot.ppp()` in plot #3 and plot #2 (if points are not plotted)
* Added `scale` argument to scale text for higher resolution plots
* Added `scale` arguement to scale text for higher resolution plots
* Added `horizontal` argument to toggle the display of the color key to be on the right (vertical) or bottom (horizontal) of plots
* Added `plot_axes` argument to toggle the display of axes in plots
* Added `plot_square` argument to toggle the margins of plots
* Changed the value of `cex.axis` and `cex` in plots
* Changes to `jitter_power()`
* Added `p_correct` arugment to apply a multiple testing correction
* Specify all packages for functions
* Moved `cascon` argument from `jitter_power()` to `spatial_plots()`
* Estimates case-only power (lower tail) and case/control (lower and upper tail) and captures both outputs. Case-only is now a one-tailed hypothesis test by default. The arguments 'lower_tail' and 'upper_tail' have been replaced with argument 'alpha'
* Used `spatstat.core::as.solist()` to specify appropriate class to lists
* Updated example to reflect new updates
* Changes to vignette
* Set global chunk options
* Named code chunks
Expand Down
53 changes: 45 additions & 8 deletions R/jitter_power.R
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
#' @param samp_control Character string specifying whether to randomize the control locations uniformly (\code{samp_control="uniform"}), with complete spatial randomness (\code{samp_control="CSR"}), or multivariate normal (\code{samp_control="MVN"}).
#' @param s_control Optional. Numeric value for the standard deviation of the multivariate normal distribution in the units of the \code{obs_data}. The default value (1) assumes a unit square window. Ignored if Ignored if \code{samp_control="uniform"} or \code{samp_control="CSR"}.
#' @param alpha Optional. Numeric value of the critical p-value (default=0.05).
#' @param p_correct Optional. Character string specifying whether to apply a correction for multiple comparisons including a False Discovery Rate \code{p_correct = "FDR"}, a Sidak correction \code{p_correct = "uncorrelated Sidak"}, and a Bonferroni correction \code{p_correct = "uncorrelated Bonferroni"}. If \code{p_correct = "none"} (the default), then no correction is applied.
#' @param parallel Logical. If TRUE, will execute the function in parallel. If FALSE (the default), will not execute the function in parallel.
#' @param n_core Optional. Integer specifying the number of CPU cores on current host to use for parallelization (the default is 2 cores).
#' @param verbose Logical. If TRUE (the default), will print function progress during execution. If FALSE, will not print.
Expand All @@ -23,6 +24,8 @@
#' If \code{samp_control = "MVN"} the control locations are randomly generated assuming a multivariate normal distribution \emph{centered at each observed location}. The optional argument \code{s_control} specifies the standard deviation of the multivariate normal distribution (1 by default) in the units of the \code{obs_data}.
#'
#' The function computes a one-sided hypothesis test for case clustering (\code{alpha = 0.05} by default). The function also computes a two-sided hypothesis test for case clustering and control clustering (lower tail = 0.025 and upper tail = 0.975).
#'
#' The function has functionality for a correction for multiple testing. If \code{p_correct = "FDR"}, calculates a False Discovery Rate by Benjamini and Hochberg. If \code{p_correct = "Sidak"}, calculates a Sidak correction. If \code{p_correct = "Bonferroni"}, calculates a Bonferroni correction. If \code{p_correct = "none"} (the default), then the function does not account for multiple testing and uses the uncorrected \code{alpha} level. See the internal \code{pval_correct} function documentation for more details.
#'
#' @return An object of class "list". This is a named list with the following components:
#'
Expand All @@ -41,6 +44,7 @@
#' \item{\code{bandw}}{Vector of length \code{sim_total} of the bandwidth (of numerator) used in each iteration.}
#' \item{\code{s_obs}}{Vector of length \code{sim_total} of the global s statistic.}
#' \item{\code{t_obs}}{Vector of length \code{sim_total} of the global t statistic.}
#' \item{\code{alpha}}{Vector of length \code{sim_total} of the (un)corrected critical p-values.}
#' }
#'
#' @importFrom doParallel registerDoParallel
Expand All @@ -59,16 +63,16 @@
#' # Using the \code{\link[spatstat.data]{chorley}} dataset
#' data(chorley)
#' f1 <- jitter_power(obs_data = unique(chorley),
#' sim_total = 2,
#' samp_control = "MVN",
#' s_control = 0.01,
#' sim_total = 10,
#' samp_control = "CSR",
#' verbose = FALSE)
#'
jitter_power <- function(obs_data,
sim_total,
samp_control = c("uniform", "CSR", "MVN"),
s_control = 1,
alpha = 0.05,
p_correct = c("none", "FDR", "Sidak", "Bonferroni"),
parallel = FALSE,
n_core = 2,
verbose = TRUE,
Expand All @@ -79,6 +83,9 @@ jitter_power <- function(obs_data,
stop("Argument 'obs_data' must be of class 'ppp'")
}

match.arg(p_correct, choices = c("none", "FDR", "Sidak", "Bonferroni"))
if (is.null(p_correct)) { p_correct <- "none"}

# marked uniform ppp for controls
rcluster_control <- function(n, l, win, s, types = "control", ...) {
if (samp_control == "uniform") {
Expand Down Expand Up @@ -125,6 +132,7 @@ jitter_power <- function(obs_data,
.multicombine = TRUE,
.packages = c("sparr", "spatstat.core", "utils"),
.init = list(list(), list(), list(),
list(), list(), list(),
list(), list(), list(),
list(), list(), list(),
list(), list())) %fun% {
Expand Down Expand Up @@ -164,6 +172,23 @@ jitter_power <- function(obs_data,
sim_risk <- as.vector(t(obs_lrr$rr$v))
sim_pval <- as.vector(t(obs_lrr$P$v))

if (p_correct != "none") {
alpha_correct <- pval_correct(input = obs_lrr, type = p_correct, alpha = alpha)

#### Case and Control (lower and upper tail)
lower_tail <- alpha_correct/2
upper_tail <- 1 - lower_tail
pval_sig_cascon <- sapply(sim_pval, function(x) ifelse(x < lower_tail | x > upper_tail,
TRUE,
FALSE))
#### Case only (lower tail only)
pval_sig_cas <- sapply(sim_pval, function(x) ifelse(x < alpha_correct, TRUE, FALSE))
} else {
alpha_correct <- alpha
pval_sig_cascon <- "Uncorrected"
pval_sig_cas <- "Uncorrected"
}

### Estimated global test statistics
#### Global maximum relative risk: H0 = 1
s_obs <- max(exp(obs_lrr$rr$v[!is.na(obs_lrr$rr$v)]))
Expand Down Expand Up @@ -194,7 +219,10 @@ jitter_power <- function(obs_data,
"n_con" = con$n,
"bandw" = obs_lrr$f$h0,
"s_obs" = s_obs,
"t_obs" = t_obs)
"t_obs" = t_obs,
"alpha_correct" = alpha_correct,
"pval_sig_cascon" = pval_sig_cascon,
"pval_sig_cas" = pval_sig_cas)
return(par_results)
}

Expand All @@ -218,19 +246,27 @@ jitter_power <- function(obs_data,
rr_sd <- apply(sim_rr_dat, 1, sd, na.rm = TRUE) # standard deviation log relative risk

## Calculate proportion of tests were significant
### Significance level is user-specified
### Correction for multiple testing
if (p_correct != "none") {
pval_sig_cascon <- out_par[[13]]
pval_sig_cas <- out_par[[14]]
} else {
### Uncorrected for multiple testing
#### Case and Control (lower and upper tail)
lower_tail <- alpha/2
upper_tail <- 1 - lower_tail
pval_sig_cascon <- rapply(sim_pval, function(x) ifelse(x < lower_tail | x > upper_tail,
TRUE,
FALSE),
how = "replace")
pval_count_cascon <- rowSums(do.call(cbind, pval_sig_cascon), na.rm = TRUE)
pval_prop_wNA_cascon <- sapply(pval_count_cascon, FUN = function(x, y = sim_total) (x / y))
#### Case only (lower tail only)
pval_sig_cas <- rapply(sim_pval, function(x) ifelse(x < alpha, TRUE, FALSE),
how = "replace")
}

pval_count_cascon <- rowSums(do.call(cbind, pval_sig_cascon), na.rm = TRUE)
pval_prop_wNA_cascon <- sapply(pval_count_cascon, FUN = function(x, y = sim_total) (x / y))

pval_count_cas <- rowSums(do.call(cbind, pval_sig_cas), na.rm = TRUE)
pval_prop_wNA_cas <- sapply(pval_count_cas, FUN = function(x, y = sim_total) (x / y))

Expand Down Expand Up @@ -258,5 +294,6 @@ jitter_power <- function(obs_data,
"n_con" = unlist(out_par[[8]]),
"bandw" = unlist(out_par[[9]]),
"s_obs" = unlist(out_par[[10]]),
"t_obs" = unlist(out_par[[11]]))
"t_obs" = unlist(out_par[[11]]),
"alpha" = unlist(out_par[[12]]))
}
54 changes: 54 additions & 0 deletions R/pval_correct.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
#' Calculate p-value corrections
#'
#' Internal function to calculate various p-value corrections for use within the \code{\link{spatial_power}} and \code{\link{jitter_power}} functions.
#'
#' @param input An object of class 'rrs' from the \code{\link{spatial_power}} or \code{\link{jitter_power}} function.
#' @param type Character string specifying which correction for multiple comparisons. Options include a False Discovery Rate \code{p_correct = "FDR"}, an independent Sidak correction \code{p_correct = "uncorrelated Sidak"}, and an independent Bonferroni correction \code{p_correct = "uncorrelated Bonferroni"}.
#' @param alpha Numeric. The alpha level for significance threshold (default in \code{\link{spatial_power}} and \code{\link{jitter_power}} functions is 0.05).
#'
#' @details This function provides functionality for multiple testing correction in five ways:
#'
#' \enumerate{
#' \item Computes a False Discovery Rate by Benjamini and Hochberg \doi{10.1111/j.2517-6161.1995.tb02031.x} (\code{p_correct = "FDR"}) by: 1) sorting the p-values (p_i) of each knot in ascending order (p_1 <= p_2 <= ... <= p_m), 2) starting from p_m find the first p_i for which p_i <= (i/m) * alpha.
#' \item Computes an independent Sidak correction \doi{10.2307/2283989} (\code{p_correct = "uncorrelated Sidak"}) by 1 - (1 - \code{alpha}) ^ (1 / total number of gridded knots across the estimated surface). The default in the \code{\link[sparr]{risk}} function is a resolution of 128 x 128 or n = 16,384 knots and a custom resolution can be specified using the \code{resolution} argument within the \code{\link[sparr]{risk}} function.
#' \item Computes an independent Bonferroni correction (\code{p_correct = "uncorrelated Bonferroni"}) by \code{alpha} / total number of gridded knots across the estimated surface. The default in the \code{\link[sparr]{risk}} function is a resolution of 128 x 128 or n = 16,384 knots and a custom resolution can be specified using the \code{resolution} argument within the \code{\link[sparr]{risk}} function.
#' }
#'
#' @return An object of class 'numeric' with the corrected alpha level.
#'
#' @export
#'
#' @keywords internal
#'
pval_correct <- function(input,
type = c("FDR", "Sidak", "Bonferroni"),
alpha = 0.05,
nbc = NULL) {

# False Discovery Rate
if (type == "FDR") {
sort_pvals <- sort(as.vector(input$P$v), decreasing = TRUE)

fdr <- function(pvals, alpha) {
m <- length(pvals)
for (i in 1:length(pvals)) {
if (pvals[i] <= (i/m) * alpha) { return(pvals[i]) }
}
}

out_alpha <- fdr(sort_pvals, alpha)
return(out_alpha)
}

# Uncorrelated Bonferroni correction
if (type == "Sidak") {
out_alpha <- 1 - (1 - alpha) ^ (1 / prod(input$P$dim) )
return(out_alpha)
}

# Uncorrelated Bonferroni correction
if (type == "Bonferroni") {
out_alpha <- alpha / prod(input$P$dim)
return(out_alpha)
}
}
Loading

0 comments on commit b85e417

Please sign in to comment.