Skip to content
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@

* Added `metrics/kbet_pg` and `metrics/kbet_pg_label` components (PR #52).
* Added `method/drvi` component (PR #61).
* Added `method/limma_removebatcheffect` component (PR #79).

## Minor changes

Expand Down
37 changes: 37 additions & 0 deletions src/methods/limma_removebatcheffect/config.vsh.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
__merge__: /src/api/comp_method.yaml
name: limma_removebatcheffect
label: limma removeBatchEffect
summary: Classical linear model-based batch correction from the limma package
description: |
The removeBatchEffect function from the limma package performs linear model-based batch correction.
It fits a linear model to remove batch effects while preserving the biological effects of interest.
This is a classical approach that works at the feature level to directly correct the expression values.

The method fits a linear model with batch as a covariate and removes the batch effects from the data
while preserving biological variation. It is particularly useful for microarray and bulk RNA-seq data,
and has been adapted for single-cell RNA-seq applications.
references:
# Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK.
# limma powers differential expression analyses for RNA-sequencing and microarray studies.
# Nucleic Acids Res. 2015;43(7):e47.
doi: 10.1093/nar/gkv007
links:
repository: https://bioconductor.org/packages/limma/
documentation: https://bioconductor.org/packages/limma/
info:
method_types: [feature]
preferred_normalization: log_cp10k
resources:
- type: r_script
path: script.R
engines:
- type: docker
image: openproblems/base_r:1
setup:
- type: r
bioc: limma
runners:
- type: executable
- type: nextflow
directives:
label: [lowcpu, midmem, midtime]
65 changes: 65 additions & 0 deletions src/methods/limma_removebatcheffect/script.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
cat("Loading dependencies\n")
suppressPackageStartupMessages({
requireNamespace("anndata", quietly = TRUE)
library(Matrix, warn.conflicts = FALSE)
library(limma, warn.conflicts = FALSE)
})

## VIASH START
par <- list(
input = 'resources_test/task_batch_integration/cxg_immune_cell_atlas/dataset.h5ad',
output = 'output.h5ad'
)
meta <- list(
name = "limma_removebatcheffect"
)
## VIASH END

cat("Read input\n")
adata <- anndata::read_h5ad(par$input)

cat("Extract data and metadata\n")
# Extract normalized data
expr_data <- t(adata$layers[["normalized"]])
batch_info <- adata$obs[["batch"]]
obs <- adata$obs
var <- adata$var

# Convert to dgCMatrix if needed
if (inherits(expr_data, "dgRMatrix")) {
dense_temp <- as.matrix(expr_data)
expr_data <- as(dense_temp, "dgCMatrix")
}

cat("Apply limma removeBatchEffect\n")
# Create design matrix (intercept only, as we want to preserve all biological variation)
design <- matrix(1, nrow = ncol(expr_data), ncol = 1)
colnames(design) <- "Intercept"
rownames(design) <- colnames(expr_data)

# Apply batch correction using limma's removeBatchEffect
corrected_data <- limma::removeBatchEffect(
x = expr_data,
batch = batch_info,
design = design
)

cat("Prepare output\n")
# Create output AnnData object with corrected feature matrix
output <- anndata::AnnData(
obs = obs[, c()],
var = var[, c()],
layers = list(
corrected_counts = t(corrected_data)
),
uns = list(
dataset_id = adata$uns[["dataset_id"]],
normalization_id = adata$uns[["normalization_id"]],
method_id = meta$name
)
)

cat("Write output to file\n")
zzz <- output$write_h5ad(par$output, compression = "gzip")

cat("Finished\n")