Commit 71373f2

Smeds, fgvieira authored and committed
feat: gatk DenoisedReadCounts wrapper (snakemake#1319)
Add gatk DenoiseReadCountWrapper

<!-- Make sure that you can tick the boxes below. -->

* [ ] I confirm that:

For all wrappers added by this PR,

* there is a test case which covers any introduced changes,
* `input:` and `output:` file paths in the resulting rule can be changed arbitrarily,
* either the wrapper can only use a single core, or the example rule contains a `threads: x` statement with `x` being a reasonable default,
* rule names in the test case are in [snake_case](https://en.wikipedia.org/wiki/Snake_case) and somehow tell what the rule is about or match the tool's purpose or name (e.g., `map_reads` for a step that maps reads),
* all `environment.yaml` specifications follow [the respective best practices](https://stackoverflow.com/a/64594513/2352071),
* wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in `input:` or `output:`),
* all fields of the example rules in the `Snakefile`s and their entries are explained via comments (`input:`/`output:`/`params:` etc.),
* `stderr` and/or `stdout` are logged correctly (`log:`), depending on the wrapped tool,
* temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function `tempfile.gettempdir()` points to (see [here](https://docs.python.org/3/library/tempfile.html#tempfile.gettempdir); this also means that using any Python `tempfile` default behavior works),
* the `meta.yaml` contains a link to the documentation of the respective tool or command,
* `Snakefile`s pass the linting (`snakemake --lint`),
* `Snakefile`s are formatted with [snakefmt](https://github.com/snakemake/snakefmt),
* Python wrapper scripts are formatted with [black](https://black.readthedocs.io).
* Conda environments use a minimal amount of channels, in recommended ordering. E.g. for bioconda, use (conda-forge, bioconda, nodefaults, as conda-forge should have highest priority and defaults channels are usually not needed because most packages are in conda-forge nowadays).

---------

Co-authored-by: Filipe G. Vieira <[email protected]>
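The checklist item on temporary files can be illustrated with a minimal sketch using only the Python standard library. The helper names `run_with_scratch_dir` and `write_and_measure` are hypothetical and are not part of this PR; the point is only that `tempfile.TemporaryDirectory()` creates its directory under the location reported by `tempfile.gettempdir()` by default and removes it on exit.

```python
import os
import tempfile


def run_with_scratch_dir(work):
    """Call work(path) with a fresh temporary directory and return its result.

    The directory and everything in it are deleted when the block exits.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        return work(tmpdir)


def write_and_measure(tmpdir):
    """Hypothetical workload: write one scratch file and report its size."""
    path = os.path.join(tmpdir, "scratch.txt")
    with open(path, "w") as handle:
        handle.write("intermediate data")
    return tmpdir, os.path.getsize(path)


used_dir, size = run_with_scratch_dir(write_and_measure)
print(f"wrote {size} bytes; scratch dir still exists: {os.path.exists(used_dir)}")
```

Because cleanup is tied to the `with` block, a tool that writes scratch files there leaves nothing behind in the working directory.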
1 parent 2046596 commit 71373f2

File tree

5 files changed

+75
-0
lines changed

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
channels:
  - conda-forge
  - bioconda
  - nodefaults
dependencies:
  - gatk4 =4.4.0.0
  - snakemake-wrapper-utils =0.5.3
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
name: gatk DenoiseReadCounts
url: https://gatk.broadinstitute.org/hc/en-us/articles/13832751133851-DenoiseReadCounts
description: |
  Denoises read counts to produce denoised copy ratios
authors:
  - Patrik Smeds
input:
  - hdf5: TSV or HDF5 file with counts from CollectReadCounts.
  - pon: Panel-of-normals from CreateReadCountPanelOfNormals (optional)
  - gc_interval: GC-content annotated-intervals from AnnotateIntervals (optional)
output:
  - std_copy_ratio: Standardized-copy-ratios file
  - denoised_copy_ratio: Denoised-copy-ratios file
params:
  - java_opts: additional arguments to be passed to the Java virtual machine (JVM), e.g. "-XX:ParallelGCThreads=10" (not for `-Xmx` or `-Djava.io.tmpdir`, since they are handled automatically).
  - extra: additional program arguments.
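The `java_opts` note says that the memory limit and the Java temp directory are set automatically rather than by the user. The sketch below is a simplified, hypothetical illustration of that idea and is not the code of `get_java_opts` from `snakemake_wrapper_utils`; the function name `sketch_java_opts` and its parameters are assumptions made for this example.

```python
def sketch_java_opts(java_opts="", mem_mb=None, tmpdir=None):
    """Combine user-supplied JVM options with automatically derived ones.

    Hypothetical helper: user options come first, then a heap limit built
    from mem_mb and a temp directory setting, each only when provided.
    """
    parts = [java_opts] if java_opts else []
    if mem_mb is not None:
        parts.append(f"-Xmx{int(mem_mb)}M")
    if tmpdir is not None:
        parts.append(f"-Djava.io.tmpdir={tmpdir}")
    return " ".join(parts)


print(sketch_java_opts("-XX:ParallelGCThreads=10", mem_mb=1024, tmpdir="/tmp"))
```

Under this assumption, a rule only needs to declare `resources: mem_mb=...`, and the user-facing `java_opts` stays reserved for tuning flags such as garbage-collector threads.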
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
rule denoisereadcounts:
    input:
        hdf5=["a.counts.hdf5"],
    output:
        std_copy_ratio="a.standardizedCR.tsv",
        denoised_copy_ratio="a.denoisedCR.tsv",
    log:
        "logs/gatk/denoisereadcounts.log",
    params:
        extra="",  # optional
        java_opts="",  # optional
    resources:
        mem_mb=1024,
    wrapper:
        "master/bio/gatk/denoisereadcounts"
12.5 KB
Binary file not shown.
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
__author__ = "Patrik Smeds"
__copyright__ = "Copyright 2023, Patrik Smeds"
__email__ = "[email protected]"
__license__ = "MIT"


import tempfile
from snakemake.shell import shell
from snakemake_wrapper_utils.java import get_java_opts


# Optional inputs: build each flag as an f-string so the path is filled in
# here; shell() expands placeholders in the command template in one pass,
# so a placeholder nested inside a substituted value would stay literal.
panel_of_normal = ""
if snakemake.input.get("pon", None):
    panel_of_normal = f"--count-panel-of-normals {snakemake.input.pon}"


gc_intervals = ""
if snakemake.input.get("gc_interval", None):
    gc_intervals = f"--annotated-intervals {snakemake.input.gc_interval}"


extra = snakemake.params.get("extra", "")
java_opts = get_java_opts(snakemake)

# Redirect both stdout and stderr of the tool to the rule's log file.
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

# GATK scratch files go to a temporary directory that is removed afterwards.
with tempfile.TemporaryDirectory() as tmpdir:
    shell(
        "gatk --java-options '{java_opts}' DenoiseReadCounts"
        " -I {snakemake.input.hdf5} "
        " {panel_of_normal}"
        " {gc_intervals}"
        " --standardized-copy-ratios {snakemake.output.std_copy_ratio}"
        " --denoised-copy-ratios {snakemake.output.denoised_copy_ratio}"
        " --tmp-dir {tmpdir}"
        " {extra}"
        " {log}"
    )
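The wrapper adds `--count-panel-of-normals` and `--annotated-intervals` only when the matching input is present. A standalone sketch of that pattern, runnable without Snakemake, might look like the following. The helpers `optional_flag` and `build_command` are hypothetical names introduced for illustration, and quoting with `shlex.quote` is an addition the wrapper itself does not perform; only the GATK flag names are taken from the diff above.

```python
import shlex


def optional_flag(flag, value):
    """Return 'flag value' when value is set, otherwise an empty string."""
    if not value:
        return ""
    return f"{flag} {shlex.quote(str(value))}"


def build_command(hdf5, std_out, denoised_out, pon=None, gc_interval=None):
    """Assemble a DenoiseReadCounts command line, skipping unset options."""
    parts = [
        "gatk DenoiseReadCounts",
        f"-I {shlex.quote(hdf5)}",
        optional_flag("--count-panel-of-normals", pon),
        optional_flag("--annotated-intervals", gc_interval),
        f"--standardized-copy-ratios {shlex.quote(std_out)}",
        f"--denoised-copy-ratios {shlex.quote(denoised_out)}",
    ]
    # Empty strings from unset optional inputs are dropped before joining.
    return " ".join(part for part in parts if part)


print(build_command("a.counts.hdf5", "a.standardizedCR.tsv", "a.denoisedCR.tsv"))
print(
    build_command(
        "a.counts.hdf5",
        "a.standardizedCR.tsv",
        "a.denoisedCR.tsv",
        pon="pon.hdf5",
    )
)
```

Resolving each optional value at the point where the flag string is built keeps the final command free of unexpanded placeholders, whichever templating step runs afterwards.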
