Changes from 3 commits
7 changes: 7 additions & 0 deletions bio/gatk/validatevariants/environment.yaml
@@ -0,0 +1,7 @@
channels:
- bioconda
- conda-forge
- defaults
dependencies:
- gatk4 ==4.1.4.1
- snakemake-wrapper-utils ==0.1.3
12 changes: 12 additions & 0 deletions bio/gatk/validatevariants/meta.yaml
@@ -0,0 +1,12 @@
name: GATK ValidateVariants
description: Validate a VCF file for adherence to the VCF format
url: https://gatk.broadinstitute.org/hc/en-us/articles/360037057272-ValidateVariants
authors:
- Graeme Ford
input:
  vcf: VCF file to be validated
output:
  - VCF output file
params:
  extra: any extra commands as a string
notes: Multiple threads can be used during compression of the output file with ``pigz``.
13 changes: 13 additions & 0 deletions bio/gatk/validatevariants/test/Snakefile
@@ -0,0 +1,13 @@
rule vcf_spec_validation:
    input:
        vcf="sample.vcf",
    output:
        "results/sample_VALID.vcf",
Contributor

I think the output should just have a .txt suffix or so, since this tool will not create a vcf file, right?

    log:
        "results/sample_VALID.log",
    params:
        R="genome.fasta",
Contributor

I cannot find this to be used in the wrapper. On the other hand, extra is missing. Maybe you meant to write

Suggested change
-        R="genome.fasta",
+        extra="",  # optional extra arguments

    resources:
        mem_mb=1024,
    wrapper:
        "master/bio/gatk/validatevariants"
3 changes: 3 additions & 0 deletions bio/gatk/validatevariants/test/genome.dict
@@ -0,0 +1,3 @@
@HD VN:1.5
@SQ SN:ref LN:45 M5:7a66cae8ab14aef8d635bc80649e730b UR:file:/home/johannes/scms/snakemake-wrappers/bio/picard/createsequencedictionary/test/genome.fasta
@SQ SN:ref2 LN:40 M5:1636753510ec27476fdd109a6684680e UR:file:/home/johannes/scms/snakemake-wrappers/bio/picard/createsequencedictionary/test/genome.fasta
4 changes: 4 additions & 0 deletions bio/gatk/validatevariants/test/genome.fasta
@@ -0,0 +1,4 @@
>ref
AGCATGTTAGATAAGATAGCTGTGCTAGTAGGCAGTCAGCGCCAT
>ref2
aggttttataaaacaattaagtctacagagcaactacgcg
2 changes: 2 additions & 0 deletions bio/gatk/validatevariants/test/genome.fasta.fai
@@ -0,0 +1,2 @@
ref 45 5 45 46
ref2 40 57 40 41
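The five columns of each `.fai` line (name, length, byte offset, bases per line, bytes per line) can be derived from the FASTA above. The following is a minimal sketch, not the samtools implementation. It assumes Unix line endings, ASCII content and a trailing newline, and the function name `build_fai` is hypothetical.

```python
def build_fai(fasta_text):
    """Return (name, length, offset, linebases, linewidth) for each sequence."""
    entries = []
    offset = 0  # offset of the current line from the start of the file
    name = None
    length = seq_offset = linebases = linewidth = 0
    for line in fasta_text.splitlines(keepends=True):
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                entries.append((name, length, seq_offset, linebases, linewidth))
            name = line[1:].split()[0]
            length = linebases = linewidth = 0
            # The sequence starts right after the header line.
            seq_offset = offset + len(line)
        else:
            bases = len(line.rstrip("\r\n"))
            if linebases == 0:
                # The first sequence line defines the line layout.
                linebases, linewidth = bases, len(line)
            length += bases
        offset += len(line)
    if name is not None:
        entries.append((name, length, seq_offset, linebases, linewidth))
    return entries


# The test FASTA from this PR, assuming "\n" line endings.
FASTA = (
    ">ref\n"
    "AGCATGTTAGATAAGATAGCTGTGCTAGTAGGCAGTCAGCGCCAT\n"
    ">ref2\n"
    "aggttttataaaacaattaagtctacagagcaactacgcg\n"
)

for entry in build_fai(FASTA):
    print("\t".join(str(field) for field in entry))
```

Under those assumptions the printed rows agree with the two index lines in this file.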
18 changes: 18 additions & 0 deletions bio/gatk/validatevariants/test/sample.vcf
@@ -0,0 +1,18 @@
##fileformat=VCFv4.0
##fileDate=20170110
##source=pindel
##reference=hg38
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##INFO=<ID=HOMLEN,Number=1,Type=Integer,Description="Length of base pair identical micro-homology at event breakpoints">
##INFO=<ID=PF,Number=1,Type=Integer,Description="The number of samples carry the variant">
##INFO=<ID=HOMSEQ,Number=.,Type=String,Description="Sequence of base pair identical micro-homology at event breakpoints">
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="Difference in length between REF and ALT alleles">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">
##INFO=<ID=NTLEN,Number=.,Type=Integer,Description="Number of bases inserted in place of deleted code">
##FORMAT=<ID=PL,Number=3,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=RD,Number=1,Type=Integer,Description="Reference depth, how many reads support the reference">
##FORMAT=<ID=AD,Number=2,Type=Integer,Description="Allele depth, how many reads support this allele">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT a
ref 8 . A G . PASS END=140434574;HOMLEN=5;HOMSEQ=AAAAA;SVLEN=35;SVTYPE=INS GT:AD 0/0:317,6
ref2 10 . A AGTTA . PASS END=55238278;HOMLEN=7;HOMSEQ=CTGCCAC;SVLEN=-51;SVTYPE=DEL GT:AD 0/0:40789,1734
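One check a VCF validator can perform when it is given a reference is whether each record's REF allele matches the reference bases at that position. The sketch below illustrates that idea against the test data in this PR. It is not GATK's implementation, the helper name `ref_matches` is hypothetical, and the comparison is assumed to be case-insensitive. The wrapper in this PR does not pass a reference, so a check of this kind would not run there.

```python
# Reference sequences copied from genome.fasta in this PR.
REFERENCE = {
    "ref": "AGCATGTTAGATAAGATAGCTGTGCTAGTAGGCAGTCAGCGCCAT",
    "ref2": "aggttttataaaacaattaagtctacagagcaactacgcg",
}

# (CHROM, POS, REF, ALT) taken from the two records in sample.vcf.
RECORDS = [
    ("ref", 8, "A", "G"),
    ("ref2", 10, "A", "AGTTA"),
]


def ref_matches(reference, chrom, pos, ref_allele):
    """Compare a REF allele with the reference bases at a 1-based position."""
    start = pos - 1  # VCF positions are 1-based
    observed = reference[chrom][start:start + len(ref_allele)]
    return observed.upper() == ref_allele.upper()


for chrom, pos, ref_allele, _alt in RECORDS:
    status = "ok" if ref_matches(REFERENCE, chrom, pos, ref_allele) else "REF mismatch"
    print(f"{chrom}:{pos} REF={ref_allele} -> {status}")
```

With this test data the first record is reported as a mismatch (the reference base at `ref:8` is `T`), while the second record matches.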
25 changes: 25 additions & 0 deletions bio/gatk/validatevariants/wrapper.py
@@ -0,0 +1,25 @@
__author__ = "Graeme Ford"
__copyright__ = "Copyright 2021, Graeme Ford"
__email__ = "[email protected]"
__license__ = "MIT"

from snakemake.shell import shell
from snakemake_wrapper_utils.java import get_java_opts

extra = snakemake.params.get("extra", "")
java_opts = get_java_opts(snakemake)

log = snakemake.log_fmt_shell(stdout=True, stderr=True)

extra = snakemake.params.get("extra", "")


shell(
    "gatk "
    "--java-options '{java_opts}' "
    "ValidateVariants "
    "-V {snakemake.input.vcf} "
    "{extra} "
    "{log} "
    "> {snakemake.output[0]}"
)
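To make the redirections easier to reason about, here is a minimal sketch of assembling an equivalent command string outside Snakemake. The function `build_command` and its parameters are hypothetical and not part of the wrapper. It assumes the validation report goes to stdout and diagnostics to stderr, and the `.txt` output name follows the reviewer's suggestion above.

```python
import shlex


def build_command(vcf, output, log, java_opts="", extra=""):
    """Build a gatk ValidateVariants command line as a single shell string."""
    parts = [
        "gatk",
        f"--java-options {shlex.quote(java_opts)}",
        "ValidateVariants",
        f"-V {shlex.quote(vcf)}",
        extra,  # passed through verbatim, like params.extra in the wrapper
        f"> {shlex.quote(output)}",  # stdout goes to the output file
        f"2> {shlex.quote(log)}",  # stderr goes to the log file
    ]
    # Drop empty parts so an unset `extra` leaves no double space.
    return " ".join(part for part in parts if part)


print(
    build_command(
        "sample.vcf",
        "results/sample_VALID.txt",
        "results/sample_VALID.log",
        java_opts="-Xmx1024M",
    )
)
```

Sending stdout and stderr to separate files keeps the order of the redirections from mattering, which is not the case when both streams are first pointed at the log and stdout is then redirected again.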