Commit 5f2f4b0

major documentation update
1 parent b16eb53 commit 5f2f4b0

6 files changed

+69
-63
lines changed

INSTALL.md

+16-14
Original file line number | Diff line number | Diff line change
@@ -1,17 +1,17 @@
11
# Docker
22
The simplest way to run mity is via docker:
33

4-
docker run drmjc/mity:0.0.1b15 -h
4+
docker run drmjc/mity:0.0.1b40 -h
55

66
# pip
7-
If you have freebayes >=1.2 and gsort installed, then pip should work well
7+
If you have freebayes >=1.2 and Brent Pedersen's gsort installed, then pip should work well
88

9-
VERSION=0.0.1b15
9+
VERSION=0.0.1b40
1010
pip3 install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple mity==$VERSION
1111

1212
# manual installation
1313
If you would prefer to install mity on a fresh Ubuntu installation, the following should work.
14-
We have tested this on a fresh Ubuntu 14.04 image; We use `pyenv` to install python 3.7.4 and there
14+
We have tested this on a fresh Ubuntu 14.04 image. We use `pyenv` to install Python 3.7.4, though there
1515
are a number of alternatives. YMMV.
1616

1717
# install dependencies
@@ -39,7 +39,7 @@ are a number of alternatives. YMMV.
3939
# Python 3.7.4
4040
pip install --upgrade pip
4141
export PATH=$PATH:.local/bin:$HOME/.pyenv/versions/3.7.4/bin
42-
# merge DNAnexus' PYTHONPATH with this from PYTHON3
42+
# if running on a DNAnexus cloud instance, merge DNAnexus' PYTHONPATH with the Python 3 one
4343
export PYTHONPATH=/home/linuxbrew/.linuxbrew/lib/python3.7/site-packages:/usr/share/dnanexus/lib/python2.7/site-packages
4444

4545

@@ -60,13 +60,15 @@ Then install the system dependencies: freebayes (>=1.2.0), htslib (tabix+bgzip),
6060

6161
Either install mity globally:
6262

63-
export PYTHONPATH=/usr/share/dnanexus/lib/python2.7/site-packages
64-
export PYTHONPATH=/usr/local/lib/python3.5/dist-packages:/usr/lib/python3/dist-packages:/usr/share/dnanexus/lib/python2.7/site-packages
63+
# for most users
64+
export PYTHONPATH=/usr/local/lib/python3.7/dist-packages:/usr/lib/python3/dist-packages
65+
# for those using a DNAnexus cloud instance
66+
export PYTHONPATH=/usr/local/lib/python3.7/dist-packages:/usr/lib/python3/dist-packages:/usr/share/dnanexus/lib/python2.7/site-packages
6567

6668
# fix a python version incompatibility bug in futures
6769
sudo perl -pi -e 's|raise exception_type, self._exception, self._traceback|raise Exception(self._exception).with_traceback(self._traceback)|' /usr/share/dnanexus/lib/python2.7/site-packages/concurrent/futures/_base.py
6870

69-
VERSION=0.0.1b15
71+
VERSION=0.0.1b40
7072
pip3 install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple mity==$VERSION
7173

7274
Or install mity using a virtualenv
@@ -76,17 +78,17 @@ Or install mity using a virtualenv
7678
python3 -m venv .
7779
source bin/activate
7880
./bin/pip install wheel
79-
VERSION=0.0.1b15
81+
VERSION=0.0.1b40
8082
./bin/pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple mity==$VERSION
8183

82-
# test
84+
# test on example data
8385
(These URLs are valid until 26/7/2020)
8486
wget https://dl.dnanex.us/F/D/XJfjx2X139ZkzY7b29QQKBppzfj9p5V794Bfqf4G/A1.dedup.realigned.recalibrated.chrMT.bam
8587
wget https://dl.dnanex.us/F/D/qyV40Qgfj6Jgy3zZfJ07vkgXqZvJ6Fb2kXb24fyv/A1.dedup.realigned.recalibrated.chrMT.bam.bai
86-
wget https://dl.dnanex.us/F/D/pVG7PjZy4qKBB6ZKbkkF0X6kB0kxf7ZzjpK7fXjY/hs37d5.fasta-index.tar.gz
87-
tar -xzvf hs37d5.fasta-index.tar.gz; mv genome.dict hs37d5.dict; mv genome.fa hs37d5.fa; mv genome.fa.fai hs37d5.fa.fai
88-
89-
# test post-docker
88+
mity call --normalise A1.dedup.realigned.recalibrated.chrMT.bam
89+
mity report A1.dedup.realigned.recalibrated.chrMT.mity.vcf.gz
90+
91+
# test using docker
9092

9193
wget https://dl.dnanex.us/F/D/XJfjx2X139ZkzY7b29QQKBppzfj9p5V794Bfqf4G/A1.dedup.realigned.recalibrated.chrMT.bam
9294
wget https://dl.dnanex.us/F/D/qyV40Qgfj6Jgy3zZfJ07vkgXqZvJ6Fb2kXb24fyv/A1.dedup.realigned.recalibrated.chrMT.bam.bai

README.md

+49-44
@@ -5,87 +5,92 @@ mity is a bioinformatic analysis pipeline designed to call mitochondrial SNV and
55
* easily integrate with existing nuclear DNA analysis pipelines (mity merge)
66
* provide an annotated report, designed for clinicians and researchers to interrogate
77

8-
98
# Usage
109
mity -h
1110

1211
# Dependencies
12+
* python3 (tested on 3.7.4)
1313
* freebayes >= 1.2.0
1414
* bgzip + tabix
1515
* gsort (https://github.com/brentp/gsort)
16-
* python3 (tested on 3.7.4)
1716
* pyvcf
1817
* xlsxwriter
1918
* pandas
2019

20+
# Installation
21+
Instructions for installing via Docker, pip, or manually are available in INSTALL.md.
22+
2123
# Example Usage
2224
This is an example of calling variants in the Ashkenazim Trio.
2325

24-
First make sure mity is in your PATH variable.
25-
26-
```bash
27-
PATH="PATH_TO_MITY_FOLDER:${PATH}"
28-
export PATH
29-
```
30-
3126
## mity-call
3227
First run mity-call on three MT BAMs provided in mity/test_in
3328

34-
We can run it in normalised mode:
29+
We can run it in normalised mode. We recommend always using `--normalise`, because `mity report` requires a normalised VCF:
3530
```bash
3631
mity call \
3732
--prefix ashkenazim \
38-
--out-folder-path test_out/normalised \
39-
--min-alternate-fraction 0.5 \
33+
--out-folder-path test_out \
4034
--region MT:1-500 \
4135
--normalise \
42-
--p 0.001 \
4336
test_in/HG002.hs37d5.2x250.small.MT.RG.bam \
4437
test_in/HG003.hs37d5.2x250.small.MT.RG.bam \
4538
test_in/HG004.hs37d5.2x250.small.MT.RG.bam
4639
```
47-
This should create test_out/normalised/ashkenazim.mity.vcf.gz and test_out/normalised/ashkenazim.mity.vcf.gz.tbi
48-
49-
We can run it without the normalisation:
50-
51-
```bash
52-
mity call \
53-
--prefix ashkenazim \
54-
--out-folder-path test_out/unnormalised \
55-
--min-alternate-fraction 0.5 \
56-
--region MT:1-500 \
57-
--p 0.001 \
58-
test_in/HG002.hs37d5.2x250.small.MT.RG.bam \
59-
test_in/HG003.hs37d5.2x250.small.MT.RG.bam \
60-
test_in/HG004.hs37d5.2x250.small.MT.RG.bam
61-
```
62-
63-
This should create test_out/unnormalised/ashkenazim.mity.vcf.gz and test_out/unnormalised/ashkenazim.mity.vcf.gz.tbi
40+
This will create `test_out/ashkenazim.mity.vcf.gz` (and its `.tbi` index).
6441

6542
## mity-report
6643

67-
We can create a mity report on the normalised VCF:
44+
We can create a `mity report` on the normalised VCF:
6845
```bash
6946
mity report \
7047
--prefix ashkenazim \
71-
--min_vaf 0.1 \
72-
--out-folder-path /Users/putticc/Projects/mity/test_out/normalised \
73-
test_out/normalised/ashkenazim.mity.vcf.gz
48+
--min_vaf 0.01 \
49+
--out-folder-path test_out \
50+
test_out/ashkenazim.mity.vcf.gz
7451
```
52+
This will create: `test_out/ashkenazim.annotated_variants.csv` and `test_out/ashkenazim.annotated_variants.xlsx`.
53+
54+
## mity-normalise
55+
High-depth sequencing and sensitive variant calling can create many variants with more than 2 alleles, and in some
56+
cases, join two nearby variants separated by shared REF sequence into a multi-nucleotide polymorphism
57+
as discussed in the manuscript. Here, variant normalisation means decomposing the multi-allelic variants and,
58+
where possible, splitting multi-nucleotide polymorphisms into their cognate smaller variants. At the time of writing,
59+
all variant decomposition tools we used failed to propagate the metadata in a multi-allelic variant to the split
60+
variants, which caused problems when reporting the quality scores associated with each variant.
61+
62+
Technically you can run `mity call` and `mity normalise` separately, but since `mity report` requires a normalised
63+
vcf file, we recommend running `mity call --normalise`.
7564

76-
This should create: test_out/normalised/ashkenazim.annotated_variants.csv and test_out/normalised/ashkenazim.annotated_variants.xlsx
65+
## mity-merge
66+
You can merge a nuclear vcf.gz file and a mity.vcf.gz file thereby replacing the MT calls from the nuclear VCF (
67+
presumably from a caller like HaplotypeCaller which is not able to sensitively call mitochondrial variants) with
68+
the calls from mity.
7769

78-
On the unnormalised VCF - this doesn't work.
7970
```bash
80-
mity report \
71+
mity merge \
8172
--prefix ashkenazim \
82-
--min_vaf 0.1 \
83-
--out-folder-path /Users/putticc/Projects/mity/test_out/unnormalised \
84-
test_out/unnormalised/ashkenazim.mity.vcf.gz
73+
--mity_vcf test_out/ashkenazim.mity.vcf.gz \
74+
--nuclear_vcf todo-create-example-nuclear.vcf.gz
8575
```
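
As a rough illustration of what "replacing the MT calls" means, the sketch below works on plain VCF text lines rather than real vcf.gz files. It is not mity's implementation: the function name `replace_mt_calls`, the `mt_contigs` parameter and the sample records are hypothetical, and a real merge would also have to reconcile headers and sample columns, then re-sort, compress and index the result.

```python
def replace_mt_calls(nuclear_lines, mity_lines, mt_contigs=("MT", "chrM")):
    """Return nuclear header + nuclear non-MT records + mity records.

    Hypothetical helper for illustration only; not part of mity.
    """
    header = [line for line in nuclear_lines if line.startswith("#")]
    # Keep nuclear records whose CHROM (first tab-separated field) is not mitochondrial.
    nuclear_records = [
        line for line in nuclear_lines
        if not line.startswith("#") and line.split("\t", 1)[0] not in mt_contigs
    ]
    # Take every record (non-header line) from the mity VCF.
    mity_records = [line for line in mity_lines if not line.startswith("#")]
    return header + nuclear_records + mity_records


# Made-up records: one nuclear call, one low-quality nuclear MT call,
# and two MT calls standing in for mity output.
nuclear = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t50\tPASS\t.",
    "MT\t73\t.\tA\tG\t30\tPASS\t.",
]
mity = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "MT\t73\t.\tA\tG\t900\tPASS\t.",
    "MT\t263\t.\tA\tG\t850\tPASS\t.",
]
merged = replace_mt_calls(nuclear, mity)
```

In this toy input the nuclear caller's MT record is dropped and both mity MT records are kept, while the chromosome 1 record passes through untouched.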
86-
## mity-merge
76+
77+
# Recommendations for interpreting the report
78+
Assuming that you are looking for a pathogenic variant underlying a patient with a rare genetic disorder potentially
79+
caused by a mitochondrial mutation, then we recommend the following strategy:
80+
1. tier 1 or 2 variants included in the 'commercial_panels' column
81+
2. tier 1 or 2 variants that match the clinical presentation and the phenotype in 'disease_mitomap', preferably
82+
those that are annotated with Confirmed evidence in the 'status_mitomap' column
83+
3. exclude common variants: anything linked to 'phylotree_haplotype', high
84+
'MGRB_frequency', high 'GenBank_frequency_mitomap'.
85+
4. consider any remaining tier 1 or 2 variants that may have a predicted impact on tRNA
86+
5. consider any remaining variants with high numbers of 'variant_references_mitomap'
87+
6. if you have analysed multiple family members, consider variants whose level of 'variant_heteroplasmy' matches the
88+
disease burden
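
The first three steps of the strategy above could be sketched as a filter over the annotated variants CSV. This is only an illustration under stated assumptions: the `tier` column name, the 0.01 frequency cutoff, the treatment of an empty cell as zero frequency, and the sample rows are all hypothetical; the other column names are the ones quoted in the list.

```python
import csv
import io

# Hypothetical rows standing in for ashkenazim.annotated_variants.csv.
SAMPLE = """tier,commercial_panels,disease_mitomap,status_mitomap,phylotree_haplotype,MGRB_frequency,GenBank_frequency_mitomap
1,panelA,MELAS,Confirmed,,0.0001,0.0002
2,,LHON,Reported,,0.0,0.0005
1,,,,H2a2a,0.35,0.40
3,panelB,,,,0.0,0.0
"""

MAX_FREQ = 0.01  # assumed cutoff for "common"; tune to your own data


def as_float(value):
    """Treat an empty cell as frequency 0.0 (an assumption)."""
    return float(value) if value else 0.0


def is_candidate(row):
    """Keep tier 1 or 2 variants that do not look like common polymorphisms."""
    if row["tier"] not in ("1", "2"):        # steps 1-2: tier 1 or 2 only
        return False
    if row["phylotree_haplotype"]:           # step 3: linked to a haplotype
        return False
    if as_float(row["MGRB_frequency"]) > MAX_FREQ:
        return False
    if as_float(row["GenBank_frequency_mitomap"]) > MAX_FREQ:
        return False
    return True


def priority(row):
    """Lower sorts first: panel hits, then Confirmed in MITOMAP, then the rest."""
    if row["commercial_panels"]:
        return 0
    if row["status_mitomap"] == "Confirmed":
        return 1
    return 2


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
candidates = sorted((r for r in rows if is_candidate(r)), key=priority)
```

Matching 'disease_mitomap' against the clinical presentation, and the later steps on tRNA impact, references and heteroplasmy, still need manual review; the sketch only narrows the list.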
8789

8890
# Acknowledgements
89-
We thank the Kinghorn Centre for Clinical Genomics and collaborators, who helped
90-
with feedback for running mity.
91-
We thank Eric Talevich who's CNVkit helped us structure mity as a package
91+
We would like to thank
92+
* The Kinghorn Centre for Clinical Genomics and collaborators, who helped with feedback for running mity.
93+
* The Genome in a Bottle consortium for providing the test data used here
94+
* Eric Talevich, whose CNVkit helped us structure mity as a package
95+
* Erik Garrison for developing FreeBayes and for his early feedback on optimising FreeBayes for sensitive variant detection.
96+
* Brent Pedersen for developing gsort

TODO.md

+2-3
@@ -29,9 +29,8 @@ to support hg19 then? GRCh38 and GRCh37 are the same length.
2929
* update docker image once in main pip repo - PENDING
3030

3131
# GitHub (pre-submission)
32-
* CRITICAL: improve documentation
33-
* CRITICAL: update INSTALL.md
34-
* CRITICAL: ensure there is example usage
32+
* merge branch back to master
33+
* push to KCCG
3534

3635
# DNAnexus
3736
* migrate app code to use the latest mity. either via an asset, or Docker image.

mitylib/_version.py

+1-1
@@ -1 +1 @@
1-
__version__ = "0.0.1b40"
1+
__version__ = "0.0.1b41"

mitylib/normalise.py

-1
@@ -1278,7 +1278,6 @@ def do_normalise(vcf, out_file=None, p=0.002, SB_range=[0.1,0.9], min_MQMR=30, m
12781278
:returns: Nothing. This creates a vcf.gz named out_file
12791279
:rtype: None
12801280
"""
1281-
print(p)
12821281
if out_file is None:
12831282
out_file = vcf.replace(".vcf.gz", ".norm.vcf.gz")
12841283

mitylib/report.py

+1
@@ -3,6 +3,7 @@
33
import gzip
44
import pandas
55
import os.path
6+
import xlsxwriter
67
from .util import check_missing_file, create_prefix, make_hgvs, get_annot_file
78

89
def make_table(variants, samples, vep_headers, impact_dict, min_vaf):
