Resolve merge conflict
pd3 committed Dec 16, 2024
2 parents f58fe98 + 061112d commit 40f373d
Showing 19 changed files with 1,212 additions and 329 deletions.
615 changes: 460 additions & 155 deletions bcftools-man.html

Large diffs are not rendered by default.

615 changes: 460 additions & 155 deletions bcftools.html

Large diffs are not rendered by default.

34 changes: 32 additions & 2 deletions howtos/FAQ.html
@@ -4,7 +4,7 @@
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
-<meta name="generator" content="Asciidoctor 2.0.16">
+<meta name="generator" content="Asciidoctor 2.0.15.dev">
<title>Frequently Asked Questions</title>
<link rel="stylesheet" href="./index.css">
</head>
@@ -83,6 +83,36 @@
<div class="sect1">
<h2 id="_frequently_asked_questions">Frequently Asked Questions</h2>
<div class="sectionbody">
<div id="undefined-tag" class="paragraph">
<div class="title"><strong>'XYZ' is not defined in the header, assuming Type=String</strong></div>
<p>The <a href="https://samtools.github.io/hts-specs/VCFv4.3.pdf">VCF specification</a> recommends that all INFO and
FORMAT tags that appear throughout the file body are defined in the VCF header.</p>
</div>
<div class="paragraph">
<p>Fix the header using the <code>reheader</code> command:</p>
</div>
<div class="listingblock">
<div class="content">
<pre># Write out the header to be modified
bcftools view -h old.vcf &gt; header.txt

# Edit the header using your favorite text editor and add the missing definition, e.g.
# ##INFO=&lt;ID=XYZ,Number=1,Type=Integer,Description="Describe the tag"&gt;
vi header.txt

# Reheader the file
bcftools reheader -h header.txt -o new.vcf old.vcf</pre>
</div>
</div>
<div class="paragraph">
<p>Why do you have to do it? Although the VCF specification allows undefined tags, HTSlib and BCFtools internally
treat VCF as BCF, where all tags must be defined in the header. This is because of the way BCF is designed:
the tags throughout the BCF file are represented as pointers to the dictionary of tags stored in the header.
We work around this problem by adding missing definitions on the fly. Note that this can work for read-only operations, but
will still lead to problems when writing the file out as BCF: even though the reader
updated its internal structures with a dummy definition and continued reading, the writer was not
aware of the new tag when the header was written.</p>
</div>
<div id="incorrect-nfields" class="paragraph">
<div class="title"><strong>Incorrect number of fields at chr1:1234567</strong></div>
<p>This error is triggered when the number of values in the data line does not match
@@ -110,7 +140,7 @@ <h2 id="_frequently_asked_questions">Frequently Asked Questions</h2>
</div>
</div>
<div class="paragraph">
-<p>The error above is printed when different number of values is encoutered, for example <code>AC=1</code> or <code>AC=1,1,1</code> in the example above.</p>
+<p>The error above is printed when different number of values is encountered, for example <code>AC=1</code> or <code>AC=1,1,1</code> in the example above.</p>
</div>
<div class="paragraph">
<p>Other such definitions are <code>Number=R</code> (there must be as many values as there are REF+ALT alleles in total),
30 changes: 28 additions & 2 deletions howtos/FAQ.txt
@@ -4,8 +4,34 @@ include::header.inc[]
Frequently Asked Questions
--------------------------

.*Incorrect number of fields at chr1:1234567*
.*'XYZ' is not defined in the header, assuming Type=String*
[#undefined-tag]
The link:https://samtools.github.io/hts-specs/VCFv4.3.pdf[VCF specification] recommends that all INFO and
FORMAT tags that appear throughout the file body are defined in the VCF header.

Fix the header using the `reheader` command:
----
# Write out the header to be modified
bcftools view -h old.vcf > header.txt

# Edit the header using your favorite text editor and add the missing definition, e.g.
# ##INFO=<ID=XYZ,Number=1,Type=Integer,Description="Describe the tag">
vi header.txt

# Reheader the file
bcftools reheader -h header.txt -o new.vcf old.vcf
----
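
The manual edit step above can also be scripted. The following is only a sketch, assuming the header has already been written to a plain text file: `add_header_line` is a hypothetical helper (it is not part of BCFtools), and the toy header stands in for the output of `bcftools view -h old.vcf`. It places the new definition directly before the `#CHROM` line.

```shell
#!/bin/sh
# Sketch: insert a missing tag definition just before the #CHROM line of a
# VCF header file, instead of editing the file by hand.
# add_header_line is a hypothetical helper, not a bcftools command.
add_header_line() {
    # $1 = header file, $2 = definition line to add
    awk -v def="$2" '/^#CHROM/ { print def } { print }' "$1"
}

# Toy header standing in for the output of: bcftools view -h old.vcf
hdr=$(mktemp)
{
    printf '##fileformat=VCFv4.3\n'
    printf '##contig=<ID=chr1>\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
} > "$hdr"

new_hdr=$(add_header_line "$hdr" \
    '##INFO=<ID=XYZ,Number=1,Type=Integer,Description="Describe the tag">')
printf '%s\n' "$new_hdr"
rm -f "$hdr"
```

With real data, the printed header would be saved as `header.txt` and passed to `bcftools reheader -h header.txt -o new.vcf old.vcf` as shown above.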

Why do you have to do it? Although the VCF specification allows undefined tags, HTSlib and BCFtools internally
treat VCF as BCF, where all tags must be defined in the header. This is because of the way BCF is designed:
the tags throughout the BCF file are represented as pointers to the dictionary of tags stored in the header.
We work around this problem by adding missing definitions on the fly. Note that this can work for read-only operations, but
will still lead to problems when writing the file out as BCF: even though the reader
updated its internal structures with a dummy definition and continued reading, the writer was not
aware of the new tag when the header was written.
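
To find out which definitions are missing before running any BCFtools command, the INFO tags used in the file body can be compared against those declared in the header. The sketch below is an illustration only: `list_undefined_info` is a hypothetical helper, it assumes an uncompressed, tab-delimited VCF, and it looks at INFO tags only (FORMAT tags would need a similar pass over column 9).

```shell
#!/bin/sh
# Sketch: print INFO tags that appear in the body of an uncompressed VCF
# but have no ##INFO definition in its header.
# list_undefined_info is a hypothetical helper, not a bcftools command.
list_undefined_info() {
    awk -F'\t' '
        /^##INFO=<ID=/ {
            # Remember every tag ID declared in the header
            id = $0
            sub(/^##INFO=<ID=/, "", id)
            sub(/[,>].*/, "", id)
            defined[id] = 1
            next
        }
        /^#/ { next }
        $8 != "." && $8 != "" {
            # Column 8 is INFO: semicolon-separated KEY=VALUE pairs or flags
            n = split($8, kv, ";")
            for (i = 1; i <= n; i++) {
                tag = kv[i]
                sub(/=.*/, "", tag)
                if (tag != "" && !(tag in defined) && !(tag in seen)) {
                    seen[tag] = 1
                    print tag
                }
            }
        }' "$1"
}

# Toy VCF: AC is defined in the header, XYZ and DB are not
vcf=$(mktemp)
{
    printf '##fileformat=VCFv4.3\n'
    printf '##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count">\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf 'chr1\t100\t.\tA\tC\t.\t.\tAC=1;XYZ=5\n'
    printf 'chr1\t200\t.\tG\tT\t.\t.\tXYZ=7;DB\n'
} > "$vcf"

undefined=$(list_undefined_info "$vcf")
printf '%s\n' "$undefined"
rm -f "$vcf"
```

Each tag printed this way needs its own definition line in the header before the file is reheadered.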


.*Incorrect number of fields at chr1:1234567*
[#incorrect-nfields]
This error is triggered when the number of values in the data line does not match
its definition in the header. For example, one may see an error like
@@ -20,7 +46,7 @@ and expects a value for each ALT allele, for example
----
chr1 64334 . A C,T . . AC=1,1 GT 0/1 0/1
----
-The error above is printed when different number of values is encoutered, for example `AC=1` or `AC=1,1,1` in the example above.
+The error above is printed when different number of values is encountered, for example `AC=1` or `AC=1,1,1` in the example above.

Other such definitions are `Number=R` (there must be as many values as there are REF+ALT alleles in total),
and `Number=G` (this is more complicated, see the section 1.4.2 of the link:http://samtools.github.io/hts-specs/VCFv4.3.pdf[VCF specification]).
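
For the `Number=A` case described above, the offending records can be located with a quick scan before the file is handed to BCFtools. This is a rough sketch under stated assumptions: `check_number_a` is a hypothetical helper, the input is an uncompressed tab-delimited VCF, and a missing ALT (`.`) is not treated specially.

```shell
#!/bin/sh
# Sketch: report records where an INFO tag declared as Number=A does not
# carry exactly one value per ALT allele.
# check_number_a is a hypothetical helper, not a bcftools command.
check_number_a() {
    # $1 = uncompressed VCF, $2 = INFO tag declared with Number=A
    awk -F'\t' -v tag="$2" '
        /^#/ { next }
        {
            nalt = split($5, alt, ",")      # number of ALT alleles
            n = split($8, kv, ";")          # INFO entries
            for (i = 1; i <= n; i++) {
                if (index(kv[i], tag "=") == 1) {
                    val = substr(kv[i], length(tag) + 2)
                    nval = split(val, v, ",")
                    if (nval != nalt)
                        printf "%s:%s %s has %d value(s), expected %d\n", $1, $2, tag, nval, nalt
                }
            }
        }' "$1"
}

# Toy records: the first is valid, the other two trigger the error
vcf=$(mktemp)
{
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf 'chr1\t64334\t.\tA\tC,T\t.\t.\tAC=1,1\n'
    printf 'chr1\t64400\t.\tA\tC,T\t.\t.\tAC=1\n'
    printf 'chr1\t64500\t.\tA\tC,T\t.\t.\tAC=1,1,1\n'
} > "$vcf"

report=$(check_number_a "$vcf" AC)
printf '%s\n' "$report"
rm -f "$vcf"
```

The same loop could be adapted to `Number=R` by comparing against `nalt + 1`.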
6 changes: 3 additions & 3 deletions howtos/bcftools.txt
@@ -408,7 +408,7 @@ Add or remove annotations.

# Annotate from a tab-delimited file with regions (1-based coordinates, inclusive)
tabix -s1 -b2 -e3 annots.tab.gz
-bcftools annotate -a annots.tab.gz -h annots.hdr -c CHROM,FROM,TO,TAG inut.vcf
+bcftools annotate -a annots.tab.gz -h annots.hdr -c CHROM,FROM,TO,TAG input.vcf

# Annotate from a bed file (0-based coordinates, half-closed, half-open intervals)
bcftools annotate -a annots.bed.gz -h annots.hdr -c CHROM,FROM,TO,TAG input.vcf
@@ -1022,7 +1022,7 @@ See the usage examples below.
# %TBCSQ .. print consequences in both haplotypes in separate columns
# %TBCSQ{0} .. print the first haplotype only
# %TBCSQ{1} .. print the second haplotype only
-# %TBCSQ{*} .. print a list of unique consquences present in either haplotype
+# %TBCSQ{*} .. print a list of unique consequences present in either haplotype
bcftools query -f'[%CHROM\t%POS\t%SAMPLE\t%TBCSQ\n]' out.bcf
----

@@ -2069,7 +2069,7 @@ Extracts fields from VCF or BCF files and outputs them in user-defined format.
%SAMPLE Sample name
%POS0 POS in 0-based coordinates
%END End position of the REF allele
-%END0 End position of the REF allele in 0-based cordinates
+%END0 End position of the REF allele in 0-based coordinates
\n new line
\t tab character

2 changes: 1 addition & 1 deletion howtos/cnv-calling.html
@@ -181,7 +181,7 @@ <h3 id="_detecting_subchromosomal_cnvs">Detecting subchromosomal CNVs</h3>
</div>
<div class="listingblock">
<div class="content">
-<pre>bcftools cnv -c conrol_sample -s query_sample -o outdir/ -p 0 file.vcf</pre>
+<pre>bcftools cnv -c control_sample -s query_sample -o outdir/ -p 0 file.vcf</pre>
</div>
</div>
<div class="paragraph">
2 changes: 1 addition & 1 deletion howtos/cnv-calling.txt
@@ -81,7 +81,7 @@ differences between two samples. This greatly helps to reduce the number of
false calls and also allows one to distinguish between normal and novel copy number
variation. The command is
----
-bcftools cnv -c conrol_sample -s query_sample -o outdir/ -p 0 file.vcf
+bcftools cnv -c control_sample -s query_sample -o outdir/ -p 0 file.vcf
----
The ``-p 0`` option tells the program to automatically call matplotlib and
produce plots like the one in this example:
2 changes: 1 addition & 1 deletion howtos/index.html
@@ -97,7 +97,7 @@ <h3 id="_about_bcftools">About BCFtools</h3>
<p>BCFtools is a program for variant calling and manipulating files in the
Variant Call Format (VCF) and its binary counterpart BCF. All commands work
transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.
-In order to avoid tedious repetion, throughout this document we will use
+In order to avoid tedious repetition, throughout this document we will use
"VCF" and "BCF" interchangeably, unless specifically noted.</p>
</div>
<div class="paragraph">
2 changes: 1 addition & 1 deletion howtos/index.txt
@@ -15,7 +15,7 @@ https://github.com/samtools/bcftools/issues[github].
BCFtools is a program for variant calling and manipulating files in the
Variant Call Format (VCF) and its binary counterpart BCF. All commands work
transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.
-In order to avoid tedious repetion, throughout this document we will use
+In order to avoid tedious repetition, throughout this document we will use
"VCF" and "BCF" interchangeably, unless specifically noted.

Most commands accept VCF, bgzipped VCF and BCF with filetype detected
2 changes: 1 addition & 1 deletion howtos/plugin.fixref.html
@@ -155,7 +155,7 @@ <h2 id="_plugin_fixref">Plugin fixref</h2>
</div>
<div class="paragraph">
<p>In the most extreme case when nothing else is working, one can simply force
-the unambigous alleles onto the forward strand and drop the ambigous genotypes.</p>
+the unambiguous alleles onto the forward strand and drop the ambiguous genotypes.</p>
</div>
<div class="listingblock">
<div class="content">
2 changes: 1 addition & 1 deletion howtos/plugin.fixref.txt
@@ -54,7 +54,7 @@ bcftools sort fixref.bcf -Ob -o fixref.sorted.bcf


In the most extreme case when nothing else is working, one can simply force
-the unambigous alleles onto the forward strand and drop the ambigous genotypes.
+the unambiguous alleles onto the forward strand and drop the ambiguous genotypes.
----
bcftools +fixref test.bcf -Ob -o output.bcf -- -f ref.fa -m flip -d
----
157 changes: 157 additions & 0 deletions howtos/plugin.setGT.html
@@ -0,0 +1,157 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="generator" content="Asciidoctor 2.0.15.dev">
<title>Plugin setGT</title>
<link rel="stylesheet" href="./index.css">
</head>
<body class="article">
<div id="header">
</div>
<div id="content">
<div class="sidebarblock navig">
<div class="content">
<div class="ulist">
<div class="title">General</div>
<ul>
<li>
<p><a href="index.html">Main page</a></p>
</li>
<li>
<p><a href="../bcftools.html">Manual page</a></p>
</li>
<li>
<p><a href="install.html">Installation</a></p>
</li>
<li>
<p><a href="publications.html">Publications</a></p>
</li>
</ul>
</div>
<div class="ulist">
<div class="title">Calling</div>
<ul>
<li>
<p><a href="cnv-calling.html">CNV calling</a></p>
</li>
<li>
<p><a href="csq-calling.html">Consequence calling</a></p>
</li>
<li>
<p><a href="consensus-sequence.html">Consensus calling</a></p>
</li>
<li>
<p><a href="roh-calling.html">ROH calling</a></p>
</li>
<li>
<p><a href="variant-calling.html">Variant calling and filtering</a></p>
</li>
</ul>
</div>
<div class="ulist">
<div class="title">Tips and Tricks</div>
<ul>
<li>
<p><a href="convert.html">Converting formats</a></p>
</li>
<li>
<p><a href="annotate.html">Adding annotation</a></p>
</li>
<li>
<p><a href="query.html">Extracting information</a></p>
</li>
<li>
<p><a href="filtering.html">Filtering expressions</a></p>
</li>
<li>
<p><a href="scaling.html">Performance and Scaling</a></p>
</li>
<li>
<p><a href="plugins.html">Plugins</a></p>
</li>
<li>
<p><a href="FAQ.html">FAQ</a></p>
</li>
</ul>
</div>
</div>
</div>
<div id="main">
<div class="sect1">
<h2 id="_plugin_setgt">Plugin setGT</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The plugin <code>+setGT</code> allows one to edit genotypes.</p>
</div>
<div class="paragraph">
<p>The list of plugin-specific options can be obtained by running
<code>bcftools +setGT -h</code>, which will print the following usage page:</p>
</div>
<div class="listingblock">
<div class="content">
<pre>About: Sets genotypes. The target genotypes can be specified as:
./. .. completely missing ("." or "./.", depending on ploidy)
./x .. partially missing (e.g., "./0" or ".|1" but not "./.")
. .. partially or completely missing
a .. all genotypes
b .. heterozygous genotypes failing two-tailed binomial test (example below)
q .. select genotypes using -i/-e options
r:FLOAT .. select randomly a proportion of FLOAT genotypes (can be combined with other modes)
and the new genotype can be one of:
. .. missing ("." or "./.", keeps ploidy)
0 .. reference allele (e.g. 0/0 or 0, keeps ploidy)
c:GT .. custom genotype (e.g. 0/0, 0, 0/1, m/M, 0/X overrides ploidy)
m .. minor (the second most common) allele as determined from INFO/AC or FMT/GT (e.g. 1/1 or 1, keeps ploidy)
M .. major allele as determined from INFO/AC or FMT/GT (e.g. 1/1 or 1, keeps ploidy)
X .. allele with bigger read depth as determined from FMT/AD
p .. phase genotype (0/1 becomes 0|1)
u .. unphase genotype and sort by allele (1|0 becomes 0/1)
Usage: bcftools +setGT [General Options] -- [Plugin Options]
Options:
run "bcftools plugin" for a list of common options

Plugin options:
-e, --exclude EXPR Exclude a genotype if true (requires -t q)
-i, --include EXPR include a genotype if true (requires -t q)
-n, --new-gt TYPE Genotypes to set, see above
-s, --seed INT Random seed to use with -t r [0]
-t, --target-gt TYPE Genotypes to change, see above

Example:
# set missing genotypes ("./.") to phased ref genotypes ("0|0")
bcftools +setGT in.vcf -- -t . -n 0p

# set missing genotypes with DP&gt;0 and GQ&gt;20 to ref genotypes ("0/0")
bcftools +setGT in.vcf -- -t q -n 0 -i 'GT="." &amp;&amp; FMT/DP&gt;0 &amp;&amp; GQ&gt;20'

# set partially missing genotypes to completely missing
bcftools +setGT in.vcf -- -t ./x -n .

# set heterozygous genotypes to 0/0 if binom.test(nAlt,nRef+nAlt,0.5)&lt;1e-3
bcftools +setGT in.vcf -- -t "b:AD&lt;1e-3" -n 0

# force unphased heterozygous genotype if binom.test(nAlt,nRef+nAlt,0.5)&gt;0.1
bcftools +setGT in.vcf -- -t ./x -n c:'m/M'</pre>
</div>
</div>
<div class="sect2">
<h3 id="_feedback">Feedback</h3>
<div class="paragraph">
<p>We welcome your feedback, please help us improve this page by
either opening an <a href="https://github.com/samtools/bcftools/issues">issue on github</a> or <a href="https://github.com/samtools/bcftools/tree/gh-pages/howtos">editing it directly</a> and sending
a pull request.</p>
</div>
</div>
</div>
</div>
</div>
</div>
<div id="footer">
<div id="footer-text">
</div>
</div>
</body>
</html>