* adding the software description to the GUI for deprecated software
* clean docs
* add workflow not active message
* Update qiita_pet/support_files/doc/source/processingdata/qp-fastp-minimap2.rst
Co-authored-by: Daniel McDonald <[email protected]>
---------
Co-authored-by: Charles Cowart <[email protected]>
Co-authored-by: Daniel McDonald <[email protected]>
At the end of August 2023, we discovered that the parameters used by
qp-fastp-minimap2 did not trigger adapter filtering. By default, fastp
autodetects and filters adapters for single-end data, but it does not perform
these operations on paired-end data. We did not expect this behavior. We
discovered it while manually assessing replicated sequences, which a BLAST
search against NT identified as adapters.

Adapter filtering for paired-end data with fastp requires either specifying the
exact adapters to remove (i.e., no autodetection) or explicitly passing
``--detect_adapter_for_pe``. Qiita previously indicated to users that the
qp-fastp-minimap2 plugin was performing adapter autodetection and filtering.
However, because this flag was not specified, that behavior did not occur for
paired-end data.
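
The two options described above can be sketched as a small Python helper that assembles a fastp argument list. This is only an illustration: ``build_fastp_command`` and the file names are hypothetical and are not taken from the qp-fastp-minimap2 plugin. The only fastp options used are ``-i``/``-I``/``-o``/``-O``, ``--adapter_fasta``, and ``--detect_adapter_for_pe``.

```python
from typing import List, Optional


def build_fastp_command(r1_in: str, r1_out: str,
                        r2_in: Optional[str] = None,
                        r2_out: Optional[str] = None,
                        adapter_fasta: Optional[str] = None) -> List[str]:
    """Assemble a fastp argument list (hypothetical helper, not plugin code)."""
    cmd = ["fastp", "-i", r1_in, "-o", r1_out]
    paired = r2_in is not None
    if paired:
        if r2_out is None:
            raise ValueError("paired-end input needs an output path for R2")
        cmd += ["-I", r2_in, "-O", r2_out]
    if adapter_fasta is not None:
        # Option 1: name the adapters explicitly, so no autodetection is needed.
        cmd += ["--adapter_fasta", adapter_fasta]
    elif paired:
        # Option 2: request adapter autodetection for paired-end data explicitly.
        cmd.append("--detect_adapter_for_pe")
    return cmd


if __name__ == "__main__":
    # Placeholder file names, used only to show the resulting argument list.
    print(" ".join(build_fastp_command("R1.fastq.gz", "R1.trimmed.fastq.gz",
                                       "R2.fastq.gz", "R2.trimmed.fastq.gz")))
```

For single-end input the helper adds neither option, because fastp autodetects adapters for single-end data by default.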

In the metagenomic dataset in which the adapters were discovered, we observed a
few highly replicated sequences that were assigned to a few genomes in RS210.
The coverage of those genomes, using all metagenomic short reads, was confined
to very specific regions. The replicated sequences exhibited high identity to
known adapters, so we suspect that they were adapters. We suspect that the
observed genomes either suffer from adapter contamination themselves, or that
the constructs used in the samples we examined were derived from real
organisms. Although we cannot definitively distinguish between these
explanations in the data we examined, in either case these short reads are
likely artifactual.
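
As a rough illustration of how highly replicated sequences might be surfaced for manual inspection, the sketch below counts exact duplicate reads in FASTQ text. It is not the procedure we used. It assumes unwrapped four-line FASTQ records, and ``most_replicated`` is a hypothetical name.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def most_replicated(fastq_lines: Iterable[str], top: int = 5) -> List[Tuple[str, int]]:
    """Return up to `top` sequences that occur more than once, most frequent first."""
    counts: Counter = Counter()
    for i, line in enumerate(fastq_lines):
        # In a four-line FASTQ record, the second line holds the sequence.
        if i % 4 == 1:
            counts[line.strip()] += 1
    return [(seq, n) for seq, n in counts.most_common(top) if n > 1]
```

Candidates reported this way could then be compared against known adapters, for example with BLAST.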

For the dataset we examined, removing these false positives was important
for the biological interpretation of the results. However, whether removal
matters likely depends on the dataset and the question.
29
+
30
+
qp-fastp-minimap2 has been updated to perform adapter filtering on paired-end data.
31
+
The fastp autodetection is compile-time limited to `the first 256k sequences <https://github.com/OpenGene/fastp/blob/7784d047fdf0a8df4211967156f5c97920c6d2e8/src/evaluator.cpp#L410-L417>`_.
32
+
Because of this, we opted for a more conservative approach of not relying on
33
+
autodetection and instead we now test all adapters that fastp is aware of. Specifically,
34
+
we now provide fastp a known adapters FASTA which is a serialized representation
35
+
of their `known adapter list <https://github.com/OpenGene/fastp/blob/7784d047fdf0a8df4211967156f5c97920c6d2e8/src/knownadapters.h#L11>`_.
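
Serializing an adapter list to FASTA could look roughly like the sketch below. The adapter names and sequences are placeholders, not entries from fastp's list, and ``adapters_to_fasta`` is a hypothetical helper, not the code used to build the plugin's FASTA.

```python
from typing import Dict


def adapters_to_fasta(adapters: Dict[str, str]) -> str:
    """Serialize a name -> sequence mapping into FASTA text."""
    records = []
    for name, seq in adapters.items():
        # Replace spaces so the whole name is kept as the FASTA record identifier.
        header = name.strip().replace(" ", "_")
        records.append(f">{header}\n{seq.strip().upper()}\n")
    return "".join(records)


if __name__ == "__main__":
    # Placeholder entries for illustration only.
    example = {"example adapter 1": "ACGTACGTACGT",
               "example adapter 2": "ttgacctgaacc"}
    print(adapters_to_fasta(example), end="")
```

The resulting file could then be passed to fastp through its ``--adapter_fasta`` option.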

The new command is named: `Adapter and host filtering v2023.12`.
qiita_pet/templates/artifact_ajax/artifact_summary.html
@@ -125,6 +125,7 @@ <h4>
{% if processing_info['software_deprecated'] %}
<div class="alert alert-danger" role="alert">
Danger, the software that generated this artifact was produced by a software version with a known bug and the results are wrong, please re-run with the newer version.