+     - `-C, --complement-add`: Take the reverse complement and add it to the data.
+     - `-O, --ordinal-headers`: Replace header with strict ordinals.
+     - `-F, --file-id <STR>`: File ID for ordinals.
+     - `-S, --save-quality <STR>`: Save quality file for back-mapping.
+     - `-A, --save-stats <STR>`: Save quality vs. length statistics file for analysis.
+     - `-K, --skip-remaining`: Do not output FASTA/FASTQ data (assumes -A).
+     - `-H, --keep-header`: Keep header as usual.
+     - `-c, --clip-adapter <STR>`: Clip adapter.
+     - `-m, --mask-adapter <STR>`: Mask adapter.
+     - `-Z, --fuzzy-adapter`: Allow one mismatch.
+     - `-U, --uracil-to-thymine`: Convert uracil to thymine.
+     - `-E, --enforce-clipped-length`: Enforce the minimum length threshold (-L) on adapter-clipped reads (-c).
+     - `-R, --read-side <INT>`: If the FASTQ header is in SRA format and is missing a read identifier, alter the header.
+
    - It uses configuration **options** at this step, such as:
      - `ADAPTER`: Transposase adapter sequence. To disable adapter trimming, set it to an empty string (`ADAPTER=""`). It trims 5′ on the forward adapter and 3′ on the reverse complement adapter. It can be applied to Nextera paired-end reads.
      - `FUZZY_ADAPTER`: If `ADAPTER` is set and `FUZZY_ADAPTER` is enabled, adapters with up to 1 mismatch are also trimmed.
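
The `ADAPTER` / `FUZZY_ADAPTER` behaviour described above can be illustrated with a minimal Python sketch. It is not the pipeline's Perl code. It assumes one reading of the description: everything up to and including a forward adapter match is dropped (5′ trim), and everything from a reverse-complement adapter match onward is dropped (3′ trim). The function names and the short adapter used in the usage note are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_adapter(read: str, adapter: str, max_mismatches: int = 0) -> int:
    """Return the first index where `adapter` aligns to `read` with at most
    `max_mismatches` substitutions, or -1 if no such position exists."""
    for start in range(len(read) - len(adapter) + 1):
        window = read[start:start + len(adapter)]
        mismatches = sum(1 for a, b in zip(window, adapter) if a != b)
        if mismatches <= max_mismatches:
            return start
    return -1


def clip_adapter(read: str, adapter: str, fuzzy: bool = False) -> str:
    """Trim a read as assumed in the lead-in: 5' trim on the forward adapter,
    3' trim on its reverse complement. An empty adapter disables trimming."""
    if not adapter:
        return read
    allowed = 1 if fuzzy else 0  # fuzzy mode tolerates exactly one mismatch

    # Assumption: drop everything up to and including the forward adapter.
    pos = find_adapter(read, adapter, allowed)
    if pos != -1:
        read = read[pos + len(adapter):]

    # Assumption: drop the reverse-complement adapter and everything after it.
    pos = find_adapter(read, reverse_complement(adapter), allowed)
    if pos != -1:
        read = read[:pos]
    return read
```

With the hypothetical adapter `ACGTTG`, `clip_adapter("ACGTTGTTTTCCCCCAACGT", "ACGTTG")` keeps only the insert between the two adapter copies, while a read whose adapter carries a single substitution is only trimmed when `fuzzy=True`.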
@@ -230,6 +250,13 @@ As the default `SORT_PROG` is the same as the `MATCH_PROG`, it assumes that this
 6. _parseSORTresults_: Perl script. Analyzes the results generated by SORT (Sequence Occupancy Read Trace) to determine the occurrence count of each target sequence and the number of reads contributing to each occurrence (score). The script also processes the number of reads used to generate each occurrence, which appears to be encoded in the sequence identifiers (ID) in the SORT_results.tab file. Filters determine which sequences are considered valid for further analysis. For example, you can specify a minimum read count (-C) and a minimum read pattern count (-D). You can also choose to ignore annotations in sequence identifiers (-G). If a list of patterns is provided (-P), the script divides sequences into groups based on these patterns and selects the best sequence from each group, as well as any secondary sequence that meets the filter criteria. Patterns can also be provided to ban sequences (-B), which excludes them from the analysis.

+   - **Possible parameters**:
+     - `-P, --pattern-list <STRING>`: Comma-separated list of patterns to group genes. The special value `__ALL__` selects the top gene.
+     - `-G, --ignore-annotations`: Ignore annotations in target identifiers.
+     - `-C, --min-read-count <INTEGER>`: Minimum read count threshold for a target to be considered valid (default = 1).
+     - `-D, --min-read-patterns <INTEGER>`: Minimum read patterns threshold for a target to be considered valid (default = 1).
+     - `-B, --ban-list <STRING>`: Comma-separated list of patterns to ban specific genes.
+
    - Uses **options** from the configuration in this step such as:
      - `SORT_GROUPS`: Determines sorting groups for primary and secondary data.
      - `BAN_GROUPS`: Patterns not allowed.
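
The grouping and filtering described for _parseSORTresults_ can be sketched in Python as follows. This is an illustration under stated assumptions, not the script's actual logic: patterns are treated as plain substrings of the target name, "best" is taken to mean the highest read count, and the thresholds are applied only to secondary targets, following the wording above. The `Target` record, the `select_targets` function, and the target names in the usage note are hypothetical.

```python
from typing import NamedTuple


class Target(NamedTuple):
    """Hypothetical summary of one target parsed from the SORT results."""
    name: str
    read_count: int     # reads assigned to the target (compare -C)
    read_patterns: int  # distinct read patterns (compare -D)


def select_targets(targets, patterns, banned=(), min_reads=1, min_patterns=1):
    """Return {pattern: (primary, [secondaries])} for each non-empty group."""
    # Banned patterns (compare -B) remove targets before any grouping.
    allowed = [t for t in targets if not any(b in t.name for b in banned)]

    result = {}
    for pattern in patterns:
        # The special value __ALL__ puts every remaining target in one group.
        group = [t for t in allowed if pattern == "__ALL__" or pattern in t.name]
        if not group:
            continue
        # Assumption: rank by read count, highest first.
        group.sort(key=lambda t: t.read_count, reverse=True)
        primary, rest = group[0], group[1:]
        # Secondary targets must pass both thresholds.
        secondaries = [
            t for t in rest
            if t.read_count >= min_reads and t.read_patterns >= min_patterns
        ]
        result[pattern] = (primary, secondaries)
    return result
```

For example, with targets named `geneA_v1` (500 reads), `geneA_v2` (20 reads), `geneB_v1` (300 reads), and a banned `ctrl_geneA` (900 reads), the patterns `geneA,geneB` with `min_reads=50` yield `geneA_v1` and `geneB_v1` as primaries and no secondaries. Lowering `min_reads` to 1 adds `geneA_v2` as a secondary for `geneA`.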
@@ -718,7 +745,8 @@ This is done because it will store the score from the .sam file into a variable,