[JOSS] Functionality documentation #18

Open
1 of 3 tasks
rcannood opened this issue Jan 23, 2025 · 2 comments
Comments

@rcannood
Contributor

rcannood commented Jan 23, 2025

I'm reviewing the JOSS submission of AliNe in openjournals/joss-reviews#7545.

More specifically, I'm reviewing the functionality and the documentation thereof.

Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?

I ran the following command to test the functionality of AliNe:

nextflow run Juke34/AliNe \
  -r v.1.1.0 \
  -latest \
  -profile docker \
  --reads https://github.com/Juke34/AliNe/raw/refs/heads/main/test/illumina/yeast_R1.fastq.gz \
  --genome https://raw.githubusercontent.com/Juke34/AliNe/refs/heads/main/test/yeast.fa \
  --read_type short_single \
  --aligner bbmap,bowtie2,bwaaln,bwamem,bwasw,graphmap2,hisat2,minimap2,ngmlr,nucmer,star,subread,sublong,tophat2,kallisto \
  --trimming_fastp \
  --star_options "--genomeSAindexNbases 9" \
  --multiqc_config https://raw.githubusercontent.com/Juke34/AliNe/refs/heads/main/config/multiqc_conf.yml

This produces the following output:

Output of `tree`
run2
├── alignment_results
│   ├── alignment
│   │   ├── bbmap
│   │   │   ├── indicies
│   │   │   │   └── ref
│   │   │   │       ├── genome
│   │   │   │       │   └── 1
│   │   │   │       │       ├── chr1.chrom.gz
│   │   │   │       │       ├── info.txt
│   │   │   │       │       ├── scaffolds.txt.gz
│   │   │   │       │       └── summary.txt
│   │   │   │       └── index
│   │   │   │           └── 1
│   │   │   │               ├── chr1_index_k13_c10_b1.block
│   │   │   │               └── chr1_index_k13_c10_b1.block2.gz
│   │   │   ├── stats
│   │   │   │   ├── yeast_R1_seqkit_trim_aqhist.txt
│   │   │   │   ├── yeast_R1_seqkit_trim.bbmap.log.txt
│   │   │   │   ├── yeast_R1_seqkit_trim_bhist.txt
│   │   │   │   ├── yeast_R1_seqkit_trim_ehist.txt
│   │   │   │   ├── yeast_R1_seqkit_trim_gchist.txt
│   │   │   │   ├── yeast_R1_seqkit_trim_idhist.txt
│   │   │   │   ├── yeast_R1_seqkit_trim_ihist.txt
│   │   │   │   ├── yeast_R1_seqkit_trim_indelhist.txt
│   │   │   │   ├── yeast_R1_seqkit_trim_lhist.txt
│   │   │   │   ├── yeast_R1_seqkit_trim_mhist.txt
│   │   │   │   ├── yeast_R1_seqkit_trim_qahist.txt
│   │   │   │   ├── yeast_R1_seqkit_trim_qhist.txt
│   │   │   │   └── yeast_R1_seqkit_trim_scafstats.txt
│   │   │   └── yeast_R1_seqkit_trim_sorted.bam
│   │   ├── bowtie2
│   │   │   ├── indicies
│   │   │   │   ├── yeast.1.bt2
│   │   │   │   ├── yeast.2.bt2
│   │   │   │   ├── yeast.3.bt2
│   │   │   │   ├── yeast.4.bt2
│   │   │   │   ├── yeast.rev.1.bt2
│   │   │   │   └── yeast.rev.2.bt2
│   │   │   ├── yeast_R1_seqkit_trim_bowtie2_sorted.bam
│   │   │   └── yeast_R1_seqkit_trim.fastq_bowtie2.log
│   │   ├── bwa
│   │   │   ├── bwaaln
│   │   │   │   └── yeast_R1_seqkit_trim_bwaaln_sorted.bam
│   │   │   ├── bwamem
│   │   │   │   └── yeast_R1_seqkit_trim_bwamem_sorted.bam
│   │   │   ├── bwasw
│   │   │   │   └── yeast_R1_seqkit_trim_bwasw_sorted.bam
│   │   │   └── indicies
│   │   │       ├── yeast.amb
│   │   │       ├── yeast.ann
│   │   │       ├── yeast.bwt
│   │   │       ├── yeast.pac
│   │   │       └── yeast.sa
│   │   ├── graphmap2
│   │   │   ├── indicies
│   │   │   │   └── yeast.fa.gmidx
│   │   │   ├── yeast_R1_seqkit_trim_graphmap2.log
│   │   │   └── yeast_R1_seqkit_trim_graphmap2_sorted.bam
│   │   ├── hisat2
│   │   │   ├── indicies
│   │   │   │   ├── yeast.hisat2_index.1.ht2
│   │   │   │   ├── yeast.hisat2_index.2.ht2
│   │   │   │   ├── yeast.hisat2_index.3.ht2
│   │   │   │   ├── yeast.hisat2_index.4.ht2
│   │   │   │   ├── yeast.hisat2_index.5.ht2
│   │   │   │   ├── yeast.hisat2_index.6.ht2
│   │   │   │   ├── yeast.hisat2_index.7.ht2
│   │   │   │   └── yeast.hisat2_index.8.ht2
│   │   │   ├── yeast_R1.hisat2-summary.txt
│   │   │   ├── yeast_R1_seqkit_trim_sorted.bam
│   │   │   └── yeast_R1_splicesite.txt
│   │   ├── kallisto
│   │   │   ├── indicies
│   │   │   │   └── yeast.kallisto_index
│   │   │   └── yeast_R1_seqkit_trim.bam
│   │   │       ├── abundance.h5
│   │   │       ├── abundance.tsv
│   │   │       ├── pseudoalignments.bam
│   │   │       └── run_info.json
│   │   ├── minimap2
│   │   │   ├── indicies
│   │   │   │   └── yeast.mmi
│   │   │   ├── yeast_R1_seqkit_trim_minimap2.log
│   │   │   └── yeast_R1_seqkit_trim_minimap2_sorted.bam
│   │   ├── ngmlr
│   │   │   ├── yeast_R1_seqkit_trim_ngmlr.log
│   │   │   └── yeast_R1_seqkit_trim_ngmlr_sorted.bam
│   │   ├── nucmer
│   │   │   └── yeast_R1_seqkit_trim_nucmer.fixed_sorted.bam
│   │   ├── star
│   │   │   ├── indicies
│   │   │   │   └── yeast_star_indicies
│   │   │   │       ├── chrLength.txt
│   │   │   │       ├── chrNameLength.txt
│   │   │   │       ├── chrName.txt
│   │   │   │       ├── chrStart.txt
│   │   │   │       ├── Genome
│   │   │   │       ├── genomeParameters.txt
│   │   │   │       ├── Log.out
│   │   │   │       ├── SA
│   │   │   │       └── SAindex
│   │   │   ├── yeast_R1_seqkit_trimAligned.sortedByCoord.out.bam
│   │   │   ├── yeast_R1_seqkit_trimLog.final.out
│   │   │   ├── yeast_R1_seqkit_trimLog.out
│   │   │   ├── yeast_R1_seqkit_trimLog.progress.out
│   │   │   └── yeast_R1_seqkit_trimSJ.out.tab
│   │   ├── sublong
│   │   │   ├── indicies
│   │   │   │   ├── yeast_index.00.b.array
│   │   │   │   ├── yeast_index.00.b.tab
│   │   │   │   ├── yeast_index.files
│   │   │   │   ├── yeast_index.log
│   │   │   │   └── yeast_index.reads
│   │   │   ├── yeast_R1_seqkit_trim_sorted.bam
│   │   │   └── yeast_R1_seqkit_trim_sublong.log
│   │   ├── subread
│   │   │   └── indicies
│   │   │       ├── yeast_index.00.b.array
│   │   │       ├── yeast_index.00.b.tab
│   │   │       ├── yeast_index.files
│   │   │       ├── yeast_index.log
│   │   │       └── yeast_index.reads
│   │   └── tophat2
│   │       ├── indicies
│   │       │   ├── yeast.1.bt2
│   │       │   ├── yeast.2.bt2
│   │       │   ├── yeast.3.bt2
│   │       │   ├── yeast.4.bt2
│   │       │   ├── yeast.rev.1.bt2
│   │       │   └── yeast.rev.2.bt2
│   │       ├── yeast_R1_seqkit_trim
│   │       │   ├── yeast_R1_seqkit_trim
│   │       │   ├── yeast_R1_seqkit_trim_tophat2_align_summary.txt
│   │       │   ├── yeast_R1_seqkit_trim_tophat2_deletions.bed
│   │       │   ├── yeast_R1_seqkit_trim_tophat2_insertions.bed
│   │       │   ├── yeast_R1_seqkit_trim_tophat2_junctions.bed
│   │       │   ├── yeast_R1_seqkit_trim_tophat2_logs
│   │       │   │   ├── bowtie_build.log
│   │       │   │   ├── bowtie.left_kept_reads.log
│   │       │   │   ├── bowtie.left_kept_reads_seg1.log
│   │       │   │   ├── bowtie.left_kept_reads_seg2.log
│   │       │   │   ├── bowtie.left_kept_reads_seg3.log
│   │       │   │   ├── bowtie.left_kept_reads_seg4.log
│   │       │   │   ├── bowtie.left_kept_reads_seg5.log
│   │       │   │   ├── juncs_db.log
│   │       │   │   ├── long_spanning_reads.segs.log
│   │       │   │   ├── prep_reads.log
│   │       │   │   ├── reports.log
│   │       │   │   ├── reports.samtools_sort.log0
│   │       │   │   ├── run.log
│   │       │   │   ├── segment_juncs.log
│   │       │   │   └── tophat.log
│   │       │   └── yeast_R1_seqkit_trim_tophat2_prep_reads.info
│   │       └── yeast_R1_seqkit_trim_tophat2_accepted_hits_sorted.bam
│   ├── fastp
│   │   ├── yeast_R1_fastp_report.html
│   │   └── yeast_R1_seqkit_trim.fastq.gz
│   ├── mean_read_length
│   │   └── yeast_R1_seqkit_trim_sampled_read_length.txt
│   ├── MultiQC
│   │   ├── multiqc_data
│   │   │   ├── mqc_bowtie2_se_plot_1.txt
│   │   │   ├── mqc_hisat2_se_plot_1.txt
│   │   │   ├── mqc_kallisto_alignment_1.txt
│   │   │   ├── mqc_star_alignment_plot_1.txt
│   │   │   ├── multiqc_bowtie2.txt
│   │   │   ├── multiqc_citations.txt
│   │   │   ├── multiqc_data.json
│   │   │   ├── multiqc_general_stats.txt
│   │   │   ├── multiqc_hisat2.txt
│   │   │   ├── multiqc_kallisto.txt
│   │   │   ├── multiqc.log
│   │   │   ├── multiqc_sources.txt
│   │   │   └── multiqc_star.txt
│   │   └── multiqc_report.html
│   └── seqkit_score
│       └── yeast_R1.result.txt
└── pipeline_report
    ├── execution_report_20250123202023.html
    ├── execution_timeline_20250123202023.html
    ├── execution_trace_20250123202023.txt
    └── pipeline_dag.svg

Interestingly, even though tophat2 is supposed to have been removed (see #5, #15), my output still contains the alignment_results/alignment/tophat2 directory.


  • pipeline_report/: This directory seems to contain the standard Nextflow execution reports (execution report, timeline, trace, and pipeline DAG).
  • alignment_results:
    • alignment: outputs of the different aligners. Are the outputs always named alignment_results/alignment/<tool>/<sample>_seqkit_trim_sorted.bam?
    • fastp: ?
    • mean_read_length: ?
    • MultiQC: ?
    • seqkit_score: ?
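
To make the naming question concrete, here is a minimal sketch that checks BAM paths against the pattern alignment_results/alignment/<tool>/<sample>_seqkit_trim_sorted.bam. The example paths are copied from the tree output above; the function names `split_by_pattern` and `find_bams` are hypothetical helpers written for this illustration and are not part of AliNe.

```python
import re
from pathlib import Path

# Naming pattern asked about above:
# alignment_results/alignment/<tool>/<sample>_seqkit_trim_sorted.bam
PATTERN = re.compile(
    r"^alignment_results/alignment/[^/]+/[^/]+_seqkit_trim_sorted\.bam$"
)


def split_by_pattern(paths):
    """Split relative paths into (matching, deviating) lists."""
    matching, deviating = [], []
    for path in paths:
        (matching if PATTERN.match(path) else deviating).append(path)
    return matching, deviating


def find_bams(outdir):
    """List BAM files under outdir as paths relative to it (directories skipped)."""
    root = Path(outdir)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*.bam") if p.is_file()
    )


# A few BAM paths copied from the tree output above.
EXAMPLE = [
    "alignment_results/alignment/bbmap/yeast_R1_seqkit_trim_sorted.bam",
    "alignment_results/alignment/bowtie2/yeast_R1_seqkit_trim_bowtie2_sorted.bam",
    "alignment_results/alignment/bwa/bwamem/yeast_R1_seqkit_trim_bwamem_sorted.bam",
    "alignment_results/alignment/hisat2/yeast_R1_seqkit_trim_sorted.bam",
    "alignment_results/alignment/star/yeast_R1_seqkit_trimAligned.sortedByCoord.out.bam",
]

matching, deviating = split_by_pattern(EXAMPLE)
print("match the pattern:", matching)
print("deviate from it:  ", deviating)
```

On these five paths, only the bbmap and hisat2 files follow the pattern; bowtie2 adds the tool name to the file name, bwa adds a directory level, and star uses its own suffix. Running `split_by_pattern(find_bams("run2"))` would apply the same check to the whole output directory.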

The manuscript currently mentions:

These aligners are integrated into a single, easy-to-use workflow, providing a unified entry point for any project requiring alignment and giving users the flexibility to choose the best tool for their specific data and objectives. AliNe is designed to minimize user inputs and avoid common parameter mistakes ( e.g. scoring system, strandedness, orientation).

By integrating multiple aligners into a unified pipeline, AliNe simplifies the process of alignment algorithm selection and establishes a standardized workflow that ensures both reproducibility and transparency across analyses


The manuscript also mentions:

Moreover, AliNe can be integrated as a sub-workflow into other Nextflow pipeline and provide a standardize step that offer users a wider choice of alignment tools.

@Juke34
Owner

Juke34 commented Jan 23, 2025

The latest release points to an old commit in which tophat was still present. As soon as I have addressed the remaining concerns, I will update the release accordingly.
I will answer the different problems you raised in separate issues:

@Juke34
Owner

Juke34 commented Jan 28, 2025

I have now updated the release with all the mentioned changes, and I hope it addresses your concerns.
