Add a feature to generate samplesheet for differentialabundance #546

Draft. Wants to merge 6 commits into base: dev. Showing changes from 5 commits.

14 changes: 12 additions & 2 deletions conf/modules.config
@@ -856,9 +856,14 @@ process {
  ].join(' ').trim()
  }
  publishDir = [
- path: { "${params.outdir}/taxpasta/" },
+ [ path: { "${params.outdir}/taxpasta/" },
  mode: params.publish_dir_mode,
  pattern: '*.{tsv,csv,arrow,parquet,biom}'
+ ],
+ [ path: { "${params.outdir}/downstream_samplesheets/differentialabundance/"},
+ mode: params.publish_dir_mode,

Review comment (Member): These should only be enabled if the --generate_samplesheet_xxxx parameters are used!

+ pattern: '*.{tsv,csv,arrow,parquet,biom}'
+ ]
  ]
  }

@@ -875,9 +880,14 @@ process {
  ].join(' ').trim()
  }
  publishDir = [
- path: { "${params.outdir}/taxpasta/" },
+ [ path: { "${params.outdir}/taxpasta/" },
  mode: params.publish_dir_mode,
  pattern: '*.{tsv,csv,arrow,parquet,biom}'
+ ],
+ [ path: { "${params.outdir}/downstream_samplesheets/differentialabundance/" },
+ mode: params.publish_dir_mode,
+ pattern: '*.{tsv,csv,arrow,parquet,biom}'
+ ]
  ]
  }
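
One way the review request on the downstream `publishDir` entry could be met is with the `enabled` option of `publishDir`. The fragment below is a minimal sketch, not a change taken from this PR: it assumes the entry stays inside the existing process selector block in conf/modules.config, and that gating on `params.generate_downstream_samplesheets` alone is sufficient.

```groovy
// Sketch only: publish to the downstream directory only when requested.
// `enabled` is a standard publishDir option; the condition is an assumption.
publishDir = [
    [
        path: { "${params.outdir}/taxpasta/" },
        mode: params.publish_dir_mode,
        pattern: '*.{tsv,csv,arrow,parquet,biom}'
    ],
    [
        path: { "${params.outdir}/downstream_samplesheets/differentialabundance/" },
        mode: params.publish_dir_mode,
        pattern: '*.{tsv,csv,arrow,parquet,biom}',
        // Skip this entry unless downstream samplesheet generation is on
        enabled: params.generate_downstream_samplesheets
    ]
]
```

A stricter condition could also check that `params.generate_pipeline_samplesheets` names `differentialabundance`, keeping in mind that this parameter defaults to `null`.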

4 changes: 4 additions & 0 deletions conf/test.config
@@ -48,6 +48,10 @@ params {
kraken2_save_reads = true
centrifuge_save_reads = true
run_profile_standardisation = true

// Generate downstream samplesheets
generate_downstream_samplesheets = true
generate_pipeline_samplesheets = 'differentialabundance'
}

process {
4 changes: 4 additions & 0 deletions conf/test_nothing.config
@@ -41,6 +41,10 @@ params {
run_motus = false
run_kmcp = false
run_ganon = false

// Generate downstream samplesheets
generate_downstream_samplesheets = true
generate_pipeline_samplesheets = 'differentialabundance'
}

process {
26 changes: 26 additions & 0 deletions docs/output.md
@@ -744,3 +744,29 @@ For example, DIAMOND output does not have a dedicated section in the MultiQC HTM
</details>

[Nextflow](https://www.nextflow.io/docs/latest/tracing.html) provides excellent functionality for generating various reports relevant to the running and execution of the pipeline. This will allow you to troubleshoot errors with the running of the pipeline, and also provide you with other information such as launch commands, run times and resource usage.

### Downstream samplesheets

The pipeline can also generate input files for the following downstream
pipelines:

- [nf-core/differentialabundance](https://nf-co.re/differentialabundance)

<details markdown="1">
<summary>Output files</summary>

- `downstream_samplesheets/`

- `samplesheet.csv`: A samplesheet for each classifier.
- `matrix.csv`: The output from taxpasta.

Review comment (Member, on lines +763 to +764): I think you should also copy the note below about the missing conditions here.

</details>

To generate the downstream samplesheets for `nf-core/differentialabundance`, you need to turn on `--run_profile_standardisation`. The pipeline only generates the `samplesheet` and `matrix` inputs. You will need to manually prepare the `contrast` table before running `nf-core/differentialabundance`.

`samplesheet.csv` includes sample IDs from the `taxpasta` output for each classifier. You will need to specify the conditions (the groups you want to compare) as desired.

:::warning
Any generated downstream samplesheet is provided as 'best effort' and is not guaranteed to work straight out of the box!
They may not be complete (e.g. some columns may need to be manually filled in).
:::
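
To illustrate the manual step described above (adding conditions to the generated `samplesheet.csv`), here is a small plain-Groovy sketch. The sample IDs, the `condition` column name, and the ID-to-condition mapping are all hypothetical; the column you actually need depends on the contrasts you plan to run.

```groovy
// Hypothetical generated samplesheet: a single `sample` column.
def samplesheet = '''sample
sampleA
sampleB
sampleC'''

// Hypothetical mapping from sample ID to the group it belongs to.
def conditions = [sampleA: 'control', sampleB: 'treated', sampleC: 'treated']

def lines = samplesheet.readLines()

// Extend the header, then append the condition to every sample row.
def header = lines.first() + ',condition'
def rows = lines.drop(1).collect { id ->
    def condition = conditions[id]
    assert condition != null : "No condition defined for sample ${id}"
    "${id},${condition}".toString()
}

def completed = ([header] + rows).join('\n')
println completed
```

Failing loudly on a sample without a condition is a deliberate choice here, since a silently empty cell would only surface later in the downstream pipeline.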
4 changes: 4 additions & 0 deletions nextflow.config
@@ -196,6 +196,10 @@ params {
taxpasta_add_ranklineage = false
taxpasta_ignore_errors = false
standardisation_motus_generatebiom = false

// Generate downstream samplesheet
generate_downstream_samplesheets = false
generate_pipeline_samplesheets = null
}

// Load base.config by default for all pipelines
19 changes: 19 additions & 0 deletions nextflow_schema.json
@@ -712,6 +712,25 @@
},
"fa_icon": "fas fa-chart-line"
},
"generate_samplesheet_options": {
"title": "Downstream pipeline samplesheet generation options",
"type": "object",
"fa_icon": "fas fa-align-justify",
"description": "Options for generating input samplesheets for complementary downstream pipelines.",
"properties": {
"generate_downstream_samplesheets": {
"type": "boolean",
"description": "Turn on generation of samplesheets for downstream pipelines.",
"fa_icon": "fas fa-toggle-on"
},
"generate_pipeline_samplesheets": {
"type": "string",
"description": "Specify which pipeline to generate a samplesheet for.",
"fa_icon": "fas fa-toolbox",
"pattern": "^(differentialabundance)(?:,(differentialabundance)){0,1}"
}
}
},
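
As a quick check of what the `pattern` for `generate_pipeline_samplesheets` accepts, the plain-Groovy sketch below tests a few values. It assumes the validator follows the JSON Schema convention that patterns are not implicitly anchored (modelled here with `find()`); the helper name `accepts` and the test values are made up for illustration.

```groovy
import java.util.regex.Pattern

// Returns true if `value` contains a match for `pattern`.
// find() is used to mimic unanchored JSON Schema `pattern` matching (assumption).
boolean accepts(String pattern, String value) {
    return Pattern.compile(pattern).matcher(value).find()
}

// Pattern copied from the schema entry above.
def schemaPattern = '^(differentialabundance)(?:,(differentialabundance)){0,1}'
// Hypothetical variant with an end anchor added.
def anchoredPattern = schemaPattern + '$'

// 'unknownpipeline' is a made-up name used only to probe the pattern.
['differentialabundance', 'differentialabundance,unknownpipeline'].each { value ->
    println "${value}: schema=${accepts(schemaPattern, value)}, anchored=${accepts(anchoredPattern, value)}"
}
```

Under that assumption, the pattern as written would let a value with an unrecognised trailing pipeline name through, whereas the end-anchored variant rejects it.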
"institutional_config_options": {
"title": "Institutional config options",
"type": "object",
52 changes: 52 additions & 0 deletions subworkflows/local/generate_downstream_samplesheet/main.nf
@@ -0,0 +1,52 @@
//
// Subworkflow with functionality specific to the nf-core/createtaxdb pipeline
//

workflow SAMPLESHEET_DIFFERENTIALABUNDANCE {
take:
ch_taxpasta

main:
format_sep = '\t'

ch_taxpasta.map { it ->
def tool_name = it[0]['tool']
def id = it[0]['id']
def file_path = it[1]
def samplesheet_name = file(file_path).getName()

ch_list_for_samplesheet = Channel
.fromPath(file_path)
.splitCsv(sep: format_sep)
.map { row -> row.drop(1) }
.flatten()

ch_colnames = Channel.of('sample')

channelToSamplesheet(ch_colnames, ch_list_for_samplesheet, "downstream_samplesheets/differentialabundance", "samplesheet_${samplesheet_name}" )

Review comment (Member): Please see the latest function as in detaxizer! It's tweaked again (sorry about that!)

}
}

workflow GENERATE_DOWNSTREAM_SAMPLESHEETS {
take:
ch_taxpasta

main:
def downstreampipeline_names = params.generate_pipeline_samplesheets.split(",")

if ( downstreampipeline_names.contains('differentialabundance')) {
SAMPLESHEET_DIFFERENTIALABUNDANCE(ch_taxpasta)
}
}

def channelToSamplesheet(ch_header, ch_list_for_samplesheet, outdir_subdir, samplesheet_name) {
// Constructs the header string and then the strings of each row, and
// finally concatenates for saving. Originally designed by @mahesh-panchal
ch_header
.concat(ch_list_for_samplesheet)
.collectFile(
name: "${params.outdir}/${outdir_subdir}/${samplesheet_name}",
newLine: true,
sort: false
)
}
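
The subworkflow above builds new channels inside a `map` closure, which generally does not behave as intended in Nextflow. As a hedged alternative, the sample IDs can be derived from the table header with ordinary Groovy. The sketch below is neither the PR's code nor the detaxizer helper mentioned in the review; the function name is hypothetical, and it assumes a tab-separated table whose first column holds the taxon identifier and whose remaining header columns are sample IDs.

```groovy
// Sketch: turn the header of a taxpasta-style table into a one-column samplesheet.
// Assumption: tab-separated, first header column is the taxon identifier,
// all remaining header columns are sample IDs.
String samplesheetFromHeader(String profileText) {
    def header = profileText.readLines().first()
    // Drop the taxon identifier column, keep the sample columns
    def sampleIds = header.split('\t').drop(1).toList()
    return (['sample'] + sampleIds).join('\n') + '\n'
}

// Hypothetical two-sample profile used only for demonstration.
def example = 'taxonomy_id\tsampleA\tsampleB\n562\t10\t0\n1280\t3\t7\n'
print samplesheetFromHeader(example)
```

Inside the subworkflow, a function like this could be called from a single operator closure on `ch_taxpasta` (on the text of the file in `it[1]`), so that no channel is created within the closure.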
23 changes: 14 additions & 9 deletions workflows/taxprofiler.nf
@@ -63,15 +63,16 @@ if ( [params.taxpasta_add_name, params.taxpasta_add_rank, params.taxpasta_add_li
  // SUBWORKFLOW: Consisting of a mix of local and nf-core/modules
  //

- include { SHORTREAD_PREPROCESSING } from '../subworkflows/local/shortread_preprocessing'
- include { NONPAREIL } from '../subworkflows/local/nonpareil'
- include { LONGREAD_PREPROCESSING } from '../subworkflows/local/longread_preprocessing'
- include { SHORTREAD_HOSTREMOVAL } from '../subworkflows/local/shortread_hostremoval'
- include { LONGREAD_HOSTREMOVAL } from '../subworkflows/local/longread_hostremoval'
- include { SHORTREAD_COMPLEXITYFILTERING } from '../subworkflows/local/shortread_complexityfiltering'
- include { PROFILING } from '../subworkflows/local/profiling'
- include { VISUALIZATION_KRONA } from '../subworkflows/local/visualization_krona'
- include { STANDARDISATION_PROFILES } from '../subworkflows/local/standardisation_profiles'
+ include { SHORTREAD_PREPROCESSING } from '../subworkflows/local/shortread_preprocessing'
+ include { NONPAREIL } from '../subworkflows/local/nonpareil'
+ include { LONGREAD_PREPROCESSING } from '../subworkflows/local/longread_preprocessing'
+ include { SHORTREAD_HOSTREMOVAL } from '../subworkflows/local/shortread_hostremoval'
+ include { LONGREAD_HOSTREMOVAL } from '../subworkflows/local/longread_hostremoval'
+ include { SHORTREAD_COMPLEXITYFILTERING } from '../subworkflows/local/shortread_complexityfiltering'
+ include { PROFILING } from '../subworkflows/local/profiling'
+ include { VISUALIZATION_KRONA } from '../subworkflows/local/visualization_krona'
+ include { STANDARDISATION_PROFILES } from '../subworkflows/local/standardisation_profiles'
+ include { GENERATE_DOWNSTREAM_SAMPLESHEETS } from '../subworkflows/local/generate_downstream_samplesheet/main.nf'

/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -324,6 +325,10 @@ workflow TAXPROFILER {
  ch_versions = ch_versions.mix( STANDARDISATION_PROFILES.out.versions )
  }

+ if ( params.generate_downstream_samplesheets ) {
+ GENERATE_DOWNSTREAM_SAMPLESHEETS ( STANDARDISATION_PROFILES.out.taxpasta)
+ }
+
  /*
  MODULE: MultiQC
  */