Add a feature to generate samplesheet for differentialabundance #546
base: dev
Changes from 5 commits
7a3b773
1d3198f
946ebb6
d05dd33
86276b0
51270f4
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -744,3 +744,29 @@ For example, DIAMOND output does not have a dedicated section in the MultiQC HTM | |
</details> | ||
|
||
[Nextflow](https://www.nextflow.io/docs/latest/tracing.html) provides excellent functionality for generating various reports relevant to the running and execution of the pipeline. This will allow you to troubleshoot errors with the running of the pipeline, and also provide you with other information such as launch commands, run times and resource usage. | ||
|
||
### Downstream samplesheets | ||
|
||
The pipeline can also generate input files for the following downstream | ||
pipelines: | ||
|
||
- [nf-core/differentialabundance](https://nf-co.re/differentialabundance) | ||
|
||
<details markdown="1"> | ||
<summary>Output files</summary> | ||
|
||
- `downstream_samplesheets/` | ||
|
||
- `samplesheet.csv`: A samplesheet for each classifier. | ||
- `matrix.csv`: The output from taxpasta. | ||
Review comment on lines +763 to +764:
I think you should also copy here the note below about the missing conditions.
||
|
||
</details> | ||
|
||
To generate the downstream samplesheets for `nf-core/differentialabundance`, you need to enable `--run_profile_standardisation`. The pipeline generates only the `samplesheet` and `matrix` inputs; you must prepare the `contrast` table manually before running `nf-core/differentialabundance`. | ||
|
||
`samplesheet.csv` includes sample IDs from the `taxpasta` output for each classifier. You will need to specify the conditions (the groups you want to compare) as desired. | ||
|
||
:::warning | ||
Any generated downstream samplesheet is provided on a 'best effort' basis and is not guaranteed to work straight out of the box! | ||
They may not be complete (e.g. some columns may need to be manually filled in). | ||
::: |
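The docs above say the conditions have to be filled in by hand. As a rough illustration of what that step amounts to, here is a minimal plain-Groovy sketch that appends a grouping column to the one-column samplesheet. The helper name `addConditions`, the column name `condition`, and the sample and group names are all hypothetical and not part of this PR; the column can be named whatever the `contrast` table will later refer to.

```groovy
// Hypothetical helper (not part of the pipeline): add a grouping column to
// the generated one-column samplesheet. 'condition' is an assumed column name.
def addConditions(List<String> lines, Map<String, String> conditions) {
    def header = lines.head() + ',condition'
    def rows = lines.tail()
        .collect { it.trim() }
        .findAll { it }                      // skip blank lines
        .collect { sample ->
            def condition = conditions[sample]
            assert condition != null : "No condition given for sample '${sample}'"
            sample + ',' + condition
        }
    return ([header] + rows).join('\n') + '\n'
}

// Usage with made-up sample IDs and groups
def sheet = addConditions(
    ['sample', 'sample_1', 'sample_2'],
    [sample_1: 'control', sample_2: 'treated']
)
assert sheet == 'sample,condition\nsample_1,control\nsample_2,treated\n'
print sheet
```

In practice most users would probably just edit the CSV in a spreadsheet or text editor; the sketch only makes the expected shape explicit.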
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
// | ||
// Subworkflow with functionality specific to the nf-core/createtaxdb pipeline | ||
// | ||
|
||
workflow SAMPLESHEET_DIFFERENTIALABUNDANCE { | ||
take: | ||
ch_taxpasta | ||
|
||
main: | ||
format_sep = '\t' | ||
|
||
ch_taxpasta.map { it -> | ||
def tool_name = it[0]['tool'] | ||
def id = it[0]['id'] | ||
def file_path = it[1] | ||
def samplesheet_name = file(file_path).getName() | ||
|
||
ch_list_for_samplesheet = Channel | ||
.fromPath(file_path) | ||
.splitCsv(sep: format_sep) | ||
.map { row -> row.drop(1) } | ||
.flatten() | ||
|
||
ch_colnames = Channel.of('sample') | ||
|
||
channelToSamplesheet(ch_colnames, ch_list_for_samplesheet, "downstream_samplesheets/differentialabundance", "samplesheet_${samplesheet_name}" ) | ||
Review comment: Please see the latest function as in detaxizer! It's tweaked again (sorry about that!)
||
} | ||
} | ||
|
||
workflow GENERATE_DOWNSTREAM_SAMPLESHEETS { | ||
take: | ||
ch_taxpasta | ||
|
||
main: | ||
def downstreampipeline_names = params.generate_pipeline_samplesheets.split(",") | ||
|
||
if ( downstreampipeline_names.contains('differentialabundance')) { | ||
SAMPLESHEET_DIFFERENTIALABUNDANCE(ch_taxpasta) | ||
} | ||
} | ||
|
||
def channelToSamplesheet(ch_header, ch_list_for_samplesheet, outdir_subdir, samplesheet_name) { | ||
// Constructs the header string and then the strings of each row, and | ||
// finally concatenates for saving. Originally designed by @mahesh-panchal | ||
ch_header | ||
.concat(ch_list_for_samplesheet) | ||
.collectFile( | ||
name: "${params.outdir}/${outdir_subdir}/${samplesheet_name}", | ||
newLine: true, | ||
sort: false | ||
) | ||
} |
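Two things in `SAMPLESHEET_DIFFERENTIALABUNDANCE` above look fragile: channel factories (`Channel.fromPath`, `Channel.of`) are called inside a `map` closure, and `row.drop(1)` followed by `flatten()` emits the cells of every row, not just the header. One possible alternative, offered as a sketch rather than the intended implementation, reads only the header line and lets `collectFile` write one samplesheet per taxpasta table. It assumes that `ch_taxpasta` emits `[ meta, tsv ]` pairs and that the first line of each table is a tab-separated header whose first column is the taxonomy ID and whose remaining columns are sample names.

```groovy
// Sketch only, under the assumptions stated above; untested against the pipeline.
workflow SAMPLESHEET_DIFFERENTIALABUNDANCE {
    take:
    ch_taxpasta // assumed shape: [ meta, tsv ]

    main:
    ch_samplesheets = ch_taxpasta
        .collectFile(storeDir: "${params.outdir}/downstream_samplesheets/differentialabundance") { item ->
            def tsv     = item[1]
            // Read only the header line; fall back to an empty string for an empty file
            def header  = tsv.withReader { reader -> reader.readLine() } ?: ''
            // Drop the first column (taxonomy ID), keep the sample names
            def samples = header.split('\t').drop(1) as List
            // collectFile accepts a [ file name, content ] pair from the closure
            [ "samplesheet_${tsv.baseName}.csv", (['sample'] + samples).join('\n') + '\n' ]
        }

    emit:
    samplesheets = ch_samplesheets
}
```

This keeps all channel logic in operators and avoids the separate `channelToSamplesheet` helper. If the detaxizer version of that helper mentioned in the review is preferred, the header-only extraction can still be reused with it.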
Review comment: These should only be enabled if the `--generate_samplesheet_xxxx` parameters are used!
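Related to the comment above: `params.generate_pipeline_samplesheets.split(",")` in `GENERATE_DOWNSTREAM_SAMPLESHEETS` throws if the parameter is unset, and it does not tolerate spaces after commas. A small null-safe parser, sketched here under the assumption that the parameter is a comma-separated string, makes the guard behave sensibly when the option is not used. The helper name `requestedPipelines` and the second pipeline name in the assertions are hypothetical.

```groovy
// Hypothetical helper: turn the comma-separated parameter value into a clean list.
def requestedPipelines(value) {
    return (value ?: '').toString()
        .tokenize(',')
        .collect { it.trim() }
        .findAll { it }
}

assert requestedPipelines(null) == []
assert requestedPipelines('differentialabundance') == ['differentialabundance']
assert requestedPipelines('differentialabundance, other') == ['differentialabundance', 'other']

// Inside the workflow body, the guard could then read:
// if ('differentialabundance' in requestedPipelines(params.generate_pipeline_samplesheets)) {
//     SAMPLESHEET_DIFFERENTIALABUNDANCE(ch_taxpasta)
// }
```

With this in place, leaving the parameter unset yields an empty list and no samplesheet subworkflow runs.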