6 changes: 5 additions & 1 deletion utils/nextflow/meta.yaml
@@ -5,4 +5,8 @@ authors:
 notes: |
   This wrapper can e.g. be used to run `nf-core <https://nf-co.re>`_ pipelines.
   In each of the nf-core pipeline descriptions, you will find available parameters and the output file structure (under "aws results").
-  The latter can be used to set the desired output files for this wrapper.
+  The latter can be used to set the desired output files for this wrapper.
+params:
+  launch_dir: |
+    Allows adjusting the directory from which nextflow is launched.
+    Nextflow itself does not allow doing so; it always sets launchDir (read-only) from where it was invoked.
7 changes: 5 additions & 2 deletions utils/nextflow/test/Snakefile
@@ -44,15 +44,18 @@ rule chipseq_pipeline:
         gtf="data/genome.gtf",
         # any --<argname> pipeline file arguments can be given here as <argname>=<path>
     output:
-        "results/multiqc/broadPeak/multiqc_report.html",
+        multiqc_report="results/multiqc/broadPeak/multiqc_report.html",
+        # directory from which nextflow is launched, will contain the `.nextflow` directory
+        launch_dir=directory("some_directory"),
Comment on lines +47 to +49

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Semantically, launch_dir should not be an output.

Declaring launch_dir as a directory() output is semantically incorrect. Snakemake outputs are artifacts produced by the rule, but launch_dir is merely a working directory where Nextflow runs. The .nextflow directory created inside it is the actual artifact.

Additionally, since the wrapper changes the working directory to launch_dir before running Nextflow, all relative paths (inputs, outputs, and the outdir param) will be resolved relative to some_directory instead of the Snakemake working directory. This means:

  • output.multiqc_report declared as "results/multiqc/broadPeak/multiqc_report.html" will be created at "some_directory/results/multiqc/broadPeak/multiqc_report.html"
  • Snakemake will look for it at "results/multiqc/broadPeak/multiqc_report.html" (relative to the Snakemake working directory)
  • The test will fail with a missing output error
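The mismatch described above can be reproduced without Snakemake or Nextflow. The following is a minimal sketch, assuming a temporary directory stands in for the Snakemake working directory; `resolved_after_chdir` is a hypothetical helper written for this illustration and is not part of the wrapper.

```python
import os
import tempfile
from pathlib import Path


def resolved_after_chdir(relative_path: str, launch_dir: str) -> Path:
    """Return where relative_path points once the process has changed into launch_dir."""
    previous = os.getcwd()
    os.chdir(launch_dir)
    try:
        # resolve() anchors the relative path at the current working directory
        return Path(relative_path).resolve()
    finally:
        os.chdir(previous)


with tempfile.TemporaryDirectory() as project_root:
    # realpath avoids spurious differences when the temp dir is behind a symlink
    project_root = os.path.realpath(project_root)
    launch_dir = os.path.join(project_root, "some_directory")
    os.makedirs(launch_dir)

    report = "results/multiqc/broadPeak/multiqc_report.html"
    expected_by_snakemake = Path(project_root, report)
    written_by_nextflow = resolved_after_chdir(report, launch_dir)

    print("Snakemake looks at:", expected_by_snakemake)
    print("Nextflow writes to:", written_by_nextflow)
```

The two printed locations differ by the `some_directory` component, which is the source of the missing output error.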

Consider these alternatives:

Option 1: Pre-create the directory and pass it as a param

 output:
     multiqc_report="results/multiqc/broadPeak/multiqc_report.html",
-    # directory from which nextflow is launched, will contain the `.nextflow` directory
-    launch_dir=directory("some_directory"),
 params:
     pipeline="nf-core/chipseq",
     revision="2.0.0",
     profile=["test", "docker"],
-    launch_dir=lambda wildcards, output: output.launch_dir,
+    launch_dir="some_directory",

Then add a rule to create the directory before running the test:

rule prepare_launch_dir:
    output:
        directory("some_directory")
    shell:
        "mkdir -p {output}"

Option 2: Make the wrapper convert paths to absolute before chdir

The wrapper should resolve all Snakemake input/output/params paths to absolute paths before changing directories, ensuring they remain valid after the chdir.

Run this script to verify whether the current implementation handles path resolution correctly:


🏁 Script executed:

#!/bin/bash
# Description: Check if the wrapper implementation handles path resolution after chdir

# Check if wrapper.py converts paths to absolute before chdir
rg -A 10 -B 5 'chdir.*launch_dir|os\.chdir' utils/nextflow/wrapper.py

Length of output: 263


Remove launch_dir from outputs and resolve paths before chdir

  • launch_dir is not a produced artifact and must be a param (or pre-created by a separate rule), not a directory() output.
  • The wrapper calls os.chdir(launch_dir) without converting Snakemake inputs/outputs (e.g. "results/…/multiqc_report.html") to absolute paths, so the rule will write files under some_directory/… and Snakemake won’t find them.
  • Fix by either:
    1. Pre-creating some_directory in a preparatory rule and passing it as a simple param, or
    2. Modifying the wrapper to wrap every Snakemake path with os.path.abspath(...) before performing os.chdir.
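
As an illustration of the second fix, here is a minimal sketch of resolving paths before the directory change. It assumes POSIX-style paths; `to_absolute` is a hypothetical helper, the `output` dict is a stand-in for `snakemake.output`, and `/project` is an assumed Snakemake working directory. None of these names come from the wrapper itself.

```python
import os


def to_absolute(value, base=None):
    """Make relative path(s) absolute against base (default: current directory)."""
    base = base or os.getcwd()
    if isinstance(value, (list, tuple)):
        # Snakemake inputs may be lists of files; convert each entry
        return [to_absolute(v, base) for v in value]
    # os.path.join leaves already-absolute paths unchanged
    return os.path.normpath(os.path.join(base, str(value)))


# Hypothetical stand-in for snakemake.output
output = {"multiqc_report": "results/multiqc/broadPeak/multiqc_report.html"}

# Resolve everything while the process is still in the Snakemake working directory
absolute_output = {name: to_absolute(path, "/project") for name, path in output.items()}
print(absolute_output["multiqc_report"])
```

In the wrapper, a conversion like this would have to run before `os.chdir(launch_dir)`, because afterwards the current directory no longer matches the directory Snakemake uses to track outputs.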
🤖 Prompt for AI Agents
In utils/nextflow/test/Snakefile around lines 47 to 49, the rule incorrectly
declares launch_dir as an output directory() while the wrapper treats it as a
runtime working directory and calls os.chdir(launch_dir), which causes
downstream output paths to be written under that directory and not found by
Snakemake; fix by removing launch_dir from the outputs and either (A) pre-create
"some_directory" in a separate preparatory rule and pass its path into this rule
as a simple param, or (B) change the wrapper to convert every Snakemake
input/output path to an absolute path (os.path.abspath(...)) before calling
os.chdir so Snakemake can still track outputs correctly.

     params:
         pipeline="nf-core/chipseq",
         revision="2.0.0",
         profile=["test", "docker"],
+        launch_dir=lambda wildcards, output: output.launch_dir,
         # The chosen pipeline expects an --outdir to be given.
         # We infer this from the output file path. Since that file path can be changed
         # e.g. in case of cloud storage, we use a lambda function to infer the outdir.
-        outdir=lambda wildcards, output: str(Path(output[0]).parents[-2]),
+        outdir=lambda wildcards, output: str(Path(output.multiqc_report).parents[-2]),

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the outdir calculation after path resolution changes.

The calculation .parents[-2] correctly extracts "results" from the relative output path. However, once the wrapper changes into launch_dir, Nextflow will resolve this relative value against launch_dir rather than the Snakemake working directory. Ensure that either:

  1. The wrapper converts this param to an absolute path before chdir, or
  2. The path is explicitly made relative to the new working directory

Run this script to check how the wrapper handles params that contain paths:


🏁 Script executed:

#!/bin/bash
# Description: Check if outdir and other path params are handled correctly after chdir

# Look for param handling around the chdir operation
rg -A 15 -B 5 'snakemake\.params' utils/nextflow/wrapper.py

Length of output: 1277


🏁 Script executed:

#!/bin/bash
# Locate add_parameter definition and inspect its behavior
rg -n 'def add_parameter' -A5 utils/nextflow/wrapper.py

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Search for add_parameter definition and usage across the repo
rg -n "add_parameter" -n .

Length of output: 237


Make outdir absolute before changing into launch_dir.
add_parameter appends paths verbatim and the wrapper calls os.chdir(launch_dir) before invoking Nextflow, so a relative outdir (e.g. from parents[-2]) will be resolved against launch_dir instead of the project root. Convert outdir (and any other path params) to an absolute path (e.g. via os.path.abspath(value) or Path(value).resolve()) before the os.chdir in utils/nextflow/wrapper.py.
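
A minimal sketch of that conversion follows, under several assumptions: POSIX-style paths, an explicit hypothetical set `PATH_PARAMS` naming the params to treat as paths (the wrapper cannot tell a path from a plain value), `/project` as the Snakemake working directory, and a `--name value` argument format that approximates, but is not copied from, `add_parameter`.

```python
import os

# Params consumed by the wrapper itself, taken from the diff in this PR
SKIPPED = {"pipeline", "revision", "profile", "extra", "launch_dir"}
# Assumption: names of params that hold filesystem paths (hypothetical)
PATH_PARAMS = {"outdir"}


def build_args(params, project_root):
    """Build pipeline arguments, making path params absolute against project_root."""
    args = []
    for name, value in params.items():
        if name in SKIPPED:
            continue
        # Leave URL-like values (e.g. cloud storage locations) untouched
        if name in PATH_PARAMS and "://" not in str(value):
            value = os.path.normpath(os.path.join(project_root, str(value)))
        args.append(f"--{name} {value}")
    return args


# Hypothetical stand-in for snakemake.params
params = {
    "pipeline": "nf-core/chipseq",
    "revision": "2.0.0",
    "profile": ["test", "docker"],
    "launch_dir": "some_directory",
    "outdir": "results",
}
args = build_args(params, "/project")
print(args)  # ['--outdir /project/results']
```

Building the arguments this way before `os.chdir(launch_dir)` keeps `--outdir` pointing at the project root regardless of where Nextflow is launched.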

🤖 Prompt for AI Agents
In utils/nextflow/test/Snakefile around line 58, outdir is set via parents[-2]
which produces a relative path; because utils/nextflow/wrapper.py calls
os.chdir(launch_dir) before running Nextflow, relative path params will be
resolved incorrectly. Modify utils/nextflow/wrapper.py to convert outdir and any
other path-like parameters to absolute paths (e.g., os.path.abspath(value) or
Path(value).resolve()) when add_parameter is building the parameter dict or
immediately before os.chdir, so all path params are absolute prior to changing
the working directory.

         # any --<argname> pipeline arguments can be given here as <argname>=<value>
     handover: True
     wrapper:
14 changes: 8 additions & 6 deletions utils/nextflow/wrapper.py
@@ -32,16 +32,18 @@
         files = ",".join(files)
     add_parameter(name, files)
 for name, value in snakemake.params.items():
-    if (
-        name != "pipeline"
-        and name != "revision"
-        and name != "profile"
-        and name != "extra"
-    ):
+    if name not in {"pipeline", "revision", "profile", "extra", "launch_dir"}:
         add_parameter(name, value)
 
 log = snakemake.log_fmt_shell(stdout=False, stderr=True)
 args = " ".join(args)
 pipeline = snakemake.params.pipeline
 
+launch_dir = snakemake.params.get("launch_dir")
+if launch_dir:
+    if not os.path.isdir(launch_dir):
+        raise ValueError(f"launch_dir does not exist: {launch_dir}")
+    else:
+        os.chdir(launch_dir)
+
 shell("nextflow run {pipeline} {args} {extra} {log}")