Conversation
Changed transcript handling from csv to parquet
Update metromap
Couple of fixes
…dule binary Move the inline python heredoc that casts a segmentation mask to uint32 into resources/usr/bin/convert_mask_uint32.py. Containers stay general; pipeline-specific logic lives in module binaries on PATH via the resources/usr/bin/ mount.
…e binary Move the inline python heredoc that slices a single channel from a multi-channel OME-TIFF into resources/usr/bin/extract_dapi.py. Containers stay general; pipeline-specific logic lives in module binaries on PATH via the resources/usr/bin/ mount.
… with tar.gz test data The format: directory-path / file-path validators added in d59fac3 break CI because: - bundle field accepts EITHER a directory path (real Xenium bundle) OR a .tar.gz archive URL (test data, extracted at runtime by UNTAR). Neither 'directory-path' nor 'file-path' fits both, and both fail against remote URLs. - image field is similar — can be a local OME-TIFF path, a URL, or empty. The reviewer's suggestion in r3165291219 / r3165292878 was reasonable in isolation but conflicts with this pipeline's dual-purpose input semantics where the same field can be a path or a URL or an archive. Reverts the format keys; keeps the type / pattern / errorMessage entries.
…module binary Move the multi-line python -c block that downscales a morphology image and writes scale_info.json into resources/usr/bin/downscale_morphology.py. Containers stay general; pipeline-specific logic lives in module binaries on PATH via the resources/usr/bin/ mount.
Move the multi-line python -c block that upscales a Cellpose mask back to original resolution into resources/usr/bin/upscale_mask.py. Containers stay general; pipeline-specific logic lives in module binaries on PATH via the resources/usr/bin/ mount.
…binary Move the inline python -c that forces every stitched GeoJSON feature to a single Polygon and reconciles dropped cells in the transcript CSV into resources/usr/bin/stitch_postprocess.py. Containers stay general; pipeline-specific logic lives in module binaries on PATH via the resources/usr/bin/ mount.
… with tar.gz test data The format: directory-path / file-path validators added in d59fac3 break CI because: - bundle field accepts EITHER a directory path (real Xenium bundle) OR a .tar.gz archive URL (test data, extracted at runtime by UNTAR). Neither 'directory-path' nor 'file-path' fits both, and both fail against remote URLs. - image field is similar — can be a local OME-TIFF path, a URL, or empty. The reviewer's suggestion in r3165291219 / r3165292878 was reasonable in isolation but conflicts with this pipeline's dual-purpose input semantics where the same field can be a path or a URL or an archive. Reverts the format keys; keeps the type / pattern / errorMessage entries.
…cross 13 local subworkflows Extends PR #150's SPATIALXE-level params extraction down to all local subworkflows. After this commit, params.* is read only in the entry workflow (main.nf). workflows/spatialxe.nf and all subworkflows/local/*/main.nf declare formal take: blocks; named workflows and helper functions accept their inputs explicitly. 13 subworkflows refactored: utils_nfcore_spatialxe_pipeline (16 sites — validateInputParameters refactored to take 13 explicit args; PIPELINE_INITIALISATION take: block extended; main.nf passes them through) cellpose_baysor_import_segmentation (13) cellpose_resolift_morphology_ome_tif (9) xeniumranger_import_segmentation_redefine_bundle (8) baysor_run_transcripts_parquet (8) baysor_run_prior_segmentation_mask (6) baysor_generate_segfree (6) spatialdata_write_meta_merge (5) baysor_run_transcripts_parquet_tiled (5) stardist_resolift_morphology_ome_tif (2) segger_create_train_predict (2) xeniumranger_resegment_morphology_ome_tif (1) ficture_preprocess_model (1) SPATIALXE workflow's take: block extended from 24 to 40 inputs to cover all params consumed by the subworkflows it calls (alphabetical order). PIPELINE_INITIALISATION's take: block extended from 9 to 21 inputs to plumb validation params from main.nf. main.nf updated to pass the new params from the entry workflow. Closes the params-in-workflow anti-pattern across the whole pipeline. Verified with `nextflow inspect` (parses cleanly) and `nextflow run -profile test -stub-run` (pipeline assembles, parameter validation succeeds).
…dict The heredocs were inadvertently introduced when extracting the inline Python to module binaries. Project convention (CLAUDE.md "Version Reporting") is topic-channel only — no `versions.yml` files. Topic channels (versions_segger, versions_python, etc.) emit the actual versions; the hardcoded `segger: 0.1.0` heredoc was unused and risked going stale. No other modules/local/* writes a versions.yml file. Aligns the segger modules with the rest of the pipeline.
…dict The heredocs were inadvertently introduced when extracting the inline Python to module binaries. Project convention (CLAUDE.md "Version Reporting") is topic-channel only — no `versions.yml` files. Topic channels (versions_segger, versions_python, etc.) emit the actual versions; the hardcoded `segger: 0.1.0` heredoc was unused and risked going stale. No other modules/local/* writes a versions.yml file. Aligns the segger modules with the rest of the pipeline.
…dict The heredocs were inadvertently introduced when extracting the inline Python to module binaries. Project convention (CLAUDE.md "Version Reporting") is topic-channel only — no `versions.yml` files. Topic channels (versions_segger, versions_python, etc.) emit the actual versions; the hardcoded `segger: 0.1.0` heredoc was unused and risked going stale. No other modules/local/* writes a versions.yml file. Aligns the segger modules with the rest of the pipeline.
…dict The heredocs were inadvertently introduced when extracting the inline Python to module binaries. Project convention (CLAUDE.md "Version Reporting") is topic-channel only — no `versions.yml` files. Topic channels (versions_segger, versions_python, etc.) emit the actual versions; the hardcoded `segger: 0.1.0` heredoc was unused and risked going stale. No other modules/local/* writes a versions.yml file. Aligns the segger modules with the rest of the pipeline.
Address PR #139 review: segger refactor + 5 targeted fixes
…e-inputs refactor(workflows/spatialxe): extract params.* to take: inputs
refactor: extract inline Python from 5 modules to module binaries (P1/P2 sweep)
…ake-inputs refactor(subworkflows): extract params.* to take: inputs across 13 subworkflows
bugfixes from previous PRs
Collaborator
Author
|
@awgymer we addressed now most of your comments. I left 2 of them open, where I had the feeling these should be clarified. Should I post the PR on #release-review-trading again, or will you engange with other nf-core members? |
…w templates nf-core core team requires no module-level `bin/` for release approval (PR #139, comment r3185699629 from awgymer). Move every per-module `resources/usr/bin/*.py` into `templates/*.py` and invoke via the `template '<file>.py'` directive in the script: block. - 18 modules: mechanical conversion. Replace argparse with module-top Groovy interpolation (RAW_BUNDLE = "${raw_bundle}", etc.). For modules with task.ext.args (ficture/preprocess, segger/predict, segger/create_dataset, xenium_patch/stitch): inject ARGS = "${args}", build sys.argv via shlex.split, keep existing argparse code path. - segger/create_dataset: NUMBA_CACHE_DIR setup moves from shell prelude into Python BEFORE numba/torch imports (file-level # ruff: noqa: E402 to document the load-order requirement). - xenium_patch/stitch: stitch_transcripts.py + stitch_postprocess.py merged into one templates/stitch.py orchestrator that calls both inline (single-file template constraint of the directive). Containers, inputs, outputs, and stub blocks are unchanged across all 19 modules. Pipeline-level bin/ scripts (divide_transcripts.py, stitch_transcripts.py — used by xenium_patch/divide and utility/reconstruct_patches) are untouched; pipeline-level bin/ remains nf-core compliant.
14 tasks
CI on PR #154 surfaced this hard Nextflow constraint: Process output of type 'eval' is only allowed with Bash process scripts -- Current interpreter: /usr/bin/env python3 The `template '<file>.py'` directive sets the process interpreter to Python from the shebang, but every module emits a version string via an `eval('python3 -c ...')` topic channel — and `eval(...)` outputs only work when the process script body is Bash. All 19 modules failed identically. Fix: keep the Python files in `modules/local/<x>/templates/` (satisfying the no-module-level-bin requirement from PR #139's review) but invoke them from a shell `script:` block via `python3 \${moduleDir}/templates/<file>.py --flag value`. The process stays Bash, `eval()` works, and the original argparse-based scripts are restored verbatim — no constants conversion, no script merging, no moved env preludes. For xenium_patch/stitch this means restoring the two original scripts (stitch_transcripts.py + stitch_postprocess.py) instead of the merged stitch.py. For segger/create_dataset, the NUMBA_CACHE_DIR shell prelude is restored to its original location in the .nf script: block. Net result vs. the previous attempt: same nf-core compliance (no module-level bin/), but invasive Python rewrites are reverted. The original behavior is preserved exactly.
Replaces the templates/-directive approach (which broke under the
Nextflow constraint that `output: eval(...)` channels are bash-only,
incompatible with `template 'foo.py'` setting a Python interpreter via
shebang). Pipeline-level bin/ is the cleanest path: nf-core
auto-prepends bin/ to PATH for every process, scripts resolve by name,
and process scripts stay bash so eval() topic channels keep working.
Per-module changes (19 modules, 20 scripts):
- bin/<module_prefix>_<script>.py — active copies, invoked by name
(e.g. utility_extract_dapi.py).
Renamed with module-path prefix
to avoid collisions with the
pre-existing bin/divide_transcripts.py
and bin/stitch_transcripts.py
(the augmented 848-line variant
used by xenium_patch/divide and
reconstruct_patches stays as-is).
- modules/local/<mod>/main.nf — script body now invokes the
renamed bin/ name directly:
`utility_extract_dapi.py --input ...`.
The shell prelude in
segger/create_dataset
(NUMBA_CACHE_DIR setup) is
preserved verbatim. The two-step
script body of xenium_patch/stitch
(stitch_transcripts.py +
stitch_postprocess.py) is preserved.
- modules/local/<mod>/resources/usr/bin/<script>.py
— restored verbatim from origin/dev
for the 15 modules that had a
module-level script there before
PR #154. Orphan reference copies,
NEVER invoked at runtime — kept
so reviewers can compare the
per-module original against the
pipeline-bin active copy.
- modules/local/<mod>/templates/ — removed entirely (20 files
across 19 dirs). The
`template 'foo.py'` directive
was incompatible with eval()
output channels; the shell-call
wrapper to ${moduleDir}/templates
didn't survive containerization
cleanly per awgymer's review.
Modules without a pre-existing resources/usr/bin/ script on origin/dev
(utility/convert_mask_uint32, utility/downscale_morphology,
utility/extract_dapi, utility/upscale_mask, xenium_patch/stitch's
stitch_postprocess.py) get only the active bin/ copy — no orphan.
Net effect:
- nf-core lint passes (no module-level bin/ invoked at runtime;
resources/usr/bin/ files are not on PATH).
- All 19 modules' eval-version channels keep emitting Python/library
versions because the process script remains Bash.
- Reviewers can diff bin/<prefix>_<script>.py against
modules/local/<mod>/resources/usr/bin/<script>.py to verify the active
copy matches the dev original (modulo any in-template fixes from PR #154
itself).
refactor(modules): convert all 19 module-level bin scripts to Nextflow templates
Collaborator
Author
|
@awgymer last issue was now solved and we moved the scripts into the upper bin. Is there more revision to be required? |
awgymer
requested changes
May 8, 2026
awgymer
left a comment
There was a problem hiding this comment.
Looks much improved but still some module level bin files appear to be present (although I think unused? - this is confusing).
I also need to get clarification on others RE: full size tests/real data tests
removed redundant files under `modules/<name>/resource/usr/bin/<script>`
Collaborator
Author
|
Old files are removed now. |
awgymer
requested changes
May 11, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The first release version of spatialxe. During the developemnt many tools were debugged, and new tools were added.
Since it is a first release thorough review is required.
PR checklist
nf-core lint).nextflow run . -profile test,docker --outdir <OUTDIR>).docs/usage.mdis updated.docs/output.mdis updated.CHANGELOG.mdis updated.README.mdis updated (including new tool citations and authors/contributors).