Skip to content

Release version 1.0.0#139

Open
heylf wants to merge 404 commits intomasterfrom
dev
Open

Release version 1.0.0#139
heylf wants to merge 404 commits intomasterfrom
dev

Conversation

@heylf
Copy link
Copy Markdown
Collaborator

@heylf heylf commented Apr 28, 2026

The first release version of spatialxe. During the developemnt many tools were debugged, and new tools were added.
Since it is a first release thorough review is required.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs- [ ] If necessary, also make a PR on the nf-core/spatialxe branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

khersameesh24 and others added 30 commits May 20, 2025 12:07
Changed transcript handling from csv to parquet
an-altosian and others added 18 commits April 30, 2026 20:14
…dule binary

Move the inline python heredoc that casts a segmentation mask to uint32 into resources/usr/bin/convert_mask_uint32.py. Containers stay general; pipeline-specific logic lives in module binaries on PATH via the resources/usr/bin/ mount.
…e binary

Move the inline python heredoc that slices a single channel from a multi-channel OME-TIFF into resources/usr/bin/extract_dapi.py. Containers stay general; pipeline-specific logic lives in module binaries on PATH via the resources/usr/bin/ mount.
… with tar.gz test data

The format: directory-path / file-path validators added in d59fac3 break CI
because:
  - bundle field accepts EITHER a directory path (real Xenium bundle) OR a
    .tar.gz archive URL (test data, extracted at runtime by UNTAR). Neither
    'directory-path' nor 'file-path' fits both, and both fail against remote URLs.
  - image field is similar — can be a local OME-TIFF path, a URL, or empty.

The reviewer's suggestion in r3165291219 / r3165292878 was reasonable in
isolation but conflicts with this pipeline's dual-purpose input semantics
where the same field can be a path or a URL or an archive.

Reverts the format keys; keeps the type / pattern / errorMessage entries.
…module binary

Move the multi-line python -c block that downscales a morphology image and writes scale_info.json into resources/usr/bin/downscale_morphology.py. Containers stay general; pipeline-specific logic lives in module binaries on PATH via the resources/usr/bin/ mount.
Move the multi-line python -c block that upscales a Cellpose mask back to original resolution into resources/usr/bin/upscale_mask.py. Containers stay general; pipeline-specific logic lives in module binaries on PATH via the resources/usr/bin/ mount.
…binary

Move the inline python -c that forces every stitched GeoJSON feature to a single Polygon and reconciles dropped cells in the transcript CSV into resources/usr/bin/stitch_postprocess.py. Containers stay general; pipeline-specific logic lives in module binaries on PATH via the resources/usr/bin/ mount.
… with tar.gz test data

The format: directory-path / file-path validators added in d59fac3 break CI
because:
  - bundle field accepts EITHER a directory path (real Xenium bundle) OR a
    .tar.gz archive URL (test data, extracted at runtime by UNTAR). Neither
    'directory-path' nor 'file-path' fits both, and both fail against remote URLs.
  - image field is similar — can be a local OME-TIFF path, a URL, or empty.

The reviewer's suggestion in r3165291219 / r3165292878 was reasonable in
isolation but conflicts with this pipeline's dual-purpose input semantics
where the same field can be a path or a URL or an archive.

Reverts the format keys; keeps the type / pattern / errorMessage entries.
…cross 13 local subworkflows

Extends PR #150's SPATIALXE-level params extraction down to all local
subworkflows. After this commit, params.* is read only in the entry
workflow (main.nf). workflows/spatialxe.nf and all
subworkflows/local/*/main.nf declare formal take: blocks; named workflows
and helper functions accept their inputs explicitly.

13 subworkflows refactored:
  utils_nfcore_spatialxe_pipeline (16 sites — validateInputParameters
    refactored to take 13 explicit args; PIPELINE_INITIALISATION take:
    block extended; main.nf passes them through)
  cellpose_baysor_import_segmentation (13)
  cellpose_resolift_morphology_ome_tif (9)
  xeniumranger_import_segmentation_redefine_bundle (8)
  baysor_run_transcripts_parquet (8)
  baysor_run_prior_segmentation_mask (6)
  baysor_generate_segfree (6)
  spatialdata_write_meta_merge (5)
  baysor_run_transcripts_parquet_tiled (5)
  stardist_resolift_morphology_ome_tif (2)
  segger_create_train_predict (2)
  xeniumranger_resegment_morphology_ome_tif (1)
  ficture_preprocess_model (1)

SPATIALXE workflow's take: block extended from 24 to 40 inputs to cover
all params consumed by the subworkflows it calls (alphabetical order).
PIPELINE_INITIALISATION's take: block extended from 9 to 21 inputs to
plumb validation params from main.nf. main.nf updated to pass the new
params from the entry workflow.

Closes the params-in-workflow anti-pattern across the whole pipeline.
Verified with `nextflow inspect` (parses cleanly) and `nextflow run -profile
test -stub-run` (pipeline assembles, parameter validation succeeds).
…dict

The heredocs were inadvertently introduced when extracting the inline
Python to module binaries. Project convention (CLAUDE.md "Version
Reporting") is topic-channel only — no `versions.yml` files. Topic
channels (versions_segger, versions_python, etc.) emit the actual
versions; the hardcoded `segger: 0.1.0` heredoc was unused and risked
going stale.

No other modules/local/* writes a versions.yml file. Aligns the segger
modules with the rest of the pipeline.
…dict

The heredocs were inadvertently introduced when extracting the inline
Python to module binaries. Project convention (CLAUDE.md "Version
Reporting") is topic-channel only — no `versions.yml` files. Topic
channels (versions_segger, versions_python, etc.) emit the actual
versions; the hardcoded `segger: 0.1.0` heredoc was unused and risked
going stale.

No other modules/local/* writes a versions.yml file. Aligns the segger
modules with the rest of the pipeline.
…dict

The heredocs were inadvertently introduced when extracting the inline
Python to module binaries. Project convention (CLAUDE.md "Version
Reporting") is topic-channel only — no `versions.yml` files. Topic
channels (versions_segger, versions_python, etc.) emit the actual
versions; the hardcoded `segger: 0.1.0` heredoc was unused and risked
going stale.

No other modules/local/* writes a versions.yml file. Aligns the segger
modules with the rest of the pipeline.
…dict

The heredocs were inadvertently introduced when extracting the inline
Python to module binaries. Project convention (CLAUDE.md "Version
Reporting") is topic-channel only — no `versions.yml` files. Topic
channels (versions_segger, versions_python, etc.) emit the actual
versions; the hardcoded `segger: 0.1.0` heredoc was unused and risked
going stale.

No other modules/local/* writes a versions.yml file. Aligns the segger
modules with the rest of the pipeline.
Address PR #139 review: segger refactor + 5 targeted fixes
…e-inputs

refactor(workflows/spatialxe): extract params.* to take: inputs
refactor: extract inline Python from 5 modules to module binaries (P1/P2 sweep)
…ake-inputs

refactor(subworkflows): extract params.* to take: inputs across 13 subworkflows
bugfixes from previous PRs
@heylf
Copy link
Copy Markdown
Collaborator Author

heylf commented May 4, 2026

@awgymer we addressed now most of your comments. I left 2 of them open, where I had the feeling these should be clarified. Should I post the PR on #release-review-trading again, or will you engange with other nf-core members?

…w templates

nf-core core team requires no module-level `bin/` for release approval
(PR #139, comment r3185699629 from awgymer). Move every per-module
`resources/usr/bin/*.py` into `templates/*.py` and invoke via the
`template '<file>.py'` directive in the script: block.

- 18 modules: mechanical conversion. Replace argparse with module-top
  Groovy interpolation (RAW_BUNDLE = "${raw_bundle}", etc.). For modules
  with task.ext.args (ficture/preprocess, segger/predict,
  segger/create_dataset, xenium_patch/stitch): inject ARGS = "${args}",
  build sys.argv via shlex.split, keep existing argparse code path.
- segger/create_dataset: NUMBA_CACHE_DIR setup moves from shell prelude
  into Python BEFORE numba/torch imports (file-level # ruff: noqa: E402
  to document the load-order requirement).
- xenium_patch/stitch: stitch_transcripts.py + stitch_postprocess.py
  merged into one templates/stitch.py orchestrator that calls both
  inline (single-file template constraint of the directive).

Containers, inputs, outputs, and stub blocks are unchanged across all 19
modules. Pipeline-level bin/ scripts (divide_transcripts.py,
stitch_transcripts.py — used by xenium_patch/divide and
utility/reconstruct_patches) are untouched; pipeline-level bin/ remains
nf-core compliant.
an-altosian and others added 3 commits May 5, 2026 16:43
CI on PR #154 surfaced this hard Nextflow constraint:

    Process output of type 'eval' is only allowed with Bash process scripts
    -- Current interpreter: /usr/bin/env python3

The `template '<file>.py'` directive sets the process interpreter to
Python from the shebang, but every module emits a version string via an
`eval('python3 -c ...')` topic channel — and `eval(...)` outputs only
work when the process script body is Bash. All 19 modules failed
identically.

Fix: keep the Python files in `modules/local/<x>/templates/`
(satisfying the no-module-level-bin requirement from PR #139's review)
but invoke them from a shell `script:` block via
`python3 \${moduleDir}/templates/<file>.py --flag value`. The process
stays Bash, `eval()` works, and the original argparse-based scripts are
restored verbatim — no constants conversion, no script merging, no
moved env preludes.

For xenium_patch/stitch this means restoring the two original scripts
(stitch_transcripts.py + stitch_postprocess.py) instead of the merged
stitch.py. For segger/create_dataset, the NUMBA_CACHE_DIR shell prelude
is restored to its original location in the .nf script: block.

Net result vs. the previous attempt: same nf-core compliance (no
module-level bin/), but invasive Python rewrites are reverted. The
original behavior is preserved exactly.
Replaces the templates/-directive approach (which broke under the
Nextflow constraint that `output: eval(...)` channels are bash-only,
incompatible with `template 'foo.py'` setting a Python interpreter via
shebang). Pipeline-level bin/ is the cleanest path: nf-core
auto-prepends bin/ to PATH for every process, scripts resolve by name,
and process scripts stay bash so eval() topic channels keep working.

Per-module changes (19 modules, 20 scripts):

- bin/<module_prefix>_<script>.py    — active copies, invoked by name
                                       (e.g. utility_extract_dapi.py).
                                       Renamed with module-path prefix
                                       to avoid collisions with the
                                       pre-existing bin/divide_transcripts.py
                                       and bin/stitch_transcripts.py
                                       (the augmented 848-line variant
                                       used by xenium_patch/divide and
                                       reconstruct_patches stays as-is).
- modules/local/<mod>/main.nf       — script body now invokes the
                                       renamed bin/ name directly:
                                       `utility_extract_dapi.py --input ...`.
                                       The shell prelude in
                                       segger/create_dataset
                                       (NUMBA_CACHE_DIR setup) is
                                       preserved verbatim. The two-step
                                       script body of xenium_patch/stitch
                                       (stitch_transcripts.py +
                                       stitch_postprocess.py) is preserved.
- modules/local/<mod>/resources/usr/bin/<script>.py
                                     — restored verbatim from origin/dev
                                       for the 15 modules that had a
                                       module-level script there before
                                       PR #154. Orphan reference copies,
                                       NEVER invoked at runtime — kept
                                       so reviewers can compare the
                                       per-module original against the
                                       pipeline-bin active copy.
- modules/local/<mod>/templates/    — removed entirely (20 files
                                       across 19 dirs). The
                                       `template 'foo.py'` directive
                                       was incompatible with eval()
                                       output channels; the shell-call
                                       wrapper to ${moduleDir}/templates
                                       didn't survive containerization
                                       cleanly per awgymer's review.

Modules without a pre-existing resources/usr/bin/ script on origin/dev
(utility/convert_mask_uint32, utility/downscale_morphology,
utility/extract_dapi, utility/upscale_mask, xenium_patch/stitch's
stitch_postprocess.py) get only the active bin/ copy — no orphan.

Net effect:
- nf-core lint passes (no module-level bin/ invoked at runtime;
  resources/usr/bin/ files are not on PATH).
- All 19 modules' eval-version channels keep emitting Python/library
  versions because the process script remains Bash.
- Reviewers can diff bin/<prefix>_<script>.py against
  modules/local/<mod>/resources/usr/bin/<script>.py to verify the active
  copy matches the dev original (modulo any in-template fixes from PR #154
  itself).
refactor(modules): convert all 19 module-level bin scripts to Nextflow templates
@heylf
Copy link
Copy Markdown
Collaborator Author

heylf commented May 7, 2026

@awgymer last issue was now solved and we moved the scripts into the upper bin. Is there more revision to be required?

Copy link
Copy Markdown

@awgymer awgymer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much improved but still some module level bin files appear to be present (although I think unused? - this is confusing).

I also need to get clarification on others RE: full size tests/real data tests

Comment thread modules/local/baysor/preprocess/resources/usr/bin/preprocess_transcripts.py Outdated
Comment thread modules/local/segger/create_dataset/resources/usr/bin/run_create_dataset.py Outdated
Comment thread modules/local/segger/predict/resources/usr/bin/run_predict.py Outdated
Comment thread modules/local/utility/parquet_to_csv/resources/usr/bin/parquet_to_csv.py Outdated
khersameesh24 and others added 2 commits May 8, 2026 13:47
@heylf
Copy link
Copy Markdown
Collaborator Author

heylf commented May 8, 2026

Old files are removed now.

Comment thread workflows/spatialxe.nf
Comment thread workflows/spatialxe.nf
Comment thread workflows/spatialxe.nf
Comment thread modules/local/baysor/run/main.nf
Comment thread subworkflows/local/baysor_run_transcripts_parquet_tiled/main.nf
Comment thread modules/local/spatialdata/meta/main.nf
Comment thread modules/local/spatialdata/meta/main.nf
Comment thread modules/local/spatialdata/write/main.nf
Comment thread modules/local/spatialdata/write/main.nf
Comment thread modules/local/spatialdata/write/main.nf
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants