Add new gene unification approach and testing #155

Merged: 89 commits, Apr 15, 2025

Commits
1882dc0
Implement load_h5ad subworkflow
nictru Apr 11, 2025
f635b44
Start working on new gene unification workflow
nictru Apr 11, 2025
541870f
Implement hugounifier apply
nictru Apr 11, 2025
e0273ae
Add ADATA_READCSV tests
nictru Apr 11, 2025
8188bcc
Add ADATA_TORDS tests
nictru Apr 11, 2025
74f83b8
Update ADATA_READCSV tests
nictru Apr 11, 2025
69e7fed
Add scanpy readh5 tests
nictru Apr 11, 2025
a3ba3f5
Add readRDS test
nictru Apr 11, 2025
b52d799
Move tests to dedicated directory
nictru Apr 11, 2025
1ed9994
Add load_h5ad test
nictru Apr 11, 2025
9cf8e13
Separate preprocess and load_h5ad
nictru Apr 11, 2025
2e72a02
Add adata_getsize tests
nictru Apr 11, 2025
a703e32
Update modules
nictru Apr 11, 2025
a55af70
Use shared empty droplet detection subworkflow
nictru Apr 11, 2025
673ed07
Add scanpy plotqc tests
nictru Apr 12, 2025
a98402c
Remove now-shared adata_barcodes module
nictru Apr 12, 2025
6b53e49
Add decontx tests
nictru Apr 12, 2025
22b097a
Add soupx tests
nictru Apr 12, 2025
217385c
Add ambient_rna_removal test
nictru Apr 13, 2025
afd3851
Add scanpy_pca module
nictru Apr 13, 2025
d4b9e9e
Add pca args parameter
nictru Apr 13, 2025
b1ab802
Add PCA file name collision detection
nictru Apr 13, 2025
87a8fc1
Add PCA tests
nictru Apr 13, 2025
2c43976
Add ambient RNA removal test
nictru Apr 13, 2025
83448e0
Set decontX batch_col
nictru Apr 13, 2025
2327e41
Rearrange code between load_h5ad, quality_control and main workflow
nictru Apr 13, 2025
edcd141
Rename preprocessing to QC
nictru Apr 13, 2025
77f8842
Update load_h5ad tests
nictru Apr 13, 2025
ee20182
Add more detailed load_h5ad tests
nictru Apr 13, 2025
b4df615
Update adata_getsize
nictru Apr 13, 2025
28b5d77
Update SCANPY_FILTER module
nictru Apr 13, 2025
0c793e3
Remove custom meta field dependency in SCANPY_FILTER
nictru Apr 13, 2025
3c7db1a
Use pyyaml in SCANPY_FILTER
nictru Apr 13, 2025
d9086d4
Add SCANPY_FILTER tests
nictru Apr 13, 2025
efcf4c1
Update SCDS module and add tests
nictru Apr 13, 2025
7db9bdb
Add SCANPY_SCRUBLET tests
nictru Apr 13, 2025
1b8274e
Add some checks for doublet detection workflow
nictru Apr 13, 2025
a515487
Add doublet detection tests
nictru Apr 13, 2025
d3276cc
Add QUALITY_CONTROL workflow tests
nictru Apr 13, 2025
78dc842
Add celltypist test
nictru Apr 13, 2025
c9f2168
Add CELLTYPE_ASSIGNMENT tests
nictru Apr 13, 2025
b148eda
Add test for hugounifier_get
nictru Apr 13, 2025
c26fdea
Add test for hugounifier_apply
nictru Apr 14, 2025
6725f23
Add unify_genes subworkflow test
nictru Apr 14, 2025
2f555a7
Update upsetgenes implementation
nictru Apr 14, 2025
f54b7b5
Add adata_upsetgenes test
nictru Apr 14, 2025
58428e9
Update ADATA_UNIFY implementation
nictru Apr 14, 2025
eae48c0
Further improve adata_unify
nictru Apr 14, 2025
4b7b245
Update symbol_col treatment in ADATA_UNIFY
nictru Apr 14, 2025
f38d6e9
Add ADATA_MYGENE process
nictru Apr 14, 2025
1b8b9a3
Add geneid_col and counts_layer to input schema
nictru Apr 14, 2025
84c3c40
Add ADATA_MYGENE tests
nictru Apr 14, 2025
9c7f1f0
Add ADATA_SETINDEX module
nictru Apr 14, 2025
6cfc8d7
Add tests for adata_unify
nictru Apr 14, 2025
cf0accc
Rename MERGE workflow to UNIFY
nictru Apr 14, 2025
a0fc5bc
Relocate ADATA_MERGE to main workflow
nictru Apr 14, 2025
75c927c
Add tests for unify subworkflow
nictru Apr 14, 2025
8cd23d9
Update ADATA_MERGE
nictru Apr 14, 2025
cb0af4b
Add tests for ADATA_MERGE
nictru Apr 14, 2025
f91b67a
Move ADATA_MERGE back into COMBINE workflow
nictru Apr 14, 2025
39f5e01
Update SCANPY_HVGS module
nictru Apr 14, 2025
b213dcb
Add tests for SCANPY_HVGS
nictru Apr 14, 2025
c353b8c
Add SCVI tests
nictru Apr 14, 2025
45d66c4
Add harmony tests
nictru Apr 14, 2025
ef490c9
Add combat and BBKNN tests
nictru Apr 14, 2025
4dfec4a
Add scimilarity subworkflow
nictru Apr 14, 2025
c3450e3
Fix minor pipeline bugs
nictru Apr 14, 2025
55e041a
Update some module configs
nictru Apr 14, 2025
1710098
Improve outdir structure
nictru Apr 14, 2025
ef81441
Improve channel naming in INTEGRATE workflow
nictru Apr 14, 2025
4cd2fe7
Rename preprocess_only to qc_only
nictru Apr 14, 2025
dd6ce55
Fix some things with seurat integration
nictru Apr 14, 2025
c64ec77
Editorconfig
nictru Apr 15, 2025
24fb404
Prettier
nictru Apr 15, 2025
79e0b55
Match schema with default configs
nictru Apr 15, 2025
c6125c2
Implement build and extend pipeline tests
nictru Apr 15, 2025
6c7dec4
Add sample to batch identifier
nictru Apr 15, 2025
7b6f8fb
Update batch_col default value
nictru Apr 15, 2025
c44ce4a
Add nf-test CI
nictru Apr 15, 2025
1b6b6dc
Revert "Update batch_col default value"
nictru Apr 15, 2025
a517345
Prettier
nictru Apr 15, 2025
e08033e
Add nf-test CI environment variables
nictru Apr 15, 2025
b305bc1
Set nf-test resource limits
nictru Apr 15, 2025
4a2f75f
Add LIANA_RANKAGGREGATE error handling
nictru Apr 15, 2025
5519918
Set geneid_col default value
nictru Apr 15, 2025
994a189
Increase CI resource limits
nictru Apr 15, 2025
fe5b223
Further increase test memory limits
nictru Apr 15, 2025
5416a0c
Increase nf-test resource limits one last time
nictru Apr 15, 2025
08be36f
Ignore LIANA CI errors
nictru Apr 15, 2025
82 changes: 82 additions & 0 deletions .github/workflows/ci.yml
@@ -11,6 +11,10 @@ on:

 env:
   NXF_ANSI_LOG: false
+  NFT_VER: "0.9.0"
+  NFT_WORKDIR: "~"
+  NFT_DIFF: "pdiff"
+  NFT_DIFF_ARGS: "--line-numbers --expand-tabs=2"
   NXF_SINGULARITY_CACHEDIR: ${{ github.workspace }}/.singularity
   NXF_SINGULARITY_LIBRARYDIR: ${{ github.workspace }}/.singularity

@@ -85,3 +89,81 @@ jobs:
       - name: "Run pipeline with test data ${{ matrix.NXF_VER }} | ${{ matrix.test_name }} | ${{ matrix.profile }}"
         run: |
           nextflow run ${GITHUB_WORKSPACE} -profile ${{ matrix.test_name }},${{ matrix.profile }} --outdir ./results
+
+  nf-test:
+    name: "Run nf-test pipeline tests (${{ matrix.test_name }} | ${{ matrix.NXF_VER }} | ${{ matrix.profile }})"
+    if: "${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-core/scdownstream') }}"
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        NXF_VER:
+          - "24.04.2"
+          - "latest-everything"
+        profile:
+          - "conda"
+          - "docker"
+          - "singularity"
+        test_name:
+          - "build"
+          - "extend"
+        isMaster:
+          - ${{ github.base_ref == 'master' }}
+        # Exclude conda and singularity on dev
+        exclude:
+          - isMaster: false
+            profile: "conda"
+          - isMaster: false
+            profile: "singularity"
+
+    steps:
+      - name: Disk space cleanup
+        uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1
+
+      - name: Check out pipeline code
+        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4
+        with:
+          fetch-depth: 0
+
+      - uses: actions/setup-python@v4
+        with:
+          python-version: "3.11"
+          architecture: "x64"
+
+      - uses: actions/setup-java@8df1039502a15bceb9433410b1a100fbe190c53b # v4
+        with:
+          distribution: "temurin"
+          java-version: "17"
+
+      - name: Install pdiff to see diff between nf-test snapshots
+        run: |
+          python -m pip install --upgrade pip
+          pip install pdiff
+
+      - uses: nf-core/setup-nextflow@v2
+        with:
+          version: "${{ matrix.NXF_VER }}"
+
+      - uses: nf-core/setup-nf-test@v1
+        with:
+          version: ${{ env.NFT_VER }}
+
+      - name: Run nf-test
+        run: |
+          nf-test test \
+            --ci \
+            --junitxml=test.xml \
+            --profile ${{ matrix.profile }} \
+            tests/main_pipeline_${{ matrix.test_name }}.nf.test
+
+      - name: Output log on failure
+        if: failure()
+        run: |
+          sudo apt install bat > /dev/null
+          batcat --decorations=always --color=always ${{ github.workspace }}/.nf-test/tests/*/meta/nextflow.log
+
+      - name: Publish Test Report
+        uses: mikepenz/action-junit-report@v3
+        if: always() # always run even if the previous step fails
+        with:
+          report_paths: test.xml
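
To make the effect of the `exclude` rules in the nf-test job concrete, here is a minimal Python sketch that expands the matrix the way GitHub Actions is documented to: take the cross product of all axes, then drop every combination that matches all keys of an exclude entry. The helper names `expand_matrix` and `nf_test_jobs` are hypothetical and not part of the pipeline; the axis values are copied from the workflow above, and `isMaster` is assumed to evaluate to a plain boolean.

```python
from itertools import product


def expand_matrix(axes, exclude):
    """Cross product of all axes, minus combinations matching an exclude rule."""
    keys = list(axes)
    jobs = [dict(zip(keys, values)) for values in product(*(axes[k] for k in keys))]

    def is_excluded(job):
        # A rule excludes a job when every key/value pair in the rule matches.
        return any(all(job.get(k) == v for k, v in rule.items()) for rule in exclude)

    return [job for job in jobs if not is_excluded(job)]


def nf_test_jobs(base_ref):
    """Hypothetical helper: the nf-test matrix for a PR targeting `base_ref`."""
    axes = {
        "NXF_VER": ["24.04.2", "latest-everything"],
        "profile": ["conda", "docker", "singularity"],
        "test_name": ["build", "extend"],
        # Assumption: the workflow expression yields a boolean.
        "isMaster": [base_ref == "master"],
    }
    exclude = [
        {"isMaster": False, "profile": "conda"},
        {"isMaster": False, "profile": "singularity"},
    ]
    return expand_matrix(axes, exclude)


dev_jobs = nf_test_jobs("dev")
master_jobs = nf_test_jobs("master")
print(len(dev_jobs), len(master_jobs))  # 4 12
```

Under these assumptions, a pull request against `dev` runs only the four `docker` jobs (two Nextflow versions times the `build` and `extend` tests), while one against `master` runs all twelve combinations.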
1 change: 1 addition & 0 deletions .gitignore
@@ -7,3 +7,4 @@ testing/
 testing*
 *.pyc
 null/
+.nf-test*
31 changes: 22 additions & 9 deletions assets/schema_input.json
@@ -36,7 +36,6 @@
             "batch_col": {
                 "type": "string",
                 "pattern": "^[a-zA-Z][a-zA-Z0-9_-]*$",
-                "default": "batch",
                 "errorMessage": "Batch column needs to start with a letter and can contain letters, numbers, underlines and dashes.",
                 "meta": ["batch_col"]
             },
@@ -47,6 +46,13 @@
                 "errorMessage": "Symbol column needs to start with a letter and can contain letters, numbers, underlines and dashes.",
                 "meta": ["symbol_col"]
             },
+            "geneid_col": {
+                "type": "string",
+                "pattern": "^[a-zA-Z][a-zA-Z0-9_-]*$",
+                "default": "index",
+                "errorMessage": "Gene ID column needs to start with a letter and can contain letters, numbers, underlines and dashes.",
+                "meta": ["geneid_col"]
+            },
             "label_col": {
                 "type": "string",
                 "pattern": "^[a-zA-Z][a-zA-Z0-9_-]*$",
@@ -60,31 +66,38 @@
                 "errorMessage": "Unknown label needs to start with a letter and can contain letters, numbers, underlines and dashes.",
                 "meta": ["unknown_label"]
             },
+            "counts_layer": {
+                "type": "string",
+                "pattern": "^[a-zA-Z][a-zA-Z0-9_-]*$",
+                "default": "X",
+                "errorMessage": "Counts layer needs to start with a letter and can contain letters, numbers, underlines and dashes.",
+                "meta": ["counts_layer"]
+            },
             "min_genes": {
                 "type": "integer",
-                "minimum": 1,
-                "default": 1,
+                "minimum": 0,
+                "default": 0,
                 "errorMessage": "Minimum number of genes must be an integer greater than 0.",
                 "meta": ["min_genes"]
             },
             "min_cells": {
                 "type": "integer",
-                "minimum": 1,
-                "default": 1,
+                "minimum": 0,
+                "default": 0,
                 "errorMessage": "Minimum number of cells must be an integer greater than 0.",
                 "meta": ["min_cells"]
             },
             "min_counts_cell": {
                 "type": "integer",
-                "minimum": 1,
-                "default": 1,
+                "minimum": 0,
+                "default": 0,
                 "errorMessage": "Minimum number of counts per cell must be an integer greater than 0.",
                 "meta": ["min_counts_cell"]
             },
             "min_counts_gene": {
                 "type": "integer",
-                "minimum": 1,
-                "default": 1,
+                "minimum": 0,
+                "default": 0,
                 "errorMessage": "Minimum number of counts per gene must be an integer greater than 0.",
                 "meta": ["min_counts_gene"]
             },
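
The schema changes above can be illustrated with a small, dependency-free Python sketch. It is not how the pipeline validates its samplesheet; `validate_row` is a hypothetical helper that only mimics the `default`, `pattern` and `minimum` keywords for the properties touched in this diff. It shows that the new defaults (`index`, `X`, `0`) satisfy their own constraints and that `0` is now accepted for the filter thresholds, even though the unchanged `errorMessage` strings still read "greater than 0".

```python
import re

NAME_PATTERN = r"^[a-zA-Z][a-zA-Z0-9_-]*$"

# Subset of assets/schema_input.json after this change (values copied from the diff).
SCHEMA = {
    "geneid_col": {"type": "string", "pattern": NAME_PATTERN, "default": "index"},
    "counts_layer": {"type": "string", "pattern": NAME_PATTERN, "default": "X"},
    "min_genes": {"type": "integer", "minimum": 0, "default": 0},
    "min_cells": {"type": "integer", "minimum": 0, "default": 0},
    "min_counts_cell": {"type": "integer", "minimum": 0, "default": 0},
    "min_counts_gene": {"type": "integer", "minimum": 0, "default": 0},
}


def validate_row(row):
    """Hypothetical helper: fill in defaults, then list the fields that fail."""
    filled, errors = {}, []
    for name, spec in SCHEMA.items():
        value = row.get(name, spec["default"])
        if spec["type"] == "string":
            ok = isinstance(value, str) and re.fullmatch(spec["pattern"], value) is not None
        else:
            # bool is a subclass of int in Python, so reject it explicitly.
            ok = isinstance(value, int) and not isinstance(value, bool) and value >= spec["minimum"]
        if not ok:
            errors.append(name)
        filled[name] = value
    return filled, errors


defaults, default_errors = validate_row({})
_, bad_errors = validate_row({"geneid_col": "1abc", "min_genes": -1})
print(default_errors)  # []
print(bad_errors)      # ['geneid_col', 'min_genes']
```

An empty row therefore validates cleanly with the new defaults, while a gene ID column starting with a digit or a negative threshold is rejected.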