Draft
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -3,6 +3,7 @@
## NEW FUNCTIONALITY

* Add new method: CellMapper, which is a k-NN based approach to map cells across representations and can be used for label projection. Two versions are included here, one based on PCA or CCA embeddings (`linear`) and one based on an scvi embedding (`scvi`) (PR #22)
* Add MLflow-based methods: Geneformer, scGPT, scVI, TranscriptFormer, and UCE for label projection using pre-trained foundation models (PR #28)

## MAJOR CHANGES

50 changes: 50 additions & 0 deletions src/methods/geneformer_mlflow/config.vsh.yaml
@@ -0,0 +1,50 @@
__merge__: ../../api/base_method.yaml

name: geneformer_mlflow
label: Geneformer (MLflow)
summary: Geneformer is a foundational transformer model pretrained on a large-scale corpus of single-cell transcriptomes to enable context-aware predictions in settings with limited data in network biology (MLflow)
description: |
  Geneformer is a context-aware, attention-based deep learning model pretrained
  on a large-scale corpus of single-cell transcriptomes to enable
  context-specific predictions in settings with limited data in network biology.

  This version uses a pre-trained MLflow model. A kNN classifier is trained on
  embeddings for the training data and used to predict labels for the test
  data. It does not use the built-in Geneformer classifier.
references:
  doi:
    - 10.1038/s41586-023-06139-9
    - 10.1101/2024.08.16.608180
links:
  documentation: https://geneformer.readthedocs.io/en/latest/index.html
  repository: https://huggingface.co/ctheodoris/Geneformer

info:
  preferred_normalization: counts

arguments:
  - name: --model
    type: file
    description: |
      An MLflow model URI for the Geneformer model. If it is a .zip or
      .tar.gz file, it will be extracted to a temporary directory.
    required: true

resources:
  - type: python_script
    path: script.py
  - path: /src/utils/exit_codes.py
  - path: /src/utils/unpack.py
  - path: /src/utils/mlflow.py
  - path: requirements.txt

engines:
  - type: docker
    image: openproblems/base_pytorch_nvidia:1
    __merge__: /src/utils/mlflow_docker_setup.yaml

runners:
  - type: executable
  - type: nextflow
    directives:
      label: [hightime, highmem, midcpu, gpu]
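
The description in this config says that a kNN classifier is trained on embeddings of the training data and then used to predict labels for the test data. The PR's `script.py` is not shown in this chunk, so the following is only a minimal, standard-library sketch of that idea and not the PR's implementation. The function name `knn_predict`, the default `k=5`, the Euclidean metric, and the toy 2-D vectors standing in for Geneformer embeddings are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_embeddings, train_labels, test_embeddings, k=5):
    """Label each test embedding by majority vote among its k nearest
    training embeddings (Euclidean distance).

    Hypothetical helper for illustration; not taken from the PR.
    """
    if len(train_embeddings) != len(train_labels):
        raise ValueError("train_embeddings and train_labels differ in length")
    if not train_embeddings:
        raise ValueError("at least one training embedding is required")

    # Never ask for more neighbours than there are training cells.
    k = min(k, len(train_embeddings))

    predictions = []
    for query in test_embeddings:
        # Indices of the k training cells closest to this test cell.
        nearest = sorted(
            range(len(train_embeddings)),
            key=lambda i: math.dist(query, train_embeddings[i]),
        )[:k]
        # Majority vote over the neighbours' labels.
        votes = Counter(train_labels[i] for i in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


# Toy stand-ins for embeddings: two well-separated clusters.
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
train_y = ["T cell", "T cell", "T cell", "B cell", "B cell", "B cell"]
test_x = [(0.05, 0.1), (5.0, 5.1)]

print(knn_predict(train_x, train_y, test_x, k=3))
```

With real data, the embedding matrices returned by the pre-trained model would take the place of `train_x` and `test_x`, and a library classifier would probably replace the explicit loop; the voting logic is the same either way.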
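
The `--model` argument is documented as being extracted to a temporary directory when it is a `.zip` or `.tar.gz` file. The PR's `/src/utils/unpack.py` is not shown here, so the sketch below is a guess at what such a step could look like using only the standard library. The name `resolve_model_path` and the `mlflow_model_` directory prefix are hypothetical, and the sketch handles local paths only, not remote MLflow URIs.

```python
import shutil
import tempfile
from pathlib import Path


def resolve_model_path(model):
    """Return a path to the model, extracting .zip / .tar.gz archives first.

    Hypothetical helper for illustration; not the PR's unpack.py.
    Anything that is not a recognised archive is returned unchanged.
    """
    path = Path(model)
    if path.name.endswith((".zip", ".tar.gz")):
        # Extract into a fresh temporary directory; the caller is
        # responsible for deleting it once the model has been loaded.
        target = Path(tempfile.mkdtemp(prefix="mlflow_model_"))
        shutil.unpack_archive(str(path), str(target))
        return target
    return path
```

Returning a plain `Path` in both branches keeps the calling code identical whether the model arrived as a directory or as an archive.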