Commit a9a2464

release v0.1.0
1 parent 70e5b06 commit a9a2464

9 files changed, +113 -62 lines changed

CHANGELOG.md

+2 -2
@@ -1,3 +1,3 @@
-## [0.1.0] - tbd
+## [0.1.0] - 2024-09-11

-tbd
+First official `novae` release. Preprint coming soon.

data/README.md

+11 -4
@@ -1,8 +1,15 @@
 # Public datasets

-We detail below how to download public spatial transcriptomics datasets. The data will be saved in this directory, and will be used to train `novae`.
+We detail below how to download public spatial transcriptomics datasets.

-## Download
+## Option 1: Hugging Face Hub
+
+We store our dataset on [Hugging Face Hub](https://huggingface.co/datasets/MICS-Lab/novae).
+To automatically download these slides, you can use the [`novae.utils.load_dataset`](https://mics-lab.github.io/novae/api/novae.utils/#novae.utils.load_dataset) function.
+
+NB: not all slides are uploaded on Hugging Face yet, but we are progressively adding new slides. To get the full dataset right now, use the "Option 2" below.
+
+## Option 2: Download

 For consistency, all the scripts below need to be executed at the root of the `data` directory (i.e., `novae/data`).

@@ -50,15 +57,15 @@ All above datasets can be downloaded using a single command line. Make sure you
 sh _scripts/1_download_all.sh
 ```

-## Preprocess and prepare for training
+### Preprocess and prepare for training

 The script below will copy all `adata.h5ad` files into a single directory, compute UMAPs, and minor preprocessing. See the `argparse` helper of this script for more details.

 ```sh
 python _scripts/2_prepare.py
 ```

-## Usage
+### Usage

 These datasets can be used during training (see the `scripts` directory at the root of the `novae` repository).

docs/api/novae.plot.md

+2
@@ -5,3 +5,5 @@
 ::: novae.plot.pathway_scores

 ::: novae.plot.paga
+
+::: novae.plot.spatially_variable_genes

docs/tutorials/main_usage.ipynb

+71 -52
Large diffs are not rendered by default.

novae/plot/_bar.py

+1 -1
@@ -10,7 +10,7 @@
 from ._utils import get_categorical_color_palette


-def domains_proportions(adata: AnnData | list[AnnData], obs_key: str | None, figsize: tuple[int, int] = (2, 5)):
+def domains_proportions(adata: AnnData | list[AnnData], obs_key: str | None = None, figsize: tuple[int, int] = (2, 5)):
     """Show the proportion of each domain in the slide(s).

     Args:
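
The hunk above gives `obs_key` a default of `None`, so callers may now omit it. The sketch below illustrates the effect with two hypothetical stand-in functions (they only mirror the old and new signatures and are not novae's implementation); `is_required` is a helper introduced here for illustration.

```python
import inspect


def domains_proportions_before(adata, obs_key, figsize=(2, 5)):
    """Stand-in for the old signature: `obs_key` is required."""
    return obs_key


def domains_proportions_after(adata, obs_key=None, figsize=(2, 5)):
    """Stand-in for the new signature: `obs_key` defaults to None."""
    return obs_key


def is_required(func, name):
    """Return True if parameter `name` has no default value."""
    param = inspect.signature(func).parameters[name]
    return param.default is inspect.Parameter.empty


# With the old signature, omitting obs_key raises a TypeError.
try:
    domains_proportions_before("adata")
    omitted_ok_before = True
except TypeError:
    omitted_ok_before = False

# With the new signature, the same call succeeds and obs_key is None.
result_after = domains_proportions_after("adata")
```

This matches the other plotting functions in the diff (for example `paga`), which already declare `obs_key: str | None = None`.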

novae/plot/_graph.py

+3
@@ -82,6 +82,9 @@ def _domains_hierarchy(
 def paga(adata: AnnData, obs_key: str | None = None, **paga_plot_kwargs: int):
     """Plot a PAGA graph.

+    Info:
+        Currently, this function only supports one slide per call.
+
     Args:
         adata: An AnnData object.
         obs_key: Name of the key from `adata.obs` containing the Novae domains. By default, the last available domain key is shown.

novae/plot/_heatmap.py

+3
@@ -71,6 +71,9 @@ def pathway_scores(
 ) -> pd.DataFrame | None:
     """Show a heatmap of pathway scores for each domain.

+    Info:
+        Currently, this function only supports one slide per call.
+
     Args:
         adata: An `AnnData` object.
         pathways: Either a dictionary of pathways (keys are pathway names, values are lists of gene names), or a path to a [GSEA](https://www.gsea-msigdb.org/gsea/msigdb/index.jsp) JSON file.
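
The `pathways` argument documented above accepts two forms: a dictionary or a path to a JSON file. A minimal sketch of normalizing both forms into one dictionary follows. `load_pathways` is a hypothetical helper (not part of novae), the pathway and gene names are made up, and the JSON layout (each entry storing its genes under a `geneSymbols` key) is an assumption about the GSEA export format.

```python
import json
import tempfile
from pathlib import Path


def load_pathways(pathways):
    """Return a {pathway_name: [gene, ...]} dict from either input form.

    Hypothetical helper for illustration; not part of novae.
    """
    if isinstance(pathways, dict):
        return {name: list(genes) for name, genes in pathways.items()}
    # Assumption: each JSON entry stores its genes under "geneSymbols".
    with open(pathways) as f:
        content = json.load(f)
    return {name: entry["geneSymbols"] for name, entry in content.items()}


# Form 1: a plain dictionary.
as_dict = load_pathways({"PATHWAY_A": ["GENE1", "GENE2"]})

# Form 2: a JSON file, written here only so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "pathways.json"
    path.write_text(json.dumps({"PATHWAY_B": {"geneSymbols": ["GENE3"]}}))
    from_json = load_pathways(path)
```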

novae/plot/_spatial.py

+19 -2
@@ -34,7 +34,6 @@ def domains(
     Info:
         Make sure you have already your Novae domains assigned to the `AnnData` object. You can use `model.assign_domains(...)` to do so.

-
     Args:
         adata: An `AnnData` object, or a list of `AnnData` objects.
         obs_key: Name of the key from `adata.obs` containing the Novae domains. By default, the last available domain key is shown.
@@ -113,7 +112,25 @@ def spatially_variable_genes(
     min_positive_ratio: float = 0.05,
     return_list: bool = False,
     **kwargs: int,
-) -> list[str]:
+) -> None | list[str]:
+    """Plot the most spatially variable genes (SVG) for a given `AnnData` object.
+
+    !!! info
+        Currently, this function only supports one slide per call.
+
+    Args:
+        adata: An `AnnData` object corresponding to one slide.
+        obs_key: Key in `adata.obs` that contains the domains. By default, it will use the last available Novae domain key.
+        top_k: Number of SVG to be shown.
+        show: Whether to show the plot.
+        cell_size: Size of the cells or spots (`spot_size` argument of `sc.pl.spatial`).
+        min_positive_ratio: Genes whose "ratio of cells expressing it" is lower than this threshold are not considered.
+        return_list: Whether to return the list of SVG instead of plotting them.
+        **kwargs: Additional arguments for `sc.pl.spatial`.
+
+    Returns:
+        A list of SVG names if `return_list` is `True`.
+    """
     assert isinstance(adata, AnnData), f"Received adata of type {type(adata)}. Currently only AnnData is supported."

     obs_key = utils.check_available_domains_key([adata], obs_key)
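
The new docstring describes two behaviours that a small sketch may make concrete: the `min_positive_ratio` filter and the `None | list[str]` return contract driven by `return_list`. The code below is a toy illustration under stated assumptions, not novae's implementation. The function names, the `{gene: [counts per cell]}` layout, and the data are hypothetical, and no spatial-variability ranking is computed.

```python
def filter_by_positive_ratio(expression, min_positive_ratio=0.05):
    """Keep genes expressed (count > 0) in at least `min_positive_ratio` of cells.

    `expression` maps gene name -> list of per-cell counts (toy layout, an assumption).
    """
    kept = []
    for gene, counts in expression.items():
        ratio = sum(1 for c in counts if c > 0) / len(counts)
        if ratio >= min_positive_ratio:
            kept.append(gene)
    return kept


def top_genes(expression, top_k=5, return_list=False):
    """Return gene names only when `return_list` is True, otherwise None."""
    genes = filter_by_positive_ratio(expression)[:top_k]
    if return_list:
        return genes
    print("would plot:", genes)  # placeholder for the plotting step
    return None


# Toy data: GENE_B is expressed in 1 of 100 cells (ratio 0.01), below the 0.05 threshold.
expression = {
    "GENE_A": [0, 3, 1, 0],
    "GENE_B": [0] * 99 + [1],
    "GENE_C": [2, 2, 0, 1],
}
names = top_genes(expression, return_list=True)
nothing = top_genes(expression)
```

Callers that rely on the returned list should therefore pass `return_list=True` explicitly, since the default path returns `None`.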

pyproject.toml

+1 -1
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "novae"
-version = "0.0.5"
+version = "0.1.0"
 description = "Graph-based foundation model for spatial transcriptomics data"
 documentation = "https://mics-lab.github.io/novae/"
 homepage = "https://mics-lab.github.io/novae/"
