Loading CosMx dataset #1102

Open
StevenWijnen opened this issue Feb 24, 2025 · 8 comments

@StevenWijnen
StevenWijnen commented Feb 24, 2025

First of all thanks for creating such a comprehensive package and extensive documentation!

I have a question about loading a CosMx dataset with Giotto. When I run the following code:

cosmx <- createGiottoCosMxObject(
    cosmx_dir = data_path,
    FOVs = seq_len(25),
    load_expression = TRUE,
    load_cellmeta = TRUE,
    expression_path = paste0(data_path, "Run5469_S1_exprMat_file.csv"),
    metadata_path = paste0(data_path, "Run5469_S1_metadata_file.csv"),
    verbose = TRUE
)

The expression matrix of the cosmx object remains empty, so I cannot do any analysis using the expression matrix provided. Furthermore, when I follow the CosMx tutorial on the website and assign transcripts using the built-in methods, I get far too many empty cells compared with, for example, loading the same dataset in Seurat.

Could you help me understand why the expression data is not loaded even when I force it by setting load_expression and load_cellmeta?

Thanks in advance for the help!

R.version.string
[1] "R version 4.4.2 (2024-10-31)"

packageVersion("Giotto")
[1] ‘4.2.1’

Giotto::checkGiottoEnvironment()
Giotto can access environment found at:
 '~/miniforge3/envs/giotto_env/bin/python'
 If this is the wrong environment, try specifying `envname` param
 or set option "giotto.py_path" with the desired envname or path
[1] TRUE
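
For the empty-cell comparison with Seurat described above, one way to get a Giotto-independent baseline is to count empty cells directly in the vendor expression matrix. A minimal sketch, assuming the exprMat CSV has one row per cell, identifier columns named `fov` and `cell_ID`, and one column per target. `empty_cell_fraction` is a hypothetical helper, not a Giotto function, and the toy data frame stands in for the real file:

```r
# Hypothetical helper (not part of Giotto): fraction of cells whose total
# count across all targets is zero.
# Assumes the identifier columns are named "fov" and "cell_ID".
empty_cell_fraction <- function(expr, id_cols = c("fov", "cell_ID")) {
    counts <- expr[, setdiff(names(expr), id_cols), drop = FALSE]
    mean(rowSums(counts) == 0)
}

# Toy data standing in for read.csv("<path to exprMat file>")
toy <- data.frame(
    fov = c(1, 1, 1, 2),
    cell_ID = 1:4,
    CCL8 = c(0, 2, 0, 1),
    NegPrb1 = c(0, 0, 0, 1)
)

empty_cell_fraction(toy)
```

If the fraction computed from the file is much lower than what the overlap-based aggregation produces, that points to a coordinate mismatch rather than genuinely empty cells.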


@RubD
Collaborator

RubD commented Feb 24, 2025

@jiajic might be able to help with this issue.

@StevenWijnen can you share which version of the CosMx platform you used to generate the data?

@RubD
Collaborator

RubD commented Feb 24, 2025

@StevenWijnen can you reproduce this problem with a publicly available dataset? Otherwise it's difficult for us to figure out how to fix it.

@StevenWijnen
Author

I am using this dataset: https://datadryad.org/stash/dataset/doi:10.5061/dryad.ksn02v7b1

I am still not able to force the expression matrix to load. However, I think I now know why the functions calculateOverlapRaster and overlapToMatrix result in so many empty cells. When I plot the expression, the transcripts seem to have a weird offset in the y direction (image 1). This is not seen when I plot only the negative probes (image 2).

I think this might be some weird artefact of the dataset. This is why I would like to force the object to just load the expression matrix, as I've seen in Seurat, for example, that it does in fact overlap nicely with the images.

Image

Image

@jiajic
Member

jiajic commented Feb 24, 2025

Hi @StevenWijnen,

Thanks for reporting this!
It seems createGiottoCosMxObject() is not passing the load params for expression, cell metadata, and transcripts to the underlying importer utility. This will be fixed soon.

About the mismatch between images and vector data: this is interesting and hard to solve. In some datasets the outputs seem to be shifted one step vertically, and in others not. The provided data gives no clear indication of which applies. The current default orientations are those compatible with the recent lymph and pancreas CosMx datasets.

For this dataset, you can get around the above issues by using the import utility:

f <- "filepath to unzipped project directory"
x <- importCosMx(f)
x$fovs <- 1:5 # load in subset (optional)

g <- giotto() |>
    setGiotto(x$load_transcripts()) |>
    setGiotto(x$load_polys(shift_vertical_step = 1)) |>
    setGiotto(x$load_expression()) |>
    setGiotto(x$load_cellmeta()) |>
    setGiotto(x$load_images(negative_y = FALSE))

# check data is in the right place (this will take a while to plot if not just a subset)
plot(ext(g))
lapply(g@images, plot, add = TRUE)
plot(g@feat_info$rna, dens = TRUE)
plot(g@spatial_info$cell, border = "magenta", lwd = 0.1, add = TRUE)

Image

Best,
George

@jiajic
Member

jiajic commented Feb 24, 2025

Forgot to mention: the image folders sometimes contain Thumbs.db files. Please remove them or move them elsewhere. They currently interfere with the image loading step, since no other files are expected in the image subdirectories.
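
The Thumbs.db cleanup can be done from R before importing. A minimal sketch using only base R; `find_thumbs_db` and `move_thumbs_db` are hypothetical helpers, not Giotto functions, and the backup directory is assumed to sit outside the project directory:

```r
# Hypothetical helpers (not part of Giotto).
# List every Thumbs.db file below a CosMx project directory.
find_thumbs_db <- function(project_dir) {
    list.files(
        project_dir,
        pattern = "^Thumbs\\.db$",
        recursive = TRUE,
        full.names = TRUE,
        ignore.case = TRUE
    )
}

# Move the files found into backup_dir instead of deleting them.
# Returns (invisibly) the original paths that were moved successfully.
move_thumbs_db <- function(project_dir, backup_dir) {
    hits <- find_thumbs_db(project_dir)
    if (length(hits) == 0L) {
        return(invisible(character(0)))
    }
    dir.create(backup_dir, recursive = TRUE, showWarnings = FALSE)
    # Prefix with an index so files from different folders do not collide
    dest <- file.path(backup_dir, paste0(seq_along(hits), "_", basename(hits)))
    ok <- file.rename(hits, dest)
    invisible(hits[ok])
}
```

Calling `find_thumbs_db(f)` on the project directory first shows what would be moved. Moving rather than deleting keeps the step reversible.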

@StevenWijnen
Author

Hi @jiajic,

Thank you for your thorough and thoughtful response!
A fix for the data loading would be really nice and I'm excited to test it out. Your code snippet for resolving the offset issue worked perfectly, thank you for sharing that! It might also be nice to put this snippet somewhere in the documentation; I spent quite some time looking for a way to set negative_y when calling createGiottoCosMxObject().
The unexpected behavior is puzzling. If I uncover any additional details about its root cause during my testing, I'll be sure to pass them along to assist with troubleshooting.

Thanks again for your support and expertise!

@StevenWijnen
Author

One last note:
The spatInSituPlotPoints(...) still seems to plot the transcripts or the image flipped, whereas the plot code you provided works fine, and the mapping of transcripts now works as well.

@jiajic
Member

jiajic commented Feb 25, 2025

Taking another look, it looks like this dataset follows the legacy data formatting. It should already load correctly if you use

g <- createGiottoCosMxObject("....Run5469_S1", version = "legacy", slice = 1)

We currently only support versions: v6 (which should be the newest output) and legacy which was built around the formatting of the legacy lung NSCLC dataset.

The only differences between the versions are the image negative_y mapping and the columns used for the additional transcript metadata.

It's hard to guess whether a dataset matches a previous version, so in the next update I'll also add an image_negative_y toggle to createGiottoCosMxObject() that will affect image (and mask-image-derived polygon) mapping.
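
To illustrate why the negative_y mapping matters, here is a rough sketch in plain R. It is not Giotto's implementation. It assumes that negative_y = TRUE places an image below y = 0 and negative_y = FALSE places it above; `image_extent` and `in_extent` are hypothetical helpers and the coordinates are toy values:

```r
# Hypothetical illustration (not Giotto's implementation): the spatial
# extent an image of a given pixel size would be assigned.
image_extent <- function(width, height, negative_y = TRUE) {
    if (negative_y) {
        c(xmin = 0, xmax = width, ymin = -height, ymax = 0)
    } else {
        c(xmin = 0, xmax = width, ymin = 0, ymax = height)
    }
}

# TRUE for each (x, y) point that lies inside the extent e
in_extent <- function(x, y, e) {
    x >= e[["xmin"]] & x <= e[["xmax"]] &
        y >= e[["ymin"]] & y <= e[["ymax"]]
}

# Toy transcript coordinates with positive y values
tx_x <- c(10, 50, 90)
tx_y <- c(20, 60, 80)

# With the image mapped to negative y, no transcript lands on it
in_extent(tx_x, tx_y, image_extent(100, 100, negative_y = TRUE))

# With the image mapped to positive y, all transcripts land on it
in_extent(tx_x, tx_y, image_extent(100, 100, negative_y = FALSE))
```

When the image mapping and the transcript coordinates disagree on the sign of y, the two layers end up mirrored or offset from each other, which is consistent with the kind of mismatch reported above.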


I'm surprised about the spatInSituPlotPoints() still being an issue.
I tried using the legacy loading and got this which seems correct.

test <- createGiottoCosMxObject("../../local_items/nanostring/gh_issue/Run5469_S1/",
    FOVs = c(11, 12, 13),
    load_expression = TRUE,
    load_cellmeta = TRUE,
    version = "legacy"
)
spatInSituPlotPoints(test,
    show_image = TRUE,
    image_name = c("composite_fov011", "composite_fov012", "composite_fov013"),
    feats = list(rna = "CCL8"),
    use_overlap = FALSE,
    polygon_alpha = 0,
    polygon_line_size = 0.1
)

Image

Were you loading the single channel images? I haven't taken a closer look at those yet.
