Hi Ashley,
Congrats again on the paper and this beautiful package! :)
I was wondering whether IdentifiHR can be run at single-cell or spot resolution. Basically, are we able to get HRD probabilities at the cell level (single-cell data) or spot level (spatial data)? I know spatial and single-cell data are quite sparse, so pseudobulking is probably the way to go (perhaps at the clone level or cell-type level?), but I had to ask just in case it is possible.
If I use the test data supplied with the package, the processCounts function produces Z-scores nicely centered at 0, and the predictions are a mix of HRD and HRP samples, as expected:
raw counts
processCounts
processCounts boxplots (nicely centered at 0)
HRD probabilities
Now, using spatial data at the spot level, this is what the results look like. I do get a message saying 60.1382488479263% of the 2604 genes required for IdentifiHR are present. One thing that stands out is that the Z-score scaling by processCounts no longer centers the data at 0, but somewhere around 5.
raw counts
processCounts
processCounts boxplots (showing only the first 100 spots for clarity; no longer centered at 0)
HRD probabilities
As you can see, all the spots are called HRD, even though this is an HRP patient. I've tried other samples as well, and all of them are called HRD with high probability when run at the spot level. Perhaps this is not the intended use of IdentifiHR and the counts need to be pseudobulked (i.e. summed, as in the paper). That said, maybe if the scaled values could somehow be centered at 0, it could work?
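To make the two symptoms above concrete, here is a rough sketch of the checks I have in mind, written in plain Python rather than with the package itself. The function names `gene_coverage` and `spot_medians`, the toy gene and spot IDs, and the cut-off of 1.0 are all made up for illustration; none of this is IdentifiHR's API or its actual scaling procedure.

```python
from statistics import median

def gene_coverage(required_genes, present_genes):
    """Percentage of the required genes that appear in the data (hypothetical helper)."""
    required = set(required_genes)
    found = required & set(present_genes)
    return 100.0 * len(found) / len(required)

def spot_medians(scaled):
    """Median scaled value per spot; `scaled` maps spot ID -> list of scaled values."""
    return {spot: median(values) for spot, values in scaled.items()}

# Toy data, invented for illustration only.
required = ["G1", "G2", "G3", "G4", "G5"]
present = ["G1", "G3", "G4", "G9"]
coverage = gene_coverage(required, present)  # 3 of the 5 required genes are present

scaled = {
    "spot_a": [4.0, 5.0, 6.5],   # shifted away from 0, like my spot-level output
    "spot_b": [-1.0, 0.0, 1.0],  # centered at 0, like the package test data
}
medians = spot_medians(scaled)

# Flag spots whose median is far from 0; the 1.0 cut-off is an arbitrary assumption.
off_center = [spot for spot, m in medians.items() if abs(m) > 1.0]
print(coverage, medians, off_center)
```

On my real data, the same kind of check corresponds to the 60.1382488479263% coverage message (which works out to 1566 of the 2604 genes) and to boxplots sitting around 5 instead of 0.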
If I pseudobulk this spatial data to the sample level, the sample is called HRP, which makes sense.
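For clarity, by pseudobulking I simply mean summing the raw counts per gene across all spots that belong to the same group before any scaling. A minimal sketch in plain Python follows, assuming counts are stored as nested dictionaries; the `pseudobulk` function, the spot and gene IDs, and the `sampleA` label are hypothetical, and the grouping could just as well be by clone or cell type.

```python
from collections import defaultdict

def pseudobulk(counts, groups):
    """Sum raw counts per gene over all spots sharing a group label.

    counts: spot ID -> {gene: raw count}
    groups: spot ID -> group label (e.g. sample, clone or cell type)
    """
    bulk = defaultdict(lambda: defaultdict(int))
    for spot, gene_counts in counts.items():
        label = groups[spot]
        for gene, n in gene_counts.items():
            bulk[label][gene] += n
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {label: dict(genes) for label, genes in bulk.items()}

# Toy spot-level counts, invented for illustration; real spots are much sparser.
counts = {
    "spot1": {"G1": 0, "G2": 3},
    "spot2": {"G1": 2, "G2": 0},
    "spot3": {"G1": 1},  # G2 not detected in this spot
}
groups = {"spot1": "sampleA", "spot2": "sampleA", "spot3": "sampleA"}

bulk = pseudobulk(counts, groups)
print(bulk)
```

The summed profile, one column per group, is what I then pass through the usual workflow in place of the individual spots.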
Would really appreciate any thoughts :)
Thanks!
Ahwan