Linseed (LINear Subspace identification for gene Expression Deconvolution) is a package that provides tools and an interface to explore gene expression datasets in linear space.
The current version of this tool is stable and should work as expected. We will continue to maintain it to ensure it remains functional, addressing only critical bugs as needed. However, no new features will be added in this repository.
For enhanced functionality and continued development of deconvolution methods, please refer to our new tool, DualSimplex, which represents the next step in this line of work.
You can install the package using devtools::install_github:
devtools::install_github("ctlab/linseed")
The current build was tested using rhub.
To start working with gene expression data, we need to create a new LinseedObject. In this tutorial we will use GSE19830 (a mixture of liver, brain, and lung), keeping only the mixed samples (10-42) and only the 10,000 most expressed genes.
library(linseed)
lo <- LinseedObject$new("GSE19830", samples=10:42, topGenes=10000)
To build a collinearity network, we first have to evaluate all pairwise collinearity coefficients and all pairwise Spearman correlations, and then run a significance test, which calculates a p-value for each gene by randomly shuffling the network weights.
lo$calculatePairwiseLinearity()
lo$calculateSpearmanCorrelation()
lo$calculateSignificanceLevel(100)
lo$significancePlot(0.01)
lo$filterDatasetByPval(0.01)
## Total number of genes is 10000
## The number of genes after filtering is 3297
lo$svdPlot()
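The significance step above is a permutation test. The sketch below illustrates the general idea in base R on a toy weight matrix; it is not linseed's implementation. The toy data, the choice of row sums as the per-gene statistic, and all variable names are assumptions made for illustration.

```r
set.seed(1)

# Hypothetical toy network: 6 genes, symmetric weights, zero diagonal
n <- 6
w <- matrix(0, n, n)
w[upper.tri(w)] <- runif(n * (n - 1) / 2)
w <- w + t(w)
# Make gene 1 strongly connected to every other gene
w[1, -1] <- w[-1, 1] <- 0.95

# Assumed per-gene statistic: total weight of the gene's edges
observed <- rowSums(w)

# Null distribution: shuffle the edge weights and recompute the statistic
n_perm <- 1000
exceed <- numeric(n)
for (i in seq_len(n_perm)) {
  shuffled <- matrix(0, n, n)
  shuffled[upper.tri(shuffled)] <- sample(w[upper.tri(w)])
  shuffled <- shuffled + t(shuffled)
  exceed <- exceed + (rowSums(shuffled) >= observed)
}

# Empirical p-values with the usual +1 correction so none is exactly zero
pvals <- (exceed + 1) / (n_perm + 1)
print(round(pvals, 3))
```

With more permutations the smallest attainable p-value gets smaller, which is why the number of permutations bounds the thresholds that are meaningful for filtering.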
To visualize what is left after filtering, we can call the projection plot from our object. But we have to project the data onto the simplex first.
lo$setCellTypeNumber(3)
lo$project("full") # projecting full dataset
lo$projectionPlot(color="filtered")
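Projection is meaningful because samples that are mixtures of a few cell types lie close to a low-dimensional linear subspace. A minimal sketch of that idea, assuming simulated data and a plain rank-3 SVD truncation (which may differ from what lo$project does internally); all names here are hypothetical:

```r
set.seed(42)

genes <- 200
samples <- 30
k <- 3  # assumed number of cell types

# Hypothetical pure cell-type profiles (genes x k)
signatures <- matrix(rexp(genes * k), genes, k)

# Hypothetical mixing proportions (k x samples); each column sums to 1
props <- matrix(rexp(k * samples), k, samples)
props <- sweep(props, 2, colSums(props), "/")

# Linear mixing model plus a small amount of noise
noise <- matrix(rnorm(genes * samples, sd = 0.01), genes, samples)
mix <- signatures %*% props + noise

# Fraction of variance carried by each singular value
s <- svd(mix)
var_explained <- s$d^2 / sum(s$d^2)
print(round(var_explained[1:5], 4))

# Rank-k reconstruction and its relative error
projected <- s$u[, 1:k] %*% diag(s$d[1:k]) %*% t(s$v[, 1:k])
rel_err <- norm(mix - projected, "F") / norm(mix, "F")
print(rel_err)
```

In this simulation nearly all variance sits in the first three components, which is the kind of pattern one would look for when choosing the number of cell types.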
To deconvolve the dataset, you first have to project it (full or filtered) onto the simplex and then find the corners of the simplex.
lo$project("filtered")
lo$smartSearchCorners(dataset="filtered", error="norm")
## Final vector is
## 4 1 5
##
lo$deconvolveByEndpoints()
plotProportions(lo$proportions)
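Once pure profiles are known (for example, from the simplex corners), proportions can be estimated by solving a linear system for each sample. The following is a sketch under stated assumptions: simulated noise-free data, ordinary least squares via qr.solve, and a simple clip-and-renormalize step. It is not necessarily what lo$deconvolveByEndpoints does.

```r
set.seed(7)

genes <- 100
samples <- 10
k <- 3

# Hypothetical pure profiles and true proportions
signatures <- matrix(rexp(genes * k), genes, k)
true_props <- matrix(rexp(k * samples), k, samples)
true_props <- sweep(true_props, 2, colSums(true_props), "/")

# Mixed samples under the linear model
mix <- signatures %*% true_props

# Least-squares estimate of the proportions for all samples at once
est <- qr.solve(signatures, mix)

# Clip negatives and renormalize so each column sums to 1
est[est < 0] <- 0
est <- sweep(est, 2, colSums(est), "/")

max_err <- max(abs(est - true_props))
print(max_err)
```

On real, noisy data a constrained solver (non-negative least squares, for instance) is usually a better fit than clipping after the fact.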
We can also use t-SNE to get an idea of how the data looks when dimensionally reduced.
# let's select the 100 genes closest to the simplex corners
lo$selectGenes(100)
lo$tsnePlot()
To compare with the actual proportions, you can use the dotPlotProportions function:
data("proportionsLiverBrainLung")
dotPlotPropotions(lo$proportions, proportionsLiverBrainLung[, 10:42], guess=TRUE)
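Because deconvolution returns cell types in arbitrary order, a numeric comparison needs to match predicted rows to known ones first. The sketch below shows one way to do that in base R, by picking the best-correlated known cell type for each predicted row. The data are simulated and the matching rule is an assumption; how the guess argument works internally is not described here.

```r
set.seed(3)

# Hypothetical known proportions (cell types x samples)
actual <- matrix(runif(3 * 8), 3, 8)
actual <- sweep(actual, 2, colSums(actual), "/")
rownames(actual) <- c("Liver", "Brain", "Lung")

# Hypothetical predictions: same rows in a different order, plus small noise
predicted <- actual[c(3, 1, 2), ] + matrix(rnorm(24, sd = 0.01), 3, 8)
rownames(predicted) <- paste("Cell type", 1:3)

# Correlate every predicted row with every known row across samples
cors <- cor(t(predicted), t(actual))

# For each predicted row, take the known cell type with the highest correlation
best <- apply(cors, 1, which.max)
matched <- rownames(actual)[best]
print(data.frame(predicted = rownames(predicted), matched = matched))

# Root-mean-square error after matching
rmse <- sqrt(mean((predicted - actual[best, ])^2))
print(rmse)
```

Greedy matching like this can assign two predicted rows to the same cell type when the estimates are poor; checking that the matches are unique is a sensible safeguard.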