New Evaluating Reference Data for Bulk RNA Deconvolution tutorial #5549
base: main
…m/hexhowells/training-material into deconvolution-evaluation-tutorial
topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md
> - {% icon param-collection %} *"Expression Data"*: `Expression Data`
>
> {% snippet faqs/galaxy/workflows_run.md %}
> 3. Add a tag labelled `#A` to the first "Actual cell proportions" and "Pseudobulk" collections
I am not completely sure, but I feel like the term "actual cell proportions" might be a little misleading. The cell proportions, as indicated by proportional representation in the single-cell data, often differ from the true in vivo cell type proportions because of systematic dropout biases during data collection. This might be worth mentioning, or a different term that doesn't use "actual" could be substituted.
It's the actual cell proportions for the single-cell data, which is the closest we can get to knowing the true cell proportions for any data. I think it's probably the cleanest name to use, but I will add a section to mention that these won't be a perfect representation of real cell proportions in vivo.
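For readers following this thread, a minimal sketch of what "actual cell proportions for the single-cell data" means in practice: the fraction of annotated cells per cell type. The function name `cell_type_proportions` and the labels are hypothetical, chosen for illustration; this is not the code the workflow runs.

```python
from collections import Counter


def cell_type_proportions(labels):
    """Return the fraction of cells assigned to each cell type label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cell labels given")
    return {cell_type: n / total for cell_type, n in counts.items()}


# Hypothetical annotation for ten cells (illustrative only, not tutorial data)
labels = ["alpha"] * 5 + ["beta"] * 3 + ["delta"] * 2
proportions = cell_type_proportions(labels)
print(proportions)
```

These fractions describe the cells that were captured and annotated, so any dropout bias during data collection carries straight through to them.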
topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md
> >
> > ![Scatter plot comparison](../../images/bulk-deconvolution-evaluate/scatterplot-compare.png "Scatter plot comparison between Music and NNLS")
> >
> > 1. Comparing scatter plots, the MuSiC tool has the most accurate results since the points fall closer onto the x=y line
Imagine the case where the NNLS deconvolution more closely resembled the cell proportions in the real, biological context, while MuSiC more accurately recapitulated the proportions from the single-cell data. Which of these two methods is really more accurate, then?
At their core, deconvolution tools are trying to determine the cell proportions of some bulk RNA data, which would ideally represent the biological sample accurately. So the best that any tool can do is measure the data it's been given, since any errors in the sequencing won't be known.
Realistically, in that case NNLS would be the more accurate tool, but without knowing the true cell proportions of the biological sample (which would kind of render deconvolution tools useless), the best we can do is assume pseudobulks from single-cell data are a good representation of actual bulk data. In which case it's probably safe to assume that MuSiC would be more accurate.
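One way to make "points fall closer to the x=y line" concrete is to score each tool's predicted proportions against the pseudobulk reference proportions. The sketch below uses root-mean-square error and Pearson correlation; the names `tool_a` and `tool_b` and all numbers are hypothetical and are not results from MuSiC or NNLS.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square distance of predictions from the x=y line."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(actual, predicted):
    """Pearson correlation; measures linear agreement, not closeness to x=y."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_a * var_p)


# Hypothetical proportions for three cell types (illustrative only)
actual = [0.5, 0.3, 0.2]
tool_a = [0.48, 0.32, 0.20]
tool_b = [0.30, 0.30, 0.40]

for name, predicted in [("tool_a", tool_a), ("tool_b", tool_b)]:
    print(name, round(rmse(actual, predicted), 4), round(pearson(actual, predicted), 4))
```

RMSE is the more direct measure here, because a tool whose predictions are consistently scaled or shifted can still correlate well while sitting far from the x=y line. Either score only says how well a tool recovers the reference proportions, which is the limitation raised in this thread.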
Co-authored-by: Saskia Hiltemann <[email protected]>
topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md
Co-authored-by: Saskia Hiltemann <[email protected]>
@@ -507,6 +507,10 @@ Camila-goclowski:
email: [email protected]
linkedin: camila-goclowski

carloscheemendonca:
name: Carlos Chee Mendonça
joined: 2025-01
@carloscheemendonca please feel free to edit or add more information about yourself to this entry as you see fit
…m/hexhowells/training-material into deconvolution-evaluation-tutorial
the tests are still failing because the name of this file is expected to end in `-test.yml`, so just renaming like `deconv-eval-stage-1-create-data-test.yml` should fix that. And thanks for adding the testing!
Ah yeah I renamed them before uploading and forgot to add that back in.
Also, the `deconv-eval-stage-1-create-data_child` workflow is a sub-workflow used in the `deconv-eval-stage-1-create-data` workflow. I'm not sure how I should add testing for that or if it's even needed here since I would guess it's part of the parent workflow?
@hexhowells no, that shouldn't be necessary for the subworkflow.
Also, my bad, it should be `-tests.yml` (with the s).
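A quick local check for the naming convention described above could look like the following. It is a hypothetical helper, not the repository's own test runner or lint script, and it assumes only what this thread states: workflow test files must end in `-tests.yml`.

```python
from pathlib import Path

# Suffix taken from the review comment above
TEST_SUFFIX = "-tests.yml"


def missing_test_suffix(filenames):
    """Return the file names that do not end in the expected test suffix."""
    return [name for name in filenames if not Path(name).name.endswith(TEST_SUFFIX)]


# Hypothetical candidate names for the workflow test file
candidates = [
    "deconv-eval-stage-1-create-data-tests.yml",
    "deconv-eval-stage-1-create-data-test.yml",
    "deconv-eval-stage-1-create-data.yml",
]
print(missing_test_suffix(candidates))
```

Only the first candidate passes; the singular `-test.yml` form is reported along with the unsuffixed name.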
New tutorial on evaluating reference data for bulk RNA deconvolution tools, evaluating both MuSiC and NNLS deconvolution tools within Galaxy.