- Enabled bcbio integration support by modifying the `bcbioRNASeq()` constructor to work with minimal bcbio test data.
- Switched from internal usage of `aes_()` in favor of consistent usage of `aes_string()`. This will make the transition to [ggplot2][] v2.3.0 easier in a future update.
- `aggregateReplicates()` support has been added back. This function returns a `RangedSummarizedExperiment` instead of a `bcbioRNASeq` object, containing only an aggregate raw counts matrix in the `counts` slot of `assays()`.
- The functional analysis [R Markdown][] template has been reworked to use `dds_file` and `organism` as new parameter arguments. We've reduced the number of parameters required here to run [clusterProfiler][].
- Made `alphaSummary()` defunct for `bcbioRNASeq` object, in favor of `DESeqDataSet` only. This function is only useful when a proper design formula has been defined.
- `metrics()` now contains an informative error for datasets that were analyzed using the `fast-rnaseq` bcbio pipeline.
- `DESeqDataSet` coercion from `bcbioRNASeq` object no longer attempts to run the `DESeq()` command, which was unnecessary; skipping it improves speed.
- `bcbioSingleCell()` constructor now supports `censorSamples` parameter. This is useful for removing known poor quality samples upon loading.
- [ggplot2][] color and fill palettes are now set to `NULL` in the quality control functions. This behavior doesn't change the appearance of the plot colors, which will still default to `ggplot2::scale_colour_hue()` or `ggplot2::scale_fill_hue()`. The upcoming [ggplot2][] v2.3.0 update supports global options for color and fill palettes, so these parameters may be deprecated in a future release.
- Reworked the internal code for `topTables()`.
- Added [macOS][] testing to [Travis CI][] build checks.
- Fixed [clusterProfiler][] compilation error on [Travis CI][] by installing `libudunits2-dev` (Linux).
- Now recommending variance stabilizing transformation (`vst`) over `rlog` counts by default in plots, where applicable.
- Tweaked Rory's biotype plots in QC report to match formatting conventions in the package. These plots are now colored.
- Added `plotDEGPCA()` to default differential expression R Markdown template.
- `colData()` factors are correctly releveled upon object subset with `[`. This helps avoid unwanted downstream errors when creating a `DESeqDataSet` and running differential expression with DESeq2.
- Recommending `facet` return method by default for `plotGene()`. Updated the working example to reflect this.
- `metrics()` now returns `interestingGroups` column.
- `sample` label has been removed from axis title for QC plot functions.
- Now using shared ggplot2 convenience functions from bcbioBase 0.2.10: `bcbio_geom_abline()`, `bcbio_geom_label()`, and `bcbio_geom_label_repel()`. These are also used by bcbioSingleCell for improved graphical consistency.
- Removed unused internal legacy ggplot2 code.
- Increased DEGreport, DESeq2, and tximport dependency requirements.
- Split out assertive imports so we can pin on [bioconda][].
- Improved package documentation.
- Improved label consistency in `plotPCA()` functions to match `plotMeanAverage()` and `plotVolcano()`.
- Improved automatic title labeling in `plotDEGPCA()`, matching the other DEG functions. Also added directionality to `plotDEGPCA()`.
- Added DESeqDataSet method support to `plotCorrelationHeatmap()`, using the normalized counts.
- `resultsTables()` now writes local files to `tempdir()` when Dropbox mode is enabled using `dropboxDir`.
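
The entry above about `colData()` factors being releveled when an object is subset with `[` can be illustrated with base R alone. This is a minimal sketch and not the package's internal code: the `samples` data frame and its `treatment` column are hypothetical, and `droplevels()` is used here as a stand-in for whatever the subset method does internally.

```r
# Hypothetical sample metadata; names are illustrative only.
samples <- data.frame(
    treatment = factor(c("control", "drugA", "drugB", "control")),
    row.names = c("s1", "s2", "s3", "s4")
)

# Plain subsetting keeps the now-unused "drugB" level.
kept <- samples[c("s1", "s2", "s4"), , drop = FALSE]
levels(kept$treatment)

# Dropping unused levels leaves only the groups still present.
releveled <- droplevels(kept)
levels(releveled$treatment)
```

Comparing the two `levels()` calls shows the stale level that remains after plain subsetting and disappears after releveling.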
Last set of code fixes before F1000v2 resubmission.
- Added `rle` return support for `counts()`, which are calculated on the fly.
- Added `transgeneNames` and `spikeNames` support to `loadRNASeq()` function.
- `loadRNASeq()` now supports `organism = NULL` again, for datasets with poorly annotated genomes.
- Primary `assay()` containing raw counts is now named `counts` instead of `raw`, for consistency with other `SummarizedExperiment` objects (e.g. `DESeqDataSet`) and the bcbioSingleCell S4 class definition.
- Improved internal code for `plotGene()` and `plotGenderMarkers()`.
- Improved AppVeyor CI support to test against bioc-devel using R 3.5.
- Improved support and unit testing for `updateObject()` method.
- DESeq2 normalized counts are always slotted in `assays()`, even when rlog and vst transformations are skipped.
- Exporting `[[<-`, `assays<-`, `colData<-`, `interestingGroups<-`, and `metadata<-` assignment methods, to avoid unwanted coercion to `SummarizedExperiment`. Objects extending `RangedSummarizedExperiment` shouldn't be doing this, so we may need to file a bug report with Bioconductor or check our class definition in the package.
- Now importing specific functions from S4Vectors and methods rather than importing everything.
- Switched back to using `stop()`, `warning()` and `message()` rather than the alternate rlang functions `abort()`, `warn()`, and `inform()`.
- Objects with invalid metadata now print which slots are invalid to the console.
- `bcbioRNASeq` S4 class object is now extending `RangedSummarizedExperiment` instead of `SummarizedExperiment`. Consequently, the row annotations are now stored in the `rowRanges` slot as `GRanges` class, instead of in the `rowData` slot as a `DataFrame`. The `rowData()` accessor still works and returns a data frame of gene/transcript annotations, but these are now coerced from the internally stored `GRanges`. The `GRanges` object is acquired automatically from [Ensembl][] using `basejump::ensembl()`. By default, `GRanges` are acquired from [Ensembl][] using AnnotationHub and [ensembldb][]. Legacy GRCh37 genome build is supported using the [EnsDb.Hsapiens.v75][] package.
- `assays()` now only slots matrices. We've moved the [tximport][] data from the now defunct `bcbio()` slot to assays. This includes the `lengths` matrix from [tximport][]. Additionally, we are optionally slotting [DESeq2][] variance-stabilized counts (`"rlog"`, `"vst"`). [DESeq2][] normalized counts and [edgeR][] TMM counts are calculated on the fly and no longer stored inside the `bcbioRNASeq` object.
- `colData()` now defaults to returning as `data.frame` instead of `DataFrame`, for easy piping to [tidyverse][] functions.
- `bcbio()` slot is now defunct.
- FASTA spike-ins (e.g. EGFP, ERCCs) can be defined using the `isSpike` argument during the `loadRNASeq()` data import step.
- Melted counts are now scaled to log2 in the relevant quality control functions rather than using log10. This applies to `plotCountsPerGene()` and `plotCountDensity()`. Note that we are subsetting the nonzero genes as defined by the raw counts here.
- Simplified internal `tximport()` code to no longer attempt to strip transcript versions. This is required for working with C. elegans transcripts.
- Minimal working example dataset is now derived from GSE65267, which is also used in the F1000 paper.
- Added `as(object, "DESeqDataSet")` coercion method support for `bcbioRNASeq` class. This helps us set up the differential expression analysis easily.
- `counts()` function now returns [DESeq2][] normalized counts (`normalized = TRUE`) and [edgeR][] TMM counts (`normalized = "tmm"`) on the fly, as suggested by the F1000 reviewers.
- Design formula can no longer be slotted into `bcbioRNASeq` object, since we're not stashing a `DESeqDataSet` any more.
- Updated Functional Analysis R Markdown template.
- `validObject()` is now required for all plotting functions. This check is also called in the R Markdown template. Legacy objects can be updated using `updateObject()`.
- `metrics()` now returns columns sorted alphabetically.
- Added `contrastName()` as a generic function.
- `plotDEGHeatmap()` and `plotDEGPCA()` generics no longer have `counts` defined in the signature. The `counts` argument is now only defined in the methods.
- `prepareRNASeqTemplate()` has been converted from a generic to a standard function.
- Improved `metadata()` validity checks.
- `plotCorrelationHeatmap()` matrix method has been moved to basejump package, for improved consistency with the other heatmap code.
- `plotGenderMarkers()` internal code has been reworked to match `plotGene()`.
- Default `plotMA()` appearance has changed, providing a line at the 0 y-intercept, similar to `DESeqDataSet` method.
- Internal example datasets have been renamed (e.g. `bcb_small` instead of `bcb`).
- Added AppVeyor CI support for code testing on Windows.
- Made Travis CI checks stricter, added `BiocCheck()`.
- Internal `.sampleDirs()` code is now exported in bcbioBase as a generic.
- `gene2symbol()` and `interestingGroups()` method support are now defined for `SummarizedExperiment` in the bcbioBase package.
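
The switch from log10 to log2 for melted counts, applied only to nonzero genes as defined by the raw counts, can be sketched in base R. This is an illustration under stated assumptions rather than the package's implementation: the `counts` matrix is made up, the column names of `melted` are hypothetical, and the pseudocount of 1 is an assumption that the changelog does not specify.

```r
# Hypothetical raw counts: 3 genes x 2 samples.
counts <- matrix(
    c(0, 0,
      8, 32,
      1, 3),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("gene1", "gene2", "gene3"), c("s1", "s2"))
)

# Keep genes with at least one nonzero raw count.
nonzero <- counts[rowSums(counts) > 0, , drop = FALSE]

# Melt to long format: one row per gene/sample pair.
melted <- as.data.frame(as.table(nonzero), stringsAsFactors = FALSE)
colnames(melted) <- c("geneID", "sampleID", "counts")

# Scale to log2; the pseudocount of 1 is an assumption for this sketch.
melted$log2Counts <- log2(melted$counts + 1)
```

Filtering on the raw counts before melting keeps all-zero genes out of the plotted distribution.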
- Use `updateObject()` in combination with the `rowRanges` argument, which requires a `GRanges` object. `GRanges` can be obtained from [Ensembl][] using the `basejump::ensembl()` function or the [ensembldb][] package.
- `bcbio()` slot is now defunct, since we have moved all data into the `SummarizedExperiment` container.
- Deprecated `plot5x3Bias()` in favor of `plot5Prime3PrimeBias()`. The new name makes it clearer what the function plots.
- `flatFiles()` has been deprecated in favor of `as(object, "list")` coercion method. See bcbioBase package for `SummarizedExperiment` method support.
- Defunct: `design()`, `download()`, `meltLog10()`, `txi()`.
- Legacy `bcbioRNADataSet` method support has been removed.
- Bug fix for `gene2symbol` argument not renaming rows in `plotDEGHeatmap()`.
- Bug fix for `[` subset method dropping metrics in metadata.
- Simplified unit testing for Dropbox mode enabled in `resultsTables()`.
- Bug fix for gene-to-symbol mappings in `plotDEGHeatmap()`.
- Added support for quickly plotting differentially expressed genes (DEG) in a PCA plot with `plotDEGPCA()`.
- Added support for Dropbox shared links to `resultsTables()`, for use with the [Stem Cell Commons][] database.
- Added assert checks internally for all functions.
- Improved internal code for `plotGene()` and `plotGenderMarkers()` to run faster.
- Deprecated data frame methods based on metrics for QC functions.
- Import shared dependency functions from bcbioBase instead of basejump.
- Added method support for `selectSamples()`.
- `organism` and `genomeBuild` parameters are now user-definable in the main `loadRNASeq()` import function.
- Fixed gene subsetting method on S4 object, which handles genes using `intersect()` in the featureCounts matrix.
- Removed internal `aggregateReplicates()` code. This needs to be reworked and added back in a future release.
- Improved method for handling a missing normalized counts matrix in the assays slot. This can occur when the user opts to skip the CPU-intensive DESeq2 normalizations.
- Improved internal code for the quality control functions. Improved the `if` statements to be more class specific.
- Renamed `plotCorrelationHeatmap()` `transform` argument to `normalized`, for consistency with the `counts()` generic.
- Added `title` support to plots, where applicable.
- Updated internal code for `plotDEGHeatmap()`.
- Updated internal marker handling code for `plotGenderMarkers()`.
- `resultsTables()` function now defaults to `summary = TRUE`.
- Migrated all basejump function imports to bcbioBase package.
- Combined examples (`bcb`, `dds`, `res`, etc.) into a single `examples` object. This helps avoid accidental use of example `bcb` in an analysis.
- Moved ggplot imports from `internal-ggplot.R` to above each function.
- Renamed `maxSamples` parameter in `loadRNASeq()` to `transformationLimit`. If there are more samples than this limit, then the DESeq2 transformations will be skipped. In this case, `rlog` and `vst` will not be slotted into `assays()`.
- Added a colData sanitization step in `loadRNASeq()` to ensure rows are in the same order as the columns in the counts matrix. Otherwise, DESeq will report an error at the `DESeqDataSetFromTximport()` step. We're also ensuring the factor levels get updated here.
- Now using `glimpse()` instead of `str()` in examples, where applicable.
- Added `colData<-` assignment method support. This requires a `DataFrame` class object. Upon assignment, the internal colData at `bcbio(object, "DESeqDataSet")`, `assays(object)[["rlog"]]` and `assays(object)[["vst"]]` are also updated to match.
- Initial assignment support for `design`, which will update the internal DESeqDataSet.
- Added method support for `gene2symbol()` generic, which will now return a 2 column `data.frame` with `ensgene` and `symbol` columns. This is helpful for downstream gene to symbol mapping operations.
- Added working example for `interestingGroups<-` in the documentation.
- Added some code to improve factor releveling, where applicable. See `internal-meltLog10.R` for example.
- Now explicitly defining the custom color palettes (e.g. `viridis::scale_fill_viridis(discrete = TRUE)`). This makes it clearer to the user in the documentation where these palettes are located.
- Improved axis label support in `plotGene()`.
- `plotHeatmap()` now uses internal `gene2symbol` mappings from stashed annotable, instead of always querying Ensembl. The user can define custom mappings with the `gene2symbol` argument, if desired.
- `plotPCA()` now supports custom color palettes. The `shapes` parameter has been removed because it doesn't work well and is limited to datasets with few samples. This behavior matches the PCA functionality in DESeq2.
- Improved internal code for `plotVolcano()`. Added support for `gene2symbol` argument, like in `plotHeatmap()`. If left missing, the function will query Ensembl for the gene2symbol mappings. We're now using `data` instead of `stats` as the main data source.
- Improved legibility of subset method code.
- Added some additional reexports, which are used for the package documentation and website.
- Simplified legacy object coercion method code.
- Updated Bioconductor installation method code. We're now using the `dependencies` argument, which allows for automatic install of suggested packages along with imports.
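
The colData sanitization step described above (sample metadata rows reordered to match the columns of the counts matrix, with factor levels refreshed) might look roughly like the following in base R. This is a hedged sketch, not the code inside `loadRNASeq()`: the `counts` matrix, the `sampleData` data frame, and the `group` column are all hypothetical.

```r
# Hypothetical counts matrix: genes in rows, samples in columns.
counts <- matrix(
    1:6,
    nrow = 2,
    dimnames = list(c("gene1", "gene2"), c("s1", "s2", "s3"))
)

# Hypothetical sample metadata, deliberately in a different order.
sampleData <- data.frame(
    group = factor(c("b", "a", "a")),
    row.names = c("s3", "s1", "s2")
)

# Fail early if the two objects describe different samples.
stopifnot(setequal(rownames(sampleData), colnames(counts)))

# Reorder metadata rows to match the counts columns.
sampleData <- sampleData[colnames(counts), , drop = FALSE]

# Refresh factor levels so that no stale levels remain.
sampleData <- droplevels(sampleData)
```

After this step, `rownames(sampleData)` and `colnames(counts)` are in the same order, which is the alignment the entry describes.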
- Updated package imports to match Bioconductor 3.6.
- Added support for interesting groups assignment with `interestingGroups<-`.
- Renamed `plotGeneHeatmap()` to simply `plotHeatmap()`.
- Added gender marker support for Homo sapiens.
- Improved support for multiple interesting groups in quality control plots. Now interestingGroups is defined as a column in the metrics data.frame that is used to specify the plot color/fill. This matches the convention in the bcbioSingleCell 0.0.22 update.
- Sample metadata columns are now consistently set as factors.
- Added support for coloring of multiple interesting groups in quality control plots.
- Updated version and author information to match the F1000 Research workflow.
- Added an `f1000v1` branch containing the reproducible code used to generate the figures in our workflow.
- Modified `plotMA()` to support vertical or horizontal layout return. Also added an argument to remove the color legend, which is typically not that informative.
- Added custom color palette support to the quality control functions.
- Upgrading from `bcbioRNADataSet` (< 0.1.0) to `bcbioRNASeq` class object is now possible using `as()` coercion method.
- Object oriented methods are now restricted to use `bcbioRNASeq` object. Legacy `bcbioRNADataSet` objects must be upgraded to `bcbioRNASeq` class.
- Added support for output of unstructured data inside `bcbioRNASeq` S4 object using `flatFiles()` function.
- Added `bcbioRNASeq` method support for `annotable()` generic.
- Renamed `bcbioRNADataSet` S4 class to `bcbioRNASeq`. This matches the naming conventions in the bcbioSingleCell package.
- Renamed `loadRNASeqRun()` to simply `loadRNASeq()`.
- Switched `loadRNASeq()` from using S4 dispatch to a standard function.
- Added a parameter argument to `loadRNASeq()` that enables request of a specific Ensembl release version for gene annotations.
- Renamed `interestingGroup` argument in quality control functions to `interestingGroups` for better consistency.
- Improved handling of sample metrics in `plotPCACovariates()`.
- Added functional analysis R Markdown template.
- Offloaded some core functionality shared between [bcbioRNASeq][] and bcbioSingleCell to the basejump package. This included some code to handle sample metadata YAML and file loading. This helps provide a consistent experience across both packages.
- Renamed package from bcbioRnaseq to bcbioRNASeq.
- Improved website appearance.
- Added [viridis][] color palette support to quality control functions.
- Improved subset operations on `bcbioRNADataSet` object.
- Fixed setup chunk loading of `bcbioRNADataSet` in differential expression [R Markdown][] template.
- Added S4 methods support for plots, allowing the user to use either `bcbioRNADataSet` or a metrics `data.frame` and manual `interesting_group` declaration for visualization.
- Migrated function and variable names from `snake_case` to `camelCase`.
- Offloaded small RNA functionality to a separate package named bcbioSmallRNA.
- Reworked [R Markdown][] templates to improve YAML defaults and add more comments.
- Modified default path variables in `setup.R` to use `*_dir` instead of `*_out`.
- Updated NEWS file to use [Markdown][] syntax.
- Slotted `DESeqDataSet` using `design = formula(~1)` for quality control. This enables automatic generation of `rlog` and `vst` transformed counts.
- Documentation fixes and website updates.
- Renamed S4 class from `bcbioRnaDataSet` to `bcbioRNADataSet` (case sensitive).
- Adjusted the number of exported functions.
- Added [testthat][] checking with [lintr][].
- Initial setup of code coverage using [covr][].
- Prepared draft of [F1000][] workflow document.
- Added [Travis-CI][] support for automatic rendering of quality control report.
- `bcbioRnaDataSet` S4 definition updates.
- Updates to `plot_pca()` and gene-level heatmaps.
- Simplified count pooling functions.
- Reduced number of exports and improved documentation.
- Draft migration of bcbio run object into S4 `bcbioRnaDataSet`.
- Created a new variant of `load_run()` that saves to S4 object instead of list.
- Reworked and re-organized internal functions.
- Defaulted to loading run using project summary YAML file.
- Initial commit of [R Markdown][] templates (e.g. quality control).
- Added support for dynamic file downloads from [HBC][] website.
- Draft build of website using `pkgdown::build_site()`.
- Improved [RDAVIDWebService][] utility functions to work with [dplyr][] 0.6.0.
- Reworked metadata and summary metrics functions to obtain information from `project-summary.yaml` saved in the final run directory.
- Reduced number of dependencies.
- Initial commit of modified volcano plot from CHBUtils package.
- Internal code updates for upcoming [dplyr][] 0.6.0/[tidyeval][] update.
- Updated [Ensembl][] biomaRt annotations to use live site, currently release 88.
- Renamed `import_*` functions to `read_*`.
- Consolidated NAMESPACE imports.
- Defaulted to writing count matrices with gzip compression, to save disk space.
- Renamed internal parameters for better readability.
- Improved documentation and consolidated functions by group.
- NAMESPACE simplification using basejump package.
- Reworked handling of plots and tables during knits.
- Initial commit of differential expression and gene set enrichment functions.
- Added bcbio object integrity checks.
- Improved detection and handling of lane split samples.
- Reworked functions to utilize bcbio list object.
- Added plotting functions.
- Start of package development.
- Initial draft release supporting automatic loading of bcbio run data.
[clusterProfiler]:

[covr]: https://github.com/jimhester/covr
[dplyr]: http://dplyr.tidyverse.org
[edgeR]: https://doi.org/doi:10.18129/B9.bioc.edgeR
[Ensembl]: http://www.ensembl.org
[ensembldb]: https://doi.org/doi:10.18129/B9.bioc.ensembldb
[F1000]: https://f1000.com
[HBC]: http://bioinformatics.sph.harvard.edu
[lintr]: https://github.com/jimhester/lintr
[macOS]: https://www.apple.com/macos
[Markdown]: https://daringfireball.net/projects/markdown/syntax
[RDAVIDWebService]: https://doi.org/doi:10.18129/B9.bioc.RDAVIDWebService
[R Markdown]: http://rmarkdown.rstudio.com
[Stem Cell Commons]: http://stemcellcommons.org
[testthat]: https://github.com/hadley/testthat
[tidyeval]: http://dplyr.tidyverse.org/articles/programming.html
[tidyverse]: https://www.tidyverse.org/
[tximport]: https://doi.org/doi:10.18129/B9.bioc.tximport
[Travis-CI]: https://travis-ci.org
[viridis]: https://cran.r-project.org/web/packages/viridis/index.html