SURF

The Statistical Utility for RBP Functions (SURF) is an integrative analysis framework to identify alternative splicing (AS), alternative transcription initiation (ATI), and alternative polyadenylation (APA) events regulated by individual RNA-binding proteins (RBPs) and to elucidate the protein-RNA interactions governing these events. We used SURF to analyze data for 104 RBPs (K562 cells, available from ENCODE).

A detailed vignette is available here.

Installation

You can install the development version of surf from GitHub with:

# install.packages("devtools")
devtools::install_github("fchen365/surf")

What can you do with SURF?

SURF is versatile in handling analyses centered on alternative transcriptional regulation (ATR) events. Depending on the data you provide, there are four things you can do with SURF.

#	Data	Format	Task
1	genome annotation	any (gtf, gff, …)	parse ATR events
2	+ RNA-seq	alignment (bam)	detect differential ATR events
3	+ CLIP-seq	alignment (bam)	detect functional association
4	+ external RNA-seq	summarized table	differential transcriptional activity

SURF Pipeline

— One task at one call

The four tasks of the SURF pipeline are streamlined. Once you have the data in hand (see the following sub-section), each task can be performed with a single function call:

library(surf)

event <- parseEvent(anno_file)                              # task 1
drr <- drseq(event, rna_seq_sample)                         # task 2
far <- faseq(drr, clip_seq_sample)                          # task 3
dar <- daseq(far, getRankings(exprMat), ext_sample)         # task 4

Here, anno_file, rna_seq_sample, clip_seq_sample, and ext_sample are data descriptions, and exprMat is a table of external transcriptome quantification (e.g., TCGA, GTEx, …).

— Tell `surf` about your data

Describing your data should be easy. Simply follow the example below.

For task 1, the path to the annotation file will do.

anno_file <- "gencode.v24.annotation.filtered.gtf"

For task 2, surf needs to know where the alignment files (bam) are and the experimental condition for differential analysis (e.g., RBP “knock-down” and “wild-type” control).

rna_seq_sample <- data.frame(
  row.names = c('sample1', 'sample2', 'sample3', 'sample4'),
  bam = paste0("rna-seq/bam/sample", 1:4, ".bam"),
  condition = c('knock-down', 'knock-down', 'wild-type', 'wild-type'),
  stringsAsFactors = FALSE
)
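Before running task 2, it can help to sanity-check the sample table. The sketch below is not part of surf: `check_samples` is a hypothetical helper, and the requirements it enforces (a `bam` column, a `condition` column, exactly two conditions) are assumptions read off the example above rather than taken from the package documentation.

```r
# Hypothetical helper (not a surf function): basic checks on a sample table.
check_samples <- function(samples, check_files = TRUE) {
  # Both columns appear in the README examples.
  stopifnot(is.data.frame(samples),
            all(c("bam", "condition") %in% colnames(samples)))
  # Assumption: a differential comparison contrasts exactly two groups.
  n_cond <- length(unique(samples$condition))
  if (n_cond != 2) {
    stop("expected 2 conditions, found ", n_cond)
  }
  # Optionally warn about alignment files that are not on disk.
  if (check_files) {
    missing_bam <- samples$bam[!file.exists(samples$bam)]
    if (length(missing_bam) > 0) {
      warning("missing bam files: ", paste(missing_bam, collapse = ", "))
    }
  }
  invisible(samples)
}

# Toy table mirroring the layout above; the paths are placeholders.
toy_sample <- data.frame(
  row.names = c("sample1", "sample2", "sample3", "sample4"),
  bam = paste0("rna-seq/bam/sample", 1:4, ".bam"),
  condition = c("knock-down", "knock-down", "wild-type", "wild-type"),
  stringsAsFactors = FALSE
)
check_samples(toy_sample, check_files = FALSE)
```

With `check_files = TRUE` (the default), the helper only warns about missing files instead of stopping, so the table can be drafted before the alignments are in place.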

Similarly for task 3, surf needs to know where the alignment files (bam) are and the experimental condition (e.g., “IP” and the input control “SMI”).

clip_seq_sample <- data.frame(
  row.names = c('sample5', 'sample6', 'sample7'),
  bam = paste0('clip-seq/bam/', 5:7, '.bam'),
  condition = c('IP', 'IP', 'SMI'),
  stringsAsFactors = FALSE
)

Finally, for task 4, surf assumes that you have transcriptome quantification summarized in a table exprMat, whose rows correspond to genomic features (e.g., genes, transcripts, …) and whose columns correspond to samples. You can use whichever measure you prefer (e.g., TPM, RPKM, …). Then, let surf know the sample group (condition) of each column:

ext_sample <- data.frame(
  row.names = colnames(exprMat),
  condition = rep(c('TCGA', 'GTEx'), c(173, 337))
)
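To make the expected shape concrete, here is a toy sketch with made-up numbers (3 features, 4 samples). The names `toy_expr` and `toy_ext`, along with the feature and sample labels, are placeholders for illustration, not real TCGA or GTEx identifiers. The point is only that the row names of the sample table should line up with the column names of the expression table.

```r
set.seed(1)

# Toy expression matrix: rows are features, columns are samples (values are made up).
toy_expr <- matrix(
  rexp(12),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("s", 1:4))
)

# One row per column of the matrix, with its group label.
toy_ext <- data.frame(
  row.names = colnames(toy_expr),
  condition = rep(c("TCGA", "GTEx"), c(2, 2))
)

# The sample table should describe exactly the columns of the matrix, in order.
stopifnot(identical(rownames(toy_ext), colnames(toy_expr)))
table(toy_ext$condition)
```

In the example above, the group sizes passed to `rep()` (173 and 337) must likewise sum to `ncol(exprMat)`; otherwise `data.frame()` stops with an error because the row names and the condition vector differ in length.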

Reference

Chen, F., Keleş, S. SURF: integrative analysis of a compendium of RNA-seq and CLIP-seq datasets highlights complex governing of alternative transcriptional regulation by RNA-binding proteins. Genome Biol 21, 139 (2020). doi:10.1186/s13059-020-02039-7

