-
Notifications
You must be signed in to change notification settings - Fork 16
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
0 parents
commit d1a503e
Showing
179 changed files
with
914,977 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
# lncRNA-screen | ||
|
||
### Author: Yixiao Gong | ||
|
||
lncRNA-screen is a comprehensive pipeline for computationally screening putative lncRNA transcripts over large multimodal datasets. It provides a fully automated easy-to-run pipeline which performs data download, RNA-seq alignment, assembly, quality assessment, transcript filtration, novel lncRNA identification, coding potential estimation, expression level quantification, histone mark enrichment profile integration, differential expression analysis, annotation with other type of segmented data (CNVs, SNPs, Hi-C, etc.) and visualization. Importantly, lncRNA-screen generates an interactive report summarizing all interesting lncRNA features including genome browser snapshots and lncRNA-mRNA interactions based on Hi-C data. | ||
|
||
#### Setting up: | ||
|
||
1. lncRNA-screen has a upstream pipeline which taking care of the standard RNA-Seq analysis. Please follow the instruction from the link below to setup your RNA-Seq analysis pipeline and finish the alignment and assembly step. | ||
[RNA-Seq Standard](https://github.com/NYU-BFX/RNA-Seq_Standard). Don't forget to check out the RNA-Seq_Standard HTML report for diagnosis. | ||
* Attention: lncRNA-screen inhernt the "code" and "referenceFiles" directory from RNA-Seq_Standard pipeline. Please make sure you set up RNA-Seq_Standard pipeline folder correctly. | ||
* If you are using your own RNA-Seq analysis result, | ||
+ You still need to set up "RNA-Seq_Stanrad" pipeline in order to make "code" and "referenceFiles" accessible for lncRNA-screen. | ||
+ You need to put all the alignment files (one directory per sample) into [inputs/RNA-Seq/pipeline/alignment](inputs/RNA-Seq/pipeline/alignment/) directory. In each sample directory, you need to have a "accepted_hits.bam" file, a "sequencing_info.txt" file (first line contains unstranded/firststrand/secondstrand, second line contains single-end/pair-end) and a ".bw" file which names in sample name. | ||
+ You need to put all the cufflinks generated files (one directory per sample) into [inputs/RNA-Seq/pipeline/cufflinks](inputs/RNA-Seq/pipeline/cufflinks/) directory. In each sample directory, you need to have a "transcripts.gtf" file. | ||
+ You won't be able to get the alignment statistics barplot or featureCounts statistics barplot. | ||
|
||
2. Setting up [inputs](inputs/) directory: | ||
* a. Please replace the [inputs/RNA-Seq](inputs/RNA-Seq) link with the directory of your RNA-Seq_Standard you set up in step 1. | ||
* b. Please replace the [inputs/ChIP-Seq](inputs/ChIP-Seq) link with your ChIP-Seq analysis result. Make sure you have all the peak files in bed format in your [inputs/ChIP-Seq/bed](inputs/ChIP-Seq/bed) directory. (You can choose to use our ChIP-Seq analysis pipeline if you want from [this link](https://github.com/NYU-BFX/hic-bench/tree/master/pipelines/chipseq-standard).) | ||
* c. Please follow the example in [inputs/group_info.txt](inputs/group_info.txt) file to set up your sample sheet. Please make sure you match your sample name with the sample name you used in RNA-Seq_Standard pipeline (Sample name should match the directory name in the RNA-Seq/pipeline/alignment/ folder and RNA-Seq/pipeline/cufflinks). | ||
* d. Please change the parameters as your desire in [inputs/params.bash](inputs/params.bash) file. | ||
* e. [inputs/custom-bashrc](inputs/custom-bashrc) file set up the path to all necessary dependencies. If you are not using "Software Environment Management", you need to set up your path to r/3.3.0, python/2.7 and samtools/0.1.19. | ||
* f. [inputs/job_submitter.bash](inputs/job_submitter.bash) is a job submitting tool for SGE system. If you are using another job scheduler, please change it as your desire. But pleas make sure this script take the first parameter as number of thread, and the remaining parameters as the exact command you want to excute at real time in command line. | ||
|
||
3. Setting up [referenceFiles](referenceFiles/) directory: | ||
* a. By default, lncRNA-screen supports hg19 reference genome and use Gencode.v19 as reference annotation. Please run the following command to download and build the reference folder. | ||
``` | ||
code/build_referenceFiles.bash | ||
``` | ||
* b. If you are working on a different reference version or species, please check if you have replaced all the files in this directory as your own desire. Also please make sure you use the same reference version or species in [inputs/RNA-Seq/referenceFiles](inputs/RNA-Seq/referenceFiles) | ||
* c. Please don't forget to change your [inputs/params.bash](inputs/params.bash) file if you use a different reference version or species and finished setting up the [referenceFiles](referenceFiles/) directory. | ||
|
||
|
||
#### How to run it: | ||
|
||
Once everything has been set up, you can run the pipeline in two stages: | ||
|
||
1. To finish the analysis, please submit your jobs using the following command if you are using SGE: | ||
|
||
``` | ||
qsub -b Y -cwd -pe threaded 1 ./run.bash inputs/params.bash inputs/group_info.txt | ||
``` | ||
* If you are using another job scheduler, you can submit the follwoing command using your own job scheduler: | ||
|
||
``` | ||
./run.bash inputs/params.bash inputs/group_info.txt | ||
``` | ||
2. To generate lncRNA snapshots, please run the following command if you are using SGE. | ||
|
||
``` | ||
cd pipeline/snapshot | ||
qsub -b Y -cwd -pe threaded 1 code/batch_lncRNA_snapshot.bash 100 ../coding_potential/lncRNA.bed | ||
``` | ||
* The "100" is the number of jobs you want to run simultaneously. | ||
|
||
* If you are using another job scheduler, please go to [pipeline/snapshot](pipeline/snapshot) directory and submit the follwoing command using your own job scheduler: | ||
|
||
``` | ||
code/batch_lncRNA_snapshot.bash 100 ../coding_potential/lncRNA.bed | ||
``` | ||
|
||
3. The summary html report will be generated in [pipeline/report](pipeline/report) folder. | ||
|
||
|
||
#### Example of the html report: | ||
[Roadmap Epigenomics lncRNA Report](https://nyuepi:[email protected]/results/aifantislab/epigenome/demo/lncRNA-screen/RoadmapEpigenomics/lncRNA_report.html) | ||
|
||
[hESC lncRNA Report](https://nyuepi:[email protected]/results/aifantislab/epigenome/demo/lncRNA-screen/hESC/lncRNA_report.html) | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/ChIP-Seq |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/RNA-Seq |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/cuffmerge |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/identify |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/whole_assembly |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/coding_potential |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/featureCounts |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/annotate |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/summarize |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/fpkm_cutoff |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/peak_overlap |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/integrate_DE |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/integrate_HistoneCombine |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/pie_matrix_DE |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/pie_matrix_peak |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/figures |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/snapshot |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/tracks |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
pipeline/report |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
#!/bin/bash | ||
|
||
source code/custom-bashrc | ||
|
||
if [ "$#" -ne 2 ]; then | ||
printf "\n\n###### Usage\n\n" | ||
printf "$0 <PATH to group_info.txt file> <GROUP_BY>\n\n" | ||
exit | ||
fi | ||
|
||
GROUP_FILE=$1 | ||
GROUP_BY=$2 | ||
code/DESeq2.r ../whole_assembly/all.gtf ../whole_assembly/all.id2name.txt $GROUP_FILE $GROUP_BY ../featureCounts/raw_count.txt |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
code.rnaseq/DESeq2.r |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
#!/local/apps/R/3.2.0/bin/Rscript | ||
|
||
usage = "\ | ||
Rscript FlowChart.r [flowchart.txt]" | ||
|
||
filter_step_oneside_toright <- function(x,y,a,b,c){ | ||
row=x;col=y;criteria=a;accepted=b;unaccepted=c; | ||
straightarrow (from = c(elpos[(row-1)*nrow+col,1],elpos[(row-1)*nrow+col,2]-0.1/nsteps), to = elpos[row*nrow+col,],lwd = 2, arr.pos = 0.72, arr.length = 0.6/nsteps) | ||
splitarrow(from = elpos[row*nrow+col, ], to = elpos[((row+1)*nrow+col):((row+1)*nrow+col+1), ], lwd = 2, centre = c(elpos[row*nrow+col,1], elpos[(row+1)*nrow+col,2] + 0.25/nsteps), arr.side = 2, arr.length = 0.6/nsteps) | ||
text(elpos[(row+1)*nrow+col, 1] - 0.1/nsteps, elpos[(row+1)*nrow+col+1, 2] + 0.2/nsteps, "yes", cex = 4/nsteps) | ||
text(elpos[(row+1)*nrow+col+1, 1] - 0.9/nsteps, elpos[(row+1)*nrow+col+1, 2] + 0.2/nsteps, "no", cex = 4/nsteps) | ||
texthexa(elpos[row*nrow+col,],radx=0.032*nchar(as.character(criteria))/nsteps,rady=0.1/nsteps,adj=c(0.5,0.5), lab = criteria, box.col = "lightblue",shadow.col = "cadetblue4", shadow.size = 0.0025, cex = 4/nsteps) | ||
textellipse(elpos[(row+1)*nrow+col,],radx=0.6/nsteps,rady=0.1/nsteps,adj=c(0.5,0.5), lab = accepted, box.col = "green",shadow.col = "darkgreen",lcol="black",lwd=1, shadow.size = 0.0025, cex = 4/nsteps) | ||
textrect(elpos[(row+1)*nrow+col+1,], 0.2/nsteps, 0.1/nsteps, lab = unaccepted, box.col = "indianred1",shadow.col = "indianred3", shadow.size = 0.0025, cex = 4/nsteps) | ||
} | ||
|
||
filter_step_oneside_toleft <- function(x,y,a,b,c){ | ||
row=x;col=y;criteria=a;accepted=b;unaccepted=c; | ||
straightarrow (from = c(elpos[(row-1)*nrow+col,1],elpos[(row-1)*nrow+col,2]-0.1/nsteps), to = elpos[row*nrow+col,],lwd = 2, arr.pos = 0.72, arr.length = 0.6/nsteps) | ||
splitarrow(from = elpos[row*nrow+col, ], to = elpos[((row+1)*nrow+col):((row+1)*nrow+col-1), ], lwd = 2, centre = c(elpos[row*nrow+col,1], elpos[(row+1)*nrow+col,2] + 0.25/nsteps), arr.side = 2, arr.length = 0.6/nsteps) | ||
text(elpos[(row+1)*nrow+col, 1] + 0.1/nsteps, elpos[(row+1)*nrow+col+1, 2] + 0.2/nsteps, "yes", cex = 4/nsteps) | ||
text(elpos[(row+1)*nrow+col, 1] - 0.9/nsteps, elpos[(row+1)*nrow+col+1, 2] + 0.2/nsteps, "no", cex = 4/nsteps) | ||
texthexa(elpos[row*nrow+col,],radx=0.032*nchar(as.character(criteria))/nsteps,rady=0.1/nsteps,adj=c(0.5,0.5), lab = criteria, box.col = "lightblue",shadow.col = "cadetblue4", shadow.size = 0.0025, cex = 4/nsteps) | ||
textellipse(elpos[(row+1)*nrow+col,],radx=0.6/nsteps,rady=0.1/nsteps,adj=c(0.5,0.5), lab = accepted, box.col = "green",shadow.col = "darkgreen",lcol="black",lwd=1, shadow.size = 0.0025, cex = 4/nsteps) | ||
textrect(elpos[(row+1)*nrow+col-1,], 0.2/nsteps, 0.1/nsteps, lab = unaccepted, box.col = "indianred1",shadow.col = "indianred3", shadow.size = 0.0025, cex = 4/nsteps) | ||
} | ||
|
||
filter_step_twoside_toright <- function(x,y,a,b,c){ | ||
row=x;col=y;criteria=a;meet1=unlist(strsplit(as.character(b),":"))[1];accepted1=unlist(strsplit(as.character(b),":"))[2];meet2=unlist(strsplit(as.character(c),":"))[1];accepted2=unlist(strsplit(as.character(c),":"))[2] | ||
straightarrow (from = c(elpos[(row-1)*nrow+col,1],elpos[(row-1)*nrow+col,2]-0.1/nsteps), to = elpos[row*nrow+col,],lwd = 2, arr.pos = 0.72, arr.length = 0.6/nsteps) | ||
splitarrow(from = elpos[row*nrow+col, ], to = elpos[((row+1)*nrow+col):((row+1)*nrow+col+1), ], lwd = 2, centre = c(elpos[row*nrow+col,1], elpos[(row+1)*nrow+col,2] + 0.25/nsteps), arr.side = 2, arr.length = 0.6/nsteps) | ||
text(elpos[(row+1)*nrow+col, 1] - 0.3/nsteps, elpos[(row+1)*nrow+col+1, 2] + 0.2/nsteps, meet1, cex = 4/nsteps) | ||
text(elpos[(row+1)*nrow+col+1, 1] - 0.9/nsteps, elpos[(row+1)*nrow+col+1, 2] + 0.2/nsteps, meet2, cex = 4/nsteps) | ||
texthexa(elpos[row*nrow+col,],radx=0.032*nchar(as.character(criteria))/nsteps,rady=0.1/nsteps,adj=c(0.5,0.5), lab = criteria, box.col = "lightblue",shadow.col = "cadetblue4", shadow.size = 0.0025, cex = 4/nsteps) | ||
textellipse(elpos[(row+1)*nrow+col,],radx=0.6/nsteps,rady=0.1/nsteps,adj=c(0.5,0.5), lab = accepted1, box.col = "green",shadow.col = "darkgreen",lcol="black",lwd=1, shadow.size = 0.0025, cex = 4/nsteps) | ||
textellipse(elpos[(row+1)*nrow+col+1,],radx=0.6/nsteps,rady=0.1/nsteps,adj=c(0.5,0.5), lab = accepted2, box.col = "green",shadow.col = "darkgreen",lcol="black",lwd=1, shadow.size = 0.0025, cex = 4/nsteps) | ||
} | ||
|
||
library(diagram) | ||
args <- commandArgs(trailingOnly = TRUE) | ||
if (length(args)!=1) { write(usage,stderr()); quit(save='no'); } | ||
mydata=read.table(args[1],header = F,sep = "\t") | ||
|
||
pdf(paste0(gsub("\\.[^\\.]+$","",basename(args[1])),".pdf"),onefile = F) | ||
par(mar = c(0, 0, 0, 0)) | ||
openplotmat() | ||
nsteps=8 | ||
nrow=4 | ||
elpos <- coordinates (rep(nrow,(nsteps*2+1))) | ||
|
||
row=1;col=2; | ||
textellipse(elpos[(row-1)*nrow+col,],radx=0.6/nsteps,rady=0.1/nsteps,adj=c(0.5,0.5), lab = as.numeric(as.character(mydata[1,2]))+as.numeric(as.character(mydata[1,3])), box.col = "green",shadow.col = "darkgreen",lcol="black",lwd=1, shadow.size = 0.0025, cex = 4/nsteps) | ||
|
||
filter_step_oneside_toright(1,2,mydata[1,1],mydata[1,2],mydata[1,3]) | ||
filter_step_oneside_toright(3,2,mydata[2,1],mydata[2,2],mydata[2,3]) | ||
filter_step_oneside_toright(5,2,mydata[3,1],mydata[3,2],mydata[3,3]) | ||
filter_step_oneside_toright(7,2,mydata[4,1],mydata[4,2],mydata[4,3]) | ||
filter_step_twoside_toright(9,2,mydata[5,1],mydata[5,2],mydata[5,3]) | ||
filter_step_oneside_toleft(11,2,mydata[6,1],mydata[6,2],mydata[6,3]) | ||
filter_step_oneside_toright(11,3,mydata[7,1],mydata[7,2],mydata[7,3]) | ||
filter_step_oneside_toleft(13,2,mydata[8,1],mydata[8,2],mydata[8,3]) | ||
filter_step_oneside_toright(13,3,mydata[9,1],mydata[9,2],mydata[9,3]) | ||
filter_step_oneside_toleft(15,2,mydata[10,1],mydata[10,2],mydata[10,3]) | ||
filter_step_oneside_toright(15,3,mydata[11,1],mydata[11,2],mydata[11,3]) | ||
|
||
dev.off() | ||
|
||
|
||
png(paste0(gsub("\\.[^\\.]+$","",basename(args[1])),".png")) | ||
par(mar = c(0, 0, 0, 0)) | ||
openplotmat() | ||
nsteps=8 | ||
nrow=4 | ||
elpos <- coordinates (rep(nrow,(nsteps*2+1))) | ||
|
||
row=1;col=2; | ||
textellipse(elpos[(row-1)*nrow+col,],radx=0.6/nsteps,rady=0.1/nsteps,adj=c(0.5,0.5), lab = as.numeric(as.character(mydata[1,2]))+as.numeric(as.character(mydata[1,3])), box.col = "green",shadow.col = "darkgreen",lcol="black",lwd=1, shadow.size = 0.0025, cex = 4/nsteps) | ||
|
||
filter_step_oneside_toright(1,2,mydata[1,1],mydata[1,2],mydata[1,3]) | ||
filter_step_oneside_toright(3,2,mydata[2,1],mydata[2,2],mydata[2,3]) | ||
filter_step_oneside_toright(5,2,mydata[3,1],mydata[3,2],mydata[3,3]) | ||
filter_step_oneside_toright(7,2,mydata[4,1],mydata[4,2],mydata[4,3]) | ||
filter_step_twoside_toright(9,2,mydata[5,1],mydata[5,2],mydata[5,3]) | ||
filter_step_oneside_toleft(11,2,mydata[6,1],mydata[6,2],mydata[6,3]) | ||
filter_step_oneside_toright(11,3,mydata[7,1],mydata[7,2],mydata[7,3]) | ||
filter_step_oneside_toleft(13,2,mydata[8,1],mydata[8,2],mydata[8,3]) | ||
filter_step_oneside_toright(13,3,mydata[9,1],mydata[9,2],mydata[9,3]) | ||
filter_step_oneside_toleft(15,2,mydata[10,1],mydata[10,2],mydata[10,3]) | ||
filter_step_oneside_toright(15,3,mydata[11,1],mydata[11,2],mydata[11,3]) | ||
|
||
dev.off() |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
#!/usr/bin/env Rscript | ||
|
||
usage = "\ | ||
Rscript align_barplot.r [OUT DIR] [align_summary.txt] | ||
" | ||
|
||
align_reads <- function(x){ | ||
x = x %>% filter(V2!="Uniquely_Mapped") %>% mutate(V4=ifelse(grepl("NonDuplicated|Duplicated",V2, ignore.case=T),"Uniquely_Mapped",as.character(V2))) %>% arrange(desc(V1)) %>% mutate(V3=V3/1000000) | ||
x$V2 <- factor(x$V2, levels = c("NonDuplicated","Duplicated","Multi_Mapped","Unmapped")) | ||
x$V4 <- factor(x$V4, levels = c("Uniquely_Mapped","Multi_Mapped","Unmapped")) | ||
x=x[order(x$V1,x$V2),] | ||
myplot <- | ||
ggplot(x,aes(x=V1,y=V3,fill=V2)) + | ||
geom_bar(stat="identity",position="stack") + | ||
scale_fill_manual(values=c("limegreen","forestgreen","orchid","indianred1")) + | ||
ggtitle("B. Alignment Category Reads") + | ||
xlab("") + ylab("Number of Reads in Millions") + | ||
theme(axis.title = element_text(size=8),axis.text.y = element_blank(),axis.text.x = element_text(size=5),plot.title = element_text(size=8),legend.title=element_blank(),legend.text=element_text(size=8),axis.ticks.y=element_blank(), axis.line.y=element_blank()) + | ||
coord_flip() | ||
return(myplot) | ||
} | ||
|
||
align_PCT <- function(x){ | ||
x = x %>% filter(V2!="Uniquely_Mapped") %>% group_by(V1,add=F) %>% mutate(V3=V3/sum(V3)) %>% mutate(V4=ifelse(grepl("NonDuplicated|Duplicated",V2, ignore.case=T),"Uniquely_Mapped",as.character(V2))) %>% arrange(desc(V1)) %>% mutate(V3=V3*100) | ||
x$V2 <- factor(x$V2, levels = c("NonDuplicated","Duplicated","Multi_Mapped","Unmapped")) | ||
x$V4 <- factor(x$V4, levels = c("Uniquely_Mapped","Multi_Mapped","Unmapped")) | ||
x=x[order(x$V1,x$V2),] | ||
myplot <- ggplot(x,aes(x=V1,y=V3,fill=V2)) + | ||
geom_bar(stat="identity",position="stack") + | ||
ggtitle("A. Alignment Category Percentage") + | ||
scale_fill_manual(values=c("limegreen","forestgreen","orchid","indianred1")) + | ||
xlab("") + ylab("Percentage of Reads") + | ||
theme(axis.title = element_text(size=5),axis.text.y = element_text(size=2.8,angle=0),axis.text.x = element_text(size=5),plot.title = element_text(size=8)) + | ||
guides(fill=FALSE) + | ||
coord_flip() | ||
return(myplot) | ||
} | ||
|
||
for (p in c("dplyr","ggplot2","cowplot")){ | ||
if (!require(p,character.only=TRUE,quietly=TRUE,warn.conflicts=FALSE)) { | ||
if(!file.exists(Sys.getenv("R_LIBS_USER"))){system(paste0("mkdir",Sys.getenv("R_LIBS_USER")))} | ||
install.packages(p,lib=Sys.getenv("R_LIBS_USER"),repos="http://cran.rstudio.com/",verbose=F) | ||
} | ||
suppressPackageStartupMessages(library(p,character.only=TRUE,verbose=FALSE)) | ||
} | ||
args <- commandArgs(trailingOnly = TRUE) | ||
if (length(args)!=2) { write(usage,stderr()); quit(save='no'); } | ||
|
||
dir.create(args[1],recursive=T,showWarnings=F) | ||
pdf(paste0(args[1],"/",gsub("\\.[^\\.]+$","",basename(args[2])),".pdf"),width = 7,height=6,onefile = F) | ||
plot_grid(align_PCT(read.table(args[2])),align_reads(read.table(args[2]))) | ||
dev.off() | ||
png(filename=paste0(args[1],"/",gsub("\\.[^\\.]+$","",basename(args[2])),".png"),width = 7,height=6,units = "in",res=300) | ||
plot_grid(align_PCT(read.table(args[2])),align_reads(read.table(args[2]))) | ||
dev.off() |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
code.rnaseq/align_summary |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
#!/bin/bash | ||
|
||
source code/custom-bashrc | ||
|
||
if [ "$#" -lt 2 ]; then | ||
printf "\n\n###### Usage\n\n" | ||
printf "bash run.bash <BED File> <all annotation bed files>\n\n" | ||
exit | ||
fi | ||
|
||
LNC_RNA=$1 | ||
shift | ||
|
||
other_annotations=($@) | ||
|
||
for file in ${other_annotations[@]} | ||
do | ||
|
||
BASE_NAME=`basename $file` | ||
PREFIX=${BASE_NAME%.*} | ||
|
||
echo -e ".ID_\t$PREFIX" > $PREFIX.txt | ||
|
||
cat $LNC_RNA | genomic_overlaps overlap -label $file | cut -f4 | replace_with_tab ':' | sort -k1,1 | mergeuniq -merge | tr ' ' '|' >> $PREFIX.txt | ||
|
||
done | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
#!/bin/bash | ||
|
||
if [ "$#" -ne 1 ]; then | ||
printf "\n\n###### Usage\n\n" | ||
printf "$0 \n\n" | ||
exit | ||
fi | ||
|
||
annotation="../1_RNA-Seq/?_annotation" | ||
cuffnorm="../1_RNA-Seq/?_cuffnorm/cuffnorm" | ||
compare=$1 | ||
|
||
if [[ ! -d comparison ]] | ||
then | ||
mkdir comparison | ||
fi | ||
|
||
while true; do | ||
echo | ||
echo "1. You need to delete the lines representing the samples you don't want to process." | ||
echo "2. You need to give Group Name for each samples" | ||
echo " Just change the second column to make it be your Group Name" | ||
echo | ||
read -p "Are you clear? [Y/N]" yn | ||
case $yn in | ||
[Yy]* ) | ||
if [[ ! -d comparison/$compare ]] | ||
then | ||
mkdir comparison/$compare | ||
fi | ||
if [[ ! -f comparison/$compare/1_group_info.tsv ]] | ||
then | ||
head -1 $annotation/?_lncRNA_fpkm.tsv | tr '\t' '\n' | skipn 1 | awk '{print $1"\t"$1}' > comparison/$compare/1_group_info.tsv | ||
fi | ||
vim comparison/$compare/1_group_info.tsv; break;; | ||
[Nn]* ) exit;; | ||
* ) echo "Please answer yes or no.";; | ||
esac | ||
done | ||
|
||
join -1 1 -2 2 <(sort -k1,1 comparison/$compare/1_group_info.tsv) <(more $cuffnorm/samples.table | tr '/q' '\t' | cut -f2,6 | replace_with_tab '_' | cut -f1,3 | skipn 1 | awk '{print $1+1"\t"$2}' | sort -k2,2) | awk -F ' ' '{print $2"\t"$3}' | sort -k1 | mergeuniq -merge | tr ' ' ',' > comparison/$compare/2_group+column.tsv | ||
|
Oops, something went wrong.