Adding EGFP to ENCODE cDNA.fa #6

Closed
amitpande74 opened this issue Apr 27, 2021 · 3 comments

@amitpande74

Hi,
What file format should be used for adding GFP to the cDNA file?
I am trying to add:

>GFP
ATGCCCGCCATGAAGATCGAGTGCCGCATCACCGGCACCCTGAACGGCGTGGAGTTCGAGCTGGTGGGCGGCGGAGAGGGCACCCCCGAGCAGGGCCGCATGACCAACAAGATGAA
GAGCACCAAAGGCGCCCTGACCTTCAGCCCCTACCTGCTGAGCCACGTGATGGGCTACGGCTTCTACCACTTCGGCACCTACCCCAGCGGCTACGAGAACCCCTTCCTGCACGCCA
TCAACAACGGCGGCTACACCAACACCCGCATCGAGAAGTACGAGGACGGCGGCGTGCTGCACGTGAGCTTCAGCTACCGCTACGAGGCCGGCCGCGTGATCGGCGACTTCAAGGTG
GTGGGCACCGGCTTCCCCGAGGACAGCGTGATCTTCACCGACAAGATCATCCGCAGCAACGCCACCGTGGAGCACCTGCACCCCATGGGCGATAACGTGCTGGTGGGCAGCTTCGC
CCGCACCTTCAGCCTGCGCGACGGCGGCTACTACAGCTTCGTGGTGGACAGCCACATGCACTTCAAGAGCGCCATCCACCCCAGCATCCTGCAGAACGGGGGCCCCATGTTCGCCT
TCCGCCGCGTGGAGGAGCTGCACAGCAACACCGAGCTGGGCATCGTGGAGTACCAGCACGCCTTCAAGACCCCCATCGCCTTCGCC

And what about the decoys?
Kindly help, since something different is described here:
https://combine-lab.github.io/alevin-tutorial/2019/selective-alignment/
regards,
Amit.

@igordot
Owner

igordot commented Apr 27, 2021

I haven't tried it myself, but that should work.
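
For illustration only, here is a minimal Python sketch of appending one extra record to an existing cDNA FASTA file. The function names, the 60-character line width, and the restriction to `ACGTN` are my assumptions, not something required by this project or by Salmon. It only makes sure the new record starts on its own line and has a `>` header followed by sequence lines.

```python
VALID_BASES = set("ACGTN")  # assumption: plain nucleotide alphabet only


def format_fasta_record(name, sequence, width=60):
    """Return one FASTA record: a '>' header line plus wrapped sequence lines."""
    seq = "".join(sequence.split()).upper()  # drop whitespace and line breaks
    if not seq:
        raise ValueError("empty sequence")
    bad = set(seq) - VALID_BASES
    if bad:
        raise ValueError(f"unexpected characters in sequence: {sorted(bad)}")
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


def append_record(fasta_path, name, sequence):
    """Append a record to an existing FASTA file, starting on a fresh line."""
    with open(fasta_path, "rb+") as fh:
        fh.seek(0, 2)  # jump to the end of the file
        size = fh.tell()
        if size > 0:
            fh.seek(size - 1)
            needs_newline = fh.read(1) != b"\n"
            fh.seek(0, 2)
            if needs_newline:
                fh.write(b"\n")  # the last record did not end with a newline
        fh.write(format_fasta_record(name, sequence).encode("ascii"))


# Hypothetical usage (the path and sequence variable are placeholders):
# append_record("cdna.fa", "GFP", gfp_sequence)
```

Work on a copy of the reference file, since this modifies the file in place.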

The Salmon docs include a section on how to build a decoy-aware transcriptome file.
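
The tutorial linked above also describes a simpler route that uses the whole genome as the decoy: list the genome sequence names in a decoys file and concatenate the transcriptome followed by the genome. A rough Python sketch of that idea is below. The function name and all file names are hypothetical, and it assumes uncompressed FASTA inputs.

```python
def build_gentrome(transcriptome_fa, genome_fa, gentrome_out, decoys_out):
    """Write a decoy name list and a transcriptome+genome FASTA.

    Sketch only: assumes plain-text (not gzipped) FASTA files.
    """
    # Decoy names are the first whitespace-delimited token of each genome header.
    with open(genome_fa) as genome, open(decoys_out, "w") as decoys:
        for line in genome:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    decoys.write(fields[0] + "\n")

    # Transcripts go first, genome (decoy) sequences after them.
    with open(gentrome_out, "w") as out:
        for path in (transcriptome_fa, genome_fa):
            with open(path) as src:
                for line in src:
                    # Guard against a missing newline at the end of a file.
                    out.write(line if line.endswith("\n") else line + "\n")


# Hypothetical usage (file names are placeholders):
# build_gentrome("cdna_with_gfp.fa", "genome.fa", "gentrome.fa", "decoys.txt")
```

The two outputs would then be passed to something like `salmon index -t gentrome.fa -d decoys.txt -i salmon_index`. I have not run this with the files from this issue.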

@igordot igordot closed this as completed Apr 27, 2021
@amitpande74
Author

Hi,

./generateDecoyTranscriptome.sh [-j =1 default] [-b =bedtools default] [-m =mashmap default] -a -g -t -o

When I ran the analysis:

./generateDecoyTranscriptome.sh -b /usr/local/bin/bedtools -m /home/amit/miniconda3/bin/mashmap -a ../hg19.ncbiRefseq.added.gtf -g ../hg19_added_genes.fa -t ../Homo_sapiens.GRCh38.cdna.all.fa -o /home/amit/Downloads/SalmonTools-master/sb_results


*** getDecoy ***


-b = /usr/local/bin/bedtools
-m = /home/amit/miniconda3/bin/mashmap
-a = /home/amit/Downloads/SalmonTools-master/hg19.ncbiRefseq.added.gtf
-g = /home/amit/Downloads/SalmonTools-master/hg19_added_genes.fa
-t = /home/amit/Downloads/SalmonTools-master/Homo_sapiens.GRCh38.cdna.all.fa
-o = /home/amit/Downloads/SalmonTools-master/sb_results
[1/10] Extracting exonic features from the gtf
[2/10] Masking the genome fasta
[3/10] Aligning transcriptome to genome

Reference = [reference.masked.genome.fa]
Query = [/home/amit/Downloads/SalmonTools-master/Homo_sapiens.GRCh38.cdna.all.fa]
Kmer size = 16
Window size = 5
Segment length = 500 (read split allowed)
Alphabet = DNA
Percentage identity threshold = 80%
Mapping output file = mashmap.out
Filter mode = 1 (1 = map, 2 = one-to-one, 3 = none)
Execution threads = 1

INFO, skch::Sketch::build, minimizers picked from reference = 937226701
./generateDecoyTranscriptome.sh: line 105: 14872 Segmentation fault (core dumped) $mashmap -r reference.masked.genome.fa -q $txpfile -t $threads --pi 80 -s 500


*** ABORTED ***


An error occurred. Exiting...

Kindly guide.

@igordot
Owner

igordot commented Apr 28, 2021

I am not sure why you are getting an error. I think this would be a question for the Salmon developers. I would first try running this on unmodified reference files to make sure everything is working. Additionally, based on the file names, it looks like you are mixing hg19 and hg38 files, which will definitely cause issues.
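
As a rough illustration of the file name check, a small Python sketch could flag inputs whose names point to different genome builds. The patterns and function names here are my own assumptions, and a file name is only a hint about the build, not proof of what the file contains.

```python
import os
import re

# Assumption: these tokens in a file name identify the human genome build.
BUILD_PATTERNS = {
    "GRCh37/hg19": re.compile(r"hg19|grch37", re.IGNORECASE),
    "GRCh38/hg38": re.compile(r"hg38|grch38", re.IGNORECASE),
}


def guess_build(path):
    """Guess the build from the file name; None if unknown or ambiguous."""
    name = os.path.basename(path)
    hits = [build for build, pat in BUILD_PATTERNS.items() if pat.search(name)]
    return hits[0] if len(hits) == 1 else None


def mixed_builds(paths):
    """Return True if the file names suggest more than one build."""
    builds = {guess_build(p) for p in paths} - {None}
    return len(builds) > 1


inputs = [
    "../hg19.ncbiRefseq.added.gtf",
    "../hg19_added_genes.fa",
    "../Homo_sapiens.GRCh38.cdna.all.fa",
]
for path in inputs:
    print(path, "->", guess_build(path))
print("mixed builds:", mixed_builds(inputs))
```

With the three paths from the command above, the annotation and genome look like hg19 while the cDNA file looks like GRCh38.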
