SingleM stuck running on specific SRAs #147

Open
AnneliektH opened this issue Nov 13, 2023 · 1 comment

@AnneliektH

Hi there,

I've been running SingleM on a set of SRA files. For most of them it is fast, taking less than 10 minutes. For some SRAs, however, it seems to run forever and get stuck. One example of such a file is ERR2205747. Using top, I can see that the system is using CPU, so I think it is doing something?
I run SingleM using the following command (only looking for a specific protein):

singlem pipe --sra-files sra/ERR2205747 --otu-table ERR2205747.csv \
    --singlem-packages path/to/payloaddirectory/S3.40.ribosomal_protein_L11_rplK.spkg \
    --no-assign-taxonomy --threads 8

@wwood
Owner

wwood commented Nov 14, 2023

Hi,

Thanks for the report. I'm wondering whether there is something fishy about that sample as stored in the SRA. When you say the system uses CPU, is it singlem or kingfisher that is using the CPU?
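
One rough way to check this is to filter `ps` output by process name rather than reading it off `top`. The sketch below is only an illustration: `filter_cpu_users` is a hypothetical helper name, and it assumes input lines in the form PID, %CPU, command line, as produced by `ps -eo pid,pcpu,args` on Linux.

```shell
# Hypothetical helper: keep only lines that mention singlem or kingfisher,
# sorted by the %CPU column (field 2), highest first.
# The bracketed first letters stop grep from matching its own command line
# when the input comes from a live `ps` listing.
filter_cpu_users() {
    grep -E '[s]inglem|[k]ingfisher' | sort -k2,2 -nr
}

# Live usage (assumes a ps that supports -e and -o, e.g. procps on Linux):
# ps -eo pid,pcpu,args | filter_cpu_users
```

Any helper tools launched as child processes would show up under their own names, so the full `top` listing may still be worth a look.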

A likely workaround for this is to download the data from ENA rather than SRA (you could use kingfisher directly for this) - the fastq format may help.
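
As a sketch of that workaround, assuming kingfisher's `get` subcommand with the `ena-ftp` download method and SingleM's `--forward`/`--reverse` read options (worth confirming against `kingfisher get -h` and `singlem pipe -h` for the installed versions), the two steps might look like the following. The function names are hypothetical, and the FASTQ file names in the example are an assumption about how the download is named and whether the run is paired.

```shell
# Hypothetical helper: download one run from ENA as FASTQ using kingfisher.
fetch_from_ena() {
    kingfisher get -r "$1" -m ena-ftp
}

# Hypothetical helper: same options as the original command, but reading
# FASTQ files instead of --sra-files.
# Usage: singlem_on_fastq RUN SPKG FORWARD [REVERSE]
singlem_on_fastq() {
    run=$1; spkg=$2; forward=$3; reverse=${4:-}
    if [ -n "$reverse" ]; then
        # Paired-end input
        singlem pipe --forward "$forward" --reverse "$reverse" \
            --otu-table "$run.csv" --singlem-packages "$spkg" \
            --no-assign-taxonomy --threads 8
    else
        # Single-end input
        singlem pipe --forward "$forward" \
            --otu-table "$run.csv" --singlem-packages "$spkg" \
            --no-assign-taxonomy --threads 8
    fi
}

# Example (file names are assumed, adjust to what was actually downloaded):
# fetch_from_ena ERR2205747
# singlem_on_fastq ERR2205747 \
#     path/to/payloaddirectory/S3.40.ribosomal_protein_L11_rplK.spkg \
#     ERR2205747_1.fastq.gz ERR2205747_2.fastq.gz
```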

That specific sample is also missing from the Sandpiper database, even though it seems like it should be there (published 2017, metagenomic), but I don't have logs showing what went awry.

Let me know how you go. Thanks.
