Unable to set PATH for data #115

thulasis · 2022-11-29T22:44:38Z

Hi,

I am trying to install singlem on our local system in a conda environment, and I am having two issues.

1) With the singlem bin directory:
conda activate singlem
cd bin
export PATH=$PWD:$PATH
singlem -h
This works and displays the script's options.

But after deactivating and re-activating the conda environment, singlem -h displays
"singlem command not found"

2) With the data:
After running
singlem data --output-directory /tmp/dbs
and then adding the path to ~/.bashrc
export SINGLEM_METAPACKAGE_PATH='/tmp/dbs/S3.0.5.metapackage20220806.smpkg.zb/payload_directory'

when I run
singlem data --verify-only

the following error pops up:
11/29/2022 04:32:07 PM INFO: SingleM v1.0.0beta2
11/29/2022 04:32:07 PM INFO: Acquiring SingleM packages from environment variable
11/29/2022 04:32:07 PM INFO: Retrieval successful. Location of backpack is: /tmp/dbs/S3.0.5.metapackage20220806.smpkg.zb/payload_directory
Traceback (most recent call last):
  File "/home/swmed.org/s212810/miniconda3/envs/singlem/bin/singlem", line 1084, in <module>
    Metapackage.verify(output_directory = args.output_directory)
  File "/home/swmed.org/s212810/miniconda3/envs/singlem/bin/../singlem/metapackage.py", line 133, in verify
    backpack = zenodo_backpack.acquire(env_var_name=DATA_ENVIRONMENT_VARIABLE, version=DATA_DEFAULT_VERSION)
  File "/home/swmed.org/s212810/miniconda3/envs/singlem/lib/python3.9/site-packages/zenodo_backpack/__init__.py", line 121, in acquire
    if version != zb.data_version_string():
  File "/home/swmed.org/s212810/miniconda3/envs/singlem/lib/python3.9/site-packages/zenodo_backpack/__init__.py", line 71, in data_version_string
    return self.contents[DATA_VERSION]
KeyError: 'data_version'

Please help me with this

Thanks,
Tulasi

wwood · 2022-11-30T01:07:41Z

Hi,

Thanks for giving it a go.

For (1), conda activate does not add the singlem bin directory to PATH (unless you e.g. create a file in etc/conda/activate.d of the conda env directory which does this). This issue will go away once there is a proper bioconda package.
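
A minimal sketch of the activate.d approach, assuming a POSIX-style shell such as bash. The helper name write_activate_hook and the file name singlem_path.sh are made up for illustration; the only fixed part is the etc/conda/activate.d directory inside the conda env, whose *.sh files are sourced on activation.

```shell
# Sketch: write a conda activation hook that prepends a directory to PATH.
# write_activate_hook is a hypothetical helper, not part of conda or SingleM.
write_activate_hook() {
    env_prefix="$1"    # conda env directory, e.g. "$CONDA_PREFIX"
    singlem_bin="$2"   # directory that contains the singlem script
    hook_dir="$env_prefix/etc/conda/activate.d"
    mkdir -p "$hook_dir"
    # $PATH stays literal here so it is expanded when the hook is sourced,
    # not when the hook file is written.
    printf 'export PATH="%s:$PATH"\n' "$singlem_bin" > "$hook_dir/singlem_path.sh"
}

# Usage (run once, with the environment active, from the singlem checkout):
#   write_activate_hook "$CONDA_PREFIX" "$PWD/bin"
```

This hook only prepends to PATH on activation; nothing in it removes the entry again on conda deactivate.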

For (2) this bug is fixed in the main branch and 1.0.0beta3, which I just pushed. Basically you need to specify

export SINGLEM_METAPACKAGE_PATH='/tmp/dbs/S3.0.5.metapackage20220806.smpkg.zb'

i.e. leave off the /payload_directory bit. You don't need to redownload the data; the previous version was just telling you the wrong export of SINGLEM_METAPACKAGE_PATH.
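
If the old value is already in ~/.bashrc, the correction can be sketched as a small shell helper. normalize_metapackage_path is a hypothetical name, not something SingleM provides; it only strips a trailing /payload_directory and leaves other paths unchanged.

```shell
# Sketch: drop a trailing /payload_directory from a metapackage path.
# normalize_metapackage_path is a hypothetical helper, not part of SingleM.
normalize_metapackage_path() {
    path="${1%/}"                               # remove one trailing slash, if any
    printf '%s\n' "${path%/payload_directory}"  # remove the suffix, if present
}

# Example with the path from this thread:
normalize_metapackage_path '/tmp/dbs/S3.0.5.metapackage20220806.smpkg.zb/payload_directory'
# prints /tmp/dbs/S3.0.5.metapackage20220806.smpkg.zb

# Possible use in the current shell:
#   export SINGLEM_METAPACKAGE_PATH="$(normalize_metapackage_path "$SINGLEM_METAPACKAGE_PATH")"
```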

HTH, ben

thulasis · 2022-11-30T17:08:01Z

Hi Ben,

I fixed problem (1) manually by transferring the files into the conda environment, and it worked.
The second one is also working now after the fix.

BTW I am running this on nanopore reads, using the "singlem pipe --sequences" option. I guess I am doing it right. If not, please let me know.

Thanks,
Tulasi

wwood · 2022-11-30T23:37:17Z

Glad the first 2 issues went away.

I've not really tested singlem on nanopore datasets, so your mileage may vary. I suspect it might work OK, but I am keen to see if e.g. the profiles from nanopore roughly match the profiles from Illumina sequencing. There are certainly a number of improvements that I can think of that might be suitable.

One thing you might want to try is pipe --hmmsearch-package-assignment. Nanopore reads will be long enough to break the default assumption that a read only encodes the window from at most 1 gene - that flag removes that assumption at a small cost to runtime.
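
A minimal sketch of such a run, wrapped in a shell function. The wrapper name run_pipe_long_reads and the file names are placeholders, and --otu-table is assumed here as the output option since it is not mentioned elsewhere in this thread; check singlem pipe -h for the installed version.

```shell
# Sketch: run singlem pipe on long reads with --hmmsearch-package-assignment.
# run_pipe_long_reads is a hypothetical wrapper; file names are placeholders.
run_pipe_long_reads() {
    reads="$1"       # input reads, e.g. barcode01.fastq.gz
    otu_table="$2"   # path for the output OTU table (assumed --otu-table option)
    singlem pipe \
        --sequences "$reads" \
        --hmmsearch-package-assignment \
        --otu-table "$otu_table"
}

# Usage:
#   run_pipe_long_reads barcode01.fastq.gz barcode01.otu_table.tsv
```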

Let me know how you go, if you don't mind?

Thanks,
ben

thulasis · 2022-12-01T18:28:04Z

Hi Ben,

Thanks for the suggestions. I already tested these reads on EPI2ME. You are correct, the mileage is low. I tried pipe --hmmsearch-package-assignment, but there was not much improvement in the classification of OTUs.

I get these messages on screen when running with the hmmsearch option:

12/01/2022 11:32:37 AM INFO: SingleM v1.0.0beta2
12/01/2022 11:32:37 AM INFO: Retrieval successful. Location of backpack is: /tmp/dbs/S3.0.5.metapackage20220806.smpkg.zb
12/01/2022 11:32:37 AM INFO: Loaded 59 SingleM packages
12/01/2022 11:32:37 AM INFO: Using as input 1 different sequence files e.g. barcode01.fastq.gz
12/01/2022 11:32:37 AM INFO: Filtering sequence files through DIAMOND blastx
12/01/2022 11:55:01 AM INFO: Finished DIAMOND prefilter phase
12/01/2022 11:55:01 AM INFO: Assigning sequences to SingleM packages with HMMSEARCH ..
12/01/2022 11:55:01 AM INFO: Searching with 59 SingleM package(s)
12/01/2022 11:55:01 AM INFO: Searching for reads matching 77 different protein HMM(s)
12/01/2022 11:55:15 AM INFO: Finished search phase
12/01/2022 11:55:15 AM INFO: Running separate alignments in GraftM..
12/01/2022 11:56:48 AM INFO: Finished extracting aligned sequences
12/01/2022 11:56:48 AM INFO: Running taxonomic assignment ..
12/01/2022 11:56:48 AM INFO: Assigning taxonomy by singlem query ..

Do you think it changed anything compared with the default run options?
Of course, the sequence data is not so cool, as it is 99% host DNA.

Thanks,
Tulasi

wwood · 2022-12-01T21:19:44Z

Hi,

That output looks right, though I cannot tell from it how many reads are being picked up. Of course, if there are near-zero microbial reads then community profiling isn't really possible. Thanks for keeping me up to date.

When you say "not much improvement in the classification of OTUs", what does that mean? Not good taxonomic assignment, or not a good number of reads included?

Thanks, ben

thulasis · 2022-12-01T21:33:18Z

Hi Ben,

What I meant to say is that with the default settings I got an OTU table with 90 data points, while with the hmmsearch option I got 77 data points in the csv file.

The taxonomic assignment is the same in both files and almost the same as in Nanopore's EPI2ME pipeline. As the sample has 99.9% host DNA reads, this was expected.
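
A small sketch of how the two data point counts can be compared, assuming each OTU table has exactly one header line and one data point per following line. count_data_rows and the file names in the usage comment are hypothetical.

```shell
# Sketch: count data rows in an OTU table, assuming exactly one header line.
# count_data_rows is a hypothetical helper; blank lines would also be counted.
count_data_rows() {
    awk 'NR > 1 { n++ } END { print n + 0 }' "$1"
}

# Usage (file names are placeholders):
#   count_data_rows barcode01.default.csv
#   count_data_rows barcode01.hmmsearch.csv
```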

Thanks,
Tulasi

wwood · 2022-12-02T04:47:16Z

Thanks for letting me know. I actually find those results a bit surprising. I would have thought you'd get more data points in the hmmsearch option.

Would you mind please running without the hmmsearch option but using --archive-otu-table and then sending me that file via email (perhaps zipped to save space) or otherwise please? Included in that will be the raw reads that ultimately go into the output CSV file, and will help me debug what seems to be going awry.

Thanks, ben

thulasis · 2022-12-02T16:49:28Z

Hi Ben,

Yeah sure. I am running it now. I have attached the result file here. Do you need me to send the initial raw reads as well?
We have 10 GB of data for each barcode. If you would like that as well, I will share it with you on our lamella cloud.

barcode01_archive.txt

Thanks,
Tulasi

wwood · 2022-12-02T22:04:33Z

Hi, just the archives are fine I reckon. Thanks

thulasis · 2022-12-06T16:40:48Z

Thanks Ben

aljazdzy · 2023-06-20T17:40:15Z

I had a similar issue AND I'm also running Nanopore reads so I am very interested in any potential outcomes that may have been resolved here.
