singlem incorrectly parse sample name with paired-end data #151

Open
fplazaonate opened this issue Nov 20, 2023 · 5 comments

@fplazaonate

Hi @wwood ,

Many thanks for developing singlem. This is a great tool that deserves more attention.

It seems singlem incorrectly parses the sample name with paired-end data, as it just removes the file extension:

class FastaNameToSampleName:

Could you fix this?

Best,
Florian

@wwood
Owner

wwood commented Nov 20, 2023

Hi,

Thanks for the kind words.

Can you be a bit more specific? Do you mean it doesn't remove the read suffix, e.g. the .1 or _1 bit?

@fplazaonate
Author

Yes, that's it.
In the output file, the sample name is 'sample_1' instead of 'sample'
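
To make the symptom concrete, here is a minimal sketch of extension-only stripping. It is not singlem's actual `FastaNameToSampleName` code; the helper `strip_extensions` and the extension list are assumptions made up for illustration.

```python
import os

# Hypothetical helper, NOT singlem's implementation: drop compression and
# sequence-file extensions from the basename and use the rest as sample name.
KNOWN_EXTENSIONS = (".gz", ".bz2", ".fastq", ".fq", ".fasta", ".fa", ".fna")


def strip_extensions(path):
    name = os.path.basename(path)
    stripped = True
    # Keep peeling extensions until none of the known ones match.
    while stripped:
        stripped = False
        for ext in KNOWN_EXTENSIONS:
            if name.endswith(ext):
                name = name[: -len(ext)]
                stripped = True
    return name


# The read-number suffix survives, so each mate yields a different name.
print(strip_extensions("reads/sample_1.fastq.gz"))  # sample_1
print(strip_extensions("reads/sample_2.fastq.gz"))  # sample_2
```

With this kind of logic the forward file of a pair gives 'sample_1', which matches what I see in the output.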

@wwood
Owner

wwood commented Nov 20, 2023

Ah right. I made the decision not to wade into parsing the different possibilities there. Is there some general solution?

@fplazaonate
Author

fplazaonate commented Nov 20, 2023

You can add an option where the user explicitly provides the sample name.
The alternative is to find a shared substring between the forward and reverse files.
EDIT: the first option is probably the best as the user may provide several fastq files from different sequencing runs.
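
A rough sketch of how the two options could fit together, assuming the shared substring is taken as the common leading prefix of the two basenames. The function `paired_sample_name` and its `sample_name` parameter are hypothetical and do not correspond to any existing singlem option.

```python
import os
import re


def paired_sample_name(forward, reverse, sample_name=None):
    """Guess a sample name for a read pair (illustrative heuristic only)."""
    # Option 1: a name given explicitly by the user always wins.
    if sample_name is not None:
        return sample_name

    # Option 2: longest common leading substring of the two basenames.
    fwd = os.path.basename(forward)
    rev = os.path.basename(reverse)
    prefix = os.path.commonprefix([fwd, rev])

    # Drop a dangling separator (and optional "R") left before the read
    # number, e.g. "sample_" or "sample_R" both become "sample".
    name = re.sub(r"[._-]R?$", "", prefix)
    if not name:
        raise ValueError("no shared prefix; please provide a sample name")
    return name


print(paired_sample_name("sample_1.fastq.gz", "sample_2.fastq.gz"))  # sample
print(paired_sample_name("sample_R1.fq.gz", "sample_R2.fq.gz"))      # sample
print(paired_sample_name("runA_1.fq.gz", "runB_2.fq.gz", sample_name="sample"))  # sample
```

The prefix heuristic breaks down when files from different runs share little or nothing at the start of their names, which is why the explicit name seems the safer choice.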

@adityabandla

I'm looking for the same feature, as I have samples sequenced across multiple runs.
