Casanovo-DB potential performance improvements #402

Open
bittremieux opened this issue Nov 18, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@bittremieux
Collaborator

  • Emit a warning when unsupported PTMs are used (e.g. when C carbamidomethylation is not specified as a fixed modification) [easy]
  • Multithreading during FASTA digestion and m/z calculation [easy]
  • Refactor candidate selection so that it does not search from scratch for every candidate but uses some sort of looping index [moderate (within batches) to hard (across batches, which would require modifications to the data loader)]
  • mzTab export should report protein database information [easy]
  • Spectrum progress bar should show more granular updates [moderate]
  • Remove the superfluous m/z calculation for predicted peptides (it can be derived from the database) [moderate]
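
The "looping index" idea for candidate selection could look roughly like the sketch below. It assumes that both the query precursor values and the database peptide values are available as ascending sorted sequences and that a single ppm tolerance applies; isotope errors, charge handling, and batching are ignored. The function name `iter_candidates` and its parameters are hypothetical and not part of Casanovo.

```python
from typing import Iterator, Sequence, Tuple


def iter_candidates(
    query_mz: Sequence[float],
    peptide_mz: Sequence[float],
    tol_ppm: float,
) -> Iterator[Tuple[int, int, int]]:
    """Yield (query_index, start, stop) for each query value.

    Both inputs must be sorted in ascending order. The candidates for
    query i are peptide_mz[start:stop]. The two indices only ever move
    forward, so all queries together cost O(len(query_mz) + len(peptide_mz))
    instead of one fresh search per query.
    """
    lo = hi = 0
    n = len(peptide_mz)
    for i, mz in enumerate(query_mz):
        tol = mz * tol_ppm * 1e-6
        # Advance the lower index past peptides that are too light.
        while lo < n and peptide_mz[lo] < mz - tol:
            lo += 1
        # The upper index can never be behind the lower index.
        hi = max(hi, lo)
        # Advance the upper index over peptides still within tolerance.
        while hi < n and peptide_mz[hi] <= mz + tol:
            hi += 1
        yield i, lo, hi


# Toy values, not real data.
peptides = [499.0, 499.999, 500.0005, 500.01, 799.99, 800.0]
queries = [500.0, 500.001, 800.0]
windows = list(iter_candidates(queries, peptides, tol_ppm=10.0))
```

Because a ppm tolerance grows with the query value, both window bounds increase monotonically across sorted queries, which is what allows the indices to be reused. Carrying the indices across batches would additionally require the data loader to deliver spectra in sorted order.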

Open-ended evaluations:

  • Profile to understand where the runtime and memory consumption bottlenecks are; PSM batch creation likely contributes. Candidate retrieval can probably be optimized using a sliding window approach.
  • Investigate whether _calc_match_score can be harmonized between de novo and DB modes.
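
As a starting point for the profiling item, a minimal sketch using only the standard library (`cProfile` for runtime, `tracemalloc` for peak Python memory) is given below. The helper `profile_call` and the stand-in function `build_batches` are hypothetical; in practice the function under test would be the actual PSM batch creation code, which is not reproduced here.

```python
import cProfile
import io
import pstats
import tracemalloc


def profile_call(func, *args, top=5, **kwargs):
    """Run func once and return (result, runtime_report, peak_bytes)."""
    tracemalloc.start()
    profiler = cProfile.Profile()
    try:
        result = profiler.runcall(func, *args, **kwargs)
        # Peak size of memory blocks traced since tracemalloc.start().
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the functions with the largest cumulative time first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue(), peak


def build_batches(n_spectra, n_candidates):
    """Hypothetical stand-in for PSM batch creation."""
    return [[(s, c) for c in range(n_candidates)] for s in range(n_spectra)]


batches, report, peak_bytes = profile_call(build_batches, 10, 20)
print(report)
print(f"peak traced memory: {peak_bytes} bytes")
```

Note that `tracemalloc` only sees allocations made through Python's allocator, so GPU memory or memory held by native extensions would need a separate tool.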
@bittremieux bittremieux added the enhancement New feature or request label Nov 18, 2024
@bittremieux bittremieux added this to the Casanovo v5.0.0 milestone Nov 18, 2024