additional support for ms2rescore refactoring #9
This pull request introduces parallel processing for spectrum parsing and feature extraction, adds new MS2 feature calculation capabilities, and improves Python interoperability for the `MS2Spectrum` struct. The changes focus on performance improvements and new functionality for extracting features from MS2 spectra.

Parallelization and Performance Improvements:
- Added the `rayon` crate and enabled the `parallelism` feature for `mzdata` to allow parallel processing of spectra in both `parse_mzdata.rs` and `parse_timsrust.rs`, resulting in faster reading and parsing of large spectrum files. [1] [2] [3] [4] [5] [6] [7] [8] [9]

New Feature Extraction Functionality:
- Added `ms2_features.rs` with the `batch_ms2_features_from_spectra` function, which computes a set of features (including intensity-based and sequence-based metrics, as well as an optional hyperscore) for batches of MS2 spectra in parallel. [1] [2] [3]
- Added `ms2pip_features.rs` with the `batch_ms2pip_features_numpy` function, which computes a set of features for the ms2pip feature generator in MS2rescore.

Python Interoperability and API Improvements:
- Extended the `MS2Spectrum` struct with a Python constructor (`__new__`) and pickling support (`__reduce__`), making it easier to create, serialize, and deserialize spectrum objects from Python. [1] [2]
- Updated `get_precursor_info` and `get_ms2_spectra` to accept the Python interpreter context and allow thread release for improved performance with Python bindings. [1] [2]

Dependency and Version Updates:
- Bumped the version to `0.5.0` and added new dependencies (`rayon`, `rustyms`, `ordered-float`, `numpy`) to support new features and parallelization. [1] [2]

API Additions:

These changes collectively improve the performance, usability, and feature set of the `ms2rescore-rs` library, especially for batch processing and integration with Python workflows.
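
For readers unfamiliar with the batch pattern described above, here is a minimal sketch of computing per-spectrum features in parallel while preserving input order. It is not the code in this pull request: the `Spectrum` and `Features` types and the `batch_features` function are hypothetical stand-ins, the three intensity features are chosen only for illustration, and `std::thread::scope` is used in place of `rayon` so that the snippet compiles without external crates.

```rust
use std::thread;

/// Hypothetical, simplified spectrum type (not the crate's `MS2Spectrum`).
#[derive(Debug, Clone)]
struct Spectrum {
    #[allow(dead_code)]
    mz: Vec<f64>,
    intensity: Vec<f64>,
}

/// Hypothetical per-spectrum features, for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Features {
    total_ion_current: f64,
    base_peak_intensity: f64,
    n_peaks: usize,
}

/// Computes simple intensity-based features for one spectrum.
fn features(s: &Spectrum) -> Features {
    Features {
        total_ion_current: s.intensity.iter().sum(),
        base_peak_intensity: s.intensity.iter().cloned().fold(0.0, f64::max),
        n_peaks: s.intensity.len(),
    }
}

/// Computes features for a batch by splitting it into contiguous chunks,
/// one per scoped thread. Results are joined in chunk order, so the output
/// order matches the input order.
fn batch_features(spectra: &[Spectrum], n_threads: usize) -> Vec<Features> {
    if spectra.is_empty() {
        return Vec::new();
    }
    let n_threads = n_threads.max(1);
    // Ceiling division so that every spectrum lands in some chunk.
    let chunk_size = (spectra.len() + n_threads - 1) / n_threads;

    thread::scope(|scope| {
        let handles: Vec<_> = spectra
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().map(features).collect::<Vec<_>>()))
            .collect();

        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Calling `batch_features(&spectra, 4)` returns one `Features` value per input spectrum. With `rayon`, the same shape is usually written as a parallel iterator over the slice, which also handles work distribution automatically.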