diff --git a/docs/data_conversion.md b/docs/data_conversion.md
index 542535e1..aeb71c15 100644
--- a/docs/data_conversion.md
+++ b/docs/data_conversion.md
@@ -1,27 +1,66 @@
 # Data handling (supported formats)
 
-mzmine supports both **open** (e.g., .mzML, .mzXML, .imzML, .netCDF) and **proprietary**
-formats from Bruker Daltonics (.d and .tdf/tsf). Raw data files from
-vendors must be converted into an open format prior to the import. **This conversion can be applied automatically
-during the import, if the user has MSConvert installed.**
-If you want to convert the files yourself, see the sections below.
-
-The **recommendations** for the data handling are the conversion of the raw data to centroided .mzML
-data files,
-**except** for timsTOF data (native .tdf and .tsf inside the Bruker .d folder), and the conversion of MS
-imaging data to .imzML, except for the timsTOF fleX MS imaging data.
+mzmine supports both **open** (e.g., .mzML, .mzXML, .imzML, .netCDF) and many **proprietary**
+vendor data formats. For many vendor formats it is recommended to use the original data files and
+to only apply conversion where needed. Data files can be imported by drag-and-dropping files into
+the mzmine graphical user interface or using the _import MS data_ module.
+
+**Supported raw data formats include:**
+- Bruker Daltonics (.d and .tdf/tsf)
+- Thermo Fisher (.raw)
+- Waters (.raw folders)
+- Agilent (.d)
+- Sciex (.wiff/.wiff2)
+- Shimadzu (.lcd)
+- MOBILion (.mbi)
+- mzML or mzXML/mzData/netCDF (prefer mzML if possible due to better metadata coverage)
+- imzML (MS imaging)
+
+!!! warning
+
+    Some vendor data formats are only supported on specific operating systems due to the limited
+    support by their respective data access libraries. All data formats are supported on Windows
+    and many on Linux (see full [compatibility list here](system_requirements.md#compatibility)).
+    Many data formats are unsupported on macOS, requiring data conversion to open formats, usually
+    on a Windows or Linux computer.
+
+## External dependencies
+
+Many data formats are supported without external dependencies, directly through the mzmine version.
+Other formats may require another third-party software to be downloaded and installed.
+
+**MSConvert:** This tool is provided by ProteoWizard and the default data converter for MS data.
+While our team is expanding the native data support for all major vendor formats, we recommend to
+install MSConvert for some formats. This will grant direct access to these files and mzmine will
+use MSConvert with some internal optimizations to load data files in the background. Formats that
+currently require MSConvert for direct support include:
+- Agilent (.d)
+- Sciex (.wiff/.wiff2)
+- Shimadzu (.lcd)
+- MOBILion (.mbi)
 
 ## Data conversion to open formats (.mzML / .imzML)
 
+When converting data, prefer the latest standard formats mzML (or imzML for MS imaging data).
+Other older open formats like mzData or mzXML may cover less metadata.
+It is **recommended** to convert raw data to centroided .mzML files.
+**Exceptions:** timsTOF native data as .tdf and .tsf inside the Bruker .d folder are best imported
+in their original format. This is also true for timsTOF fleX MS imaging data.
+
+**This conversion can be applied automatically during the import if the user has MSConvert installed.**
+If you want to convert the files yourself, see the sections below.
+
+
 ### MSConvert (ProteoWizard) to mzML
 
 !!! info
 
-    mzmine can use MSConvert automatically. Make sure to setup the MSConvert installation path in the mzmine preferences. (only supported on Windows)
+    mzmine can use MSConvert automatically. Make sure to setup the MSConvert installation path in
+    the mzmine preferences. (only supported on Windows)
 
 ![MSConvert_settings](MSConvert_settings.png)
 
-MSConvert supports the conversion of AB SCIEX, Agilent, Bruker, Shimadzu, Thermo Scientific,
+MSConvert supports the conversion of AB SCIEX, Agilent, Bruker, Shimadzu, Thermo Scientific, MOBILion,
 and [Waters](data_conversion.md#waters) raw data. More information about the formats can be found in the
 [ProteoWizard Documentation for Users](https://proteowizard.sourceforge.io/doc_users.html).
 Furthermore, profile data can be centroided to reduce the file size and memory consumption,
@@ -62,17 +101,19 @@ the [ProteoWizard documentation](https://proteowizard.sourceforge.io/tools/mscon
 ### ThermoRawFileParser
 
 It is used to convert ThermoFisher .raw files into .mgf, .mzML, .parquet. This converter is
-important if an
-internal calibrant was used (e.g., EASY-IC). This mass is excluded in the FreeStyle view, whereas
-MSConvert
-remains all signals in the mzML, including the calibrant. If those masses together with some flagged signals
-by Thermo, should be
-removed use this converter with the option --excludeExceptionData.
+important if an internal calibrant was used (e.g., EASY-IC). This mass is excluded in the FreeStyle
+view, whereas MSConvert remains all signals in the mzML, including the calibrant. If those masses
+together with some flagged signals by Thermo, should be removed use this converter with the option
+**--excludeExceptionData**.
 
 !!! Note
 
-    mzmine can use the ThermoRawFileParser automatically to import your data without conversion. In the preferences (CTRL+P)
-    set the "Thermo data import" to "Thermo raw file parser" instead of MSConvert. The raw file parser is supported on Mac, Linux, and Windows.
+    **mzmine 4.8** and higher supports Thermo Raw data directly, there is no need to install external
+    dependencies.
+
+    **Earlier mzmine versions** can use the ThermoRawFileParser automatically to import your data
+    without conversion. In the preferences (CTRL+P) set the "Thermo data import" to "Thermo raw
+    file parser" instead of MSConvert. The raw file parser is supported on Mac, Linux, and Windows.
 
 Example for command line interface with the exclusion of exception data:
 
@@ -105,6 +146,8 @@ how to use it can be found [here](https://github.com/elnurgar/mzxml-precursor-co
 
 ### Waters
 
+Direct Waters data support is currently in beta phase.
+
 Waters recently released a tool called **Waters data connect**, which allows conversion of DDA, DIA,
 and HD-DDA data to mzML. Lock mass correction is applied during the conversion. We also recommend to
 enable centroiding (2D peak picking).
diff --git a/docs/system_requirements.md b/docs/system_requirements.md
index ca9b783b..dd6e3bc8 100644
--- a/docs/system_requirements.md
+++ b/docs/system_requirements.md
@@ -4,7 +4,7 @@
 Installation of mzmine is described on the [getting started](getting_started.md#install-update) page.
 
 mzmine is available as an installable package or a portable version. The portable version does not
-require administator rights to be run, making it useful for university students without elevated
+require administrator rights to be run, making it useful for users without elevated
 permissions.
 
 ## Hardware requirements
@@ -38,16 +38,12 @@ permissions.
-- Up-to-date operating system, e.g., Windows 10 or newer, recent Linux or MacOS (academic only) versions
-- mzmine does not require a dedicated Java installation, even though it is a Java software. All
-requirements are shipped with mzmine
+- Up-to-date operating system, e.g., Windows 10 or newer, recent Linux or MacOS (academic only) versions.
+- mzmine does not require a dedicated Java installation, as it is a self-contained Java software with its own Java Virtual Machine. All
+requirements are shipped with mzmine.
 - Microsoft Visual Studio C++ Redist for Bruker raw data import [download page](https://learn.microsoft.com/de-de/cpp/windows/latest-supported-vc-redist?view=msvc-170)
-- MSConvert (on Windows) for native Sciex, Waters, Shimadzu, MOBILion, Thermo data
-  support [download page](https://proteowizard.sourceforge.io/download.html)
-    - Thermo alternative: ThermoRawFileParser for native Thermo support on Windows, Mac, and
-      Linux [download page](https://github.com/pluskal-lab/ThermoRawFileParserMacLinux/releases)
-    - ThermoRawFileParser does not need to be installed but only downloaded and imported via the
-      mzmine preferences
+- MSConvert (on Windows) for native Agilent, Sciex, Waters, Shimadzu, and MOBILion data support [download page](https://proteowizard.sourceforge.io/download.html)
+
 
 ## Internet connection
 
@@ -65,3 +61,32 @@ for spectral networking using MS2Deepscore and DReaMS, an internet connection is
 - https://zenodo.org/ spectral libraries
 - https://external.gnps2.org/gnpslibrary spectral libraries
+
+## Operating system compatibility {#compatibility}
+
+### Windows
+
+Currently, all modules are compatible with Microsoft Windows 10 and higher.
+
+Some libraries for the raw data support for vendor-specific formats are only available for Windows.
+Read more about data support and [data conversion](data_conversion.md).
+
+### Linux
+
+Some libraries for the raw data support for vendor-specific formats are only available for Windows.
+
+The Linux version **supports** raw data formats from:
+- **Thermo**, **Bruker**, **Waters**
+
+Data from other Vendors may need to be **converted** to the open .mzML format before, including:
+- **Agilent**, **Sciex**, **Shimadzu**, **MOBILion**
+
+### macOS
+
+Some libraries for the raw data support for vendor-specific formats are only available for Windows and Linux.
+
+The macOS version **supports** raw data formats from:
+- Thermo
+
+Data from other Vendors may need to be **converted** to the open .mzML format before, including:
+- Agilent, Sciex, Shimadzu, MOBILion, Bruker, Waters
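
Reviewer note: the operating system compatibility lists added in the last hunk can be read as a small lookup table. The Python sketch below is not part of this patch or of mzmine. The names `VENDORS`, `NATIVE_SUPPORT`, and `needs_conversion` are hypothetical, and the table only transcribes the vendors named in the added text (all vendors on Windows; Thermo, Bruker, and Waters on Linux; Thermo on macOS). It ignores the MSConvert requirement and the beta status of direct Waters support.

```python
# Illustrative only: vendor support per operating system, transcribed from the
# "Operating system compatibility" section added in this diff.
# All names below are hypothetical and do not exist in mzmine.
VENDORS = ("bruker", "thermo", "waters", "agilent", "sciex", "shimadzu", "mobilion")

NATIVE_SUPPORT = {
    "windows": set(VENDORS),                  # "All data formats are supported on Windows"
    "linux": {"thermo", "bruker", "waters"},  # listed under "### Linux"
    "macos": {"thermo"},                      # listed under "### macOS"
}


def needs_conversion(vendor: str, operating_system: str) -> bool:
    """Return True if the vendor's raw data should be converted to mzML first."""
    vendor_key = vendor.strip().lower()
    os_key = operating_system.strip().lower()
    if vendor_key not in VENDORS:
        raise ValueError(f"unknown vendor: {vendor!r}")
    if os_key not in NATIVE_SUPPORT:
        raise ValueError(f"unknown operating system: {operating_system!r}")
    return vendor_key not in NATIVE_SUPPORT[os_key]


if __name__ == "__main__":
    # Print, per operating system, which vendors would need conversion.
    for os_name in NATIVE_SUPPORT:
        to_convert = [v for v in VENDORS if needs_conversion(v, os_name)]
        print(os_name, "->", to_convert or "nothing to convert")
```

A table like this makes it easy to spot disagreements between the two files touched by the patch, for example whether Waters is treated as natively supported or as requiring MSConvert.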