[bug] when loading mgf files without metadata key "scans" #249

@j-a-dietrich

Hi,

This is part of a function in the fileloading.py module. I believe there is a bug when reading MGF files that lack the metadata key 'scans': the peaks of one spectrum do not all receive the same scan number. The running number used as the scan number is overwritten by the inner loop, which causes the problem. I added comments next to the responsible code lines.

def _load_data_mgf(input_filename):
    file = load_from_mgf(input_filename)

    ms2mz_list = []
    for i, spectrum in enumerate(file):  # A: i is the spectrum index, so the scan number is correctly defined here
        if len(spectrum.peaks.mz) == 0:
            continue

        mz_list = list(spectrum.peaks.mz)
        i_list = list(spectrum.peaks.intensities)
        i_max = max(i_list)
        i_sum = sum(i_list)

        for i in range(len(mz_list)):  # B: the inner loop reuses i, so the spectrum index from A is overwritten
            if i_list[i] == 0:
                continue

            peak_dict = {}
            peak_dict["i"] = i_list[i]
            peak_dict["i_norm"] = i_list[i] / i_max
            peak_dict["i_tic_norm"] = i_list[i] / i_sum
            peak_dict["mz"] = mz_list[i]

            # Handling malformed mgf files
            try:
                peak_dict["scan"] = spectrum.metadata["scans"]  # this works correctly because it always reads from the same spectrum object
            except:
                peak_dict["scan"] = i + 1  # fallback scan number: it uses i from B (the peak index), but I believe it should use i from A (the spectrum index)

....
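
A possible fix, as a minimal sketch rather than a tested patch: give the outer loop variable its own name and resolve the scan number once per spectrum, before iterating over the peaks. The names build_peak_dicts and make_spectrum are hypothetical and only used for illustration; make_spectrum is a stand-in for the objects returned by load_from_mgf so the example runs without an MGF file. I am also assuming that spectrum.metadata behaves like a dict and raises KeyError when "scans" is missing.

```python
from types import SimpleNamespace


def build_peak_dicts(spectra):
    """Flatten spectra into per-peak dicts with one scan number per spectrum."""
    peaks = []
    for spectrum_index, spectrum in enumerate(spectra):
        if len(spectrum.peaks.mz) == 0:
            continue

        mz_list = list(spectrum.peaks.mz)
        i_list = list(spectrum.peaks.intensities)
        i_max = max(i_list)
        i_sum = sum(i_list)

        # Resolve the scan number once per spectrum. If "scans" is missing,
        # fall back to the 1-based position of the spectrum in the file.
        # Assumption: metadata is dict-like and raises KeyError.
        try:
            scan = spectrum.metadata["scans"]
        except KeyError:
            scan = spectrum_index + 1

        # Iterate over the peaks without reusing the outer loop variable.
        for mz, intensity in zip(mz_list, i_list):
            if intensity == 0:
                continue
            peaks.append({
                "i": intensity,
                "i_norm": intensity / i_max,
                "i_tic_norm": intensity / i_sum,
                "mz": mz,
                "scan": scan,
            })
    return peaks


def make_spectrum(mz, intensities, metadata=None):
    """Hypothetical stand-in for a loaded spectrum object (illustration only)."""
    return SimpleNamespace(
        peaks=SimpleNamespace(mz=mz, intensities=intensities),
        metadata=metadata or {},
    )


spectra = [
    make_spectrum([100.0, 200.0, 300.0], [10.0, 0.0, 5.0]),      # no "scans" key
    make_spectrum([150.0, 250.0], [1.0, 4.0], {"scans": "17"}),  # has "scans"
]
scans = [peak["scan"] for peak in build_peak_dicts(spectra)]
```

With the current code, the two non-zero peaks of the first spectrum would get the scan numbers 1 and 3 (peak index + 1). With the sketch above, both get scan number 1, and the peaks of the second spectrum keep the value from their metadata.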
