OpenMS
Spectrum preprocessing and theoretical-vs-experimental peak alignment helpers used by the OpenPepXL cross-link search engines.
#include <OpenMS/ANALYSIS/XLMS/OPXLSpectrumProcessingAlgorithms.h>
Static Public Member Functions | |
| static PeakSpectrum | mergeAnnotatedSpectra (PeakSpectrum &first_spectrum, PeakSpectrum &second_spectrum) |
| Merge two annotated spectra into one peak list, preserving paired DataArrays. | |
| static PeakMap | preprocessSpectra (PeakMap &exp, double fragment_mass_tolerance, bool fragment_mass_tolerance_unit_ppm, Size peptide_min_size, Int min_precursor_charge, Int max_precursor_charge, bool deisotope, bool labeled) |
| Preprocess an MSExperiment for cross-link search and return the surviving MS2 spectra. | |
| static void | getSpectrumAlignmentFastCharge (std::vector< std::pair< Size, Size > > &alignment, double fragment_mass_tolerance, bool fragment_mass_tolerance_unit_ppm, const PeakSpectrum &theo_spectrum, const PeakSpectrum &exp_spectrum, const DataArrays::IntegerDataArray &theo_charges, const DataArrays::IntegerDataArray &exp_charges, DataArrays::FloatDataArray &ppm_error_array, double intensity_cutoff=0.0) |
| Align a theoretical and an experimental fragment spectrum using charge annotations and an intensity-ratio cut-off. | |
| static void | getSpectrumAlignmentSimple (std::vector< std::pair< Size, Size > > &alignment, double fragment_mass_tolerance, bool fragment_mass_tolerance_unit_ppm, const std::vector< SimpleTSGXLMS::SimplePeak > &theo_spectrum, const PeakSpectrum &exp_spectrum, const DataArrays::IntegerDataArray &exp_charges) |
| Align a SimplePeak-based theoretical spectrum to an experimental spectrum using charge annotations only. | |
Spectrum preprocessing and theoretical-vs-experimental peak alignment helpers used by the OpenPepXL cross-link search engines.
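Static utilities (the class carries no state) that the cross-link identification workflows in OpenPepXL build on: spectrum merging (mergeAnnotatedSpectra), preprocessing (preprocessSpectra), and theoretical-vs-experimental peak alignment (getSpectrumAlignmentFastCharge, getSpectrumAlignmentSimple).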
static void getSpectrumAlignmentFastCharge()
Align a theoretical and an experimental fragment spectrum using charge annotations and an intensity-ratio cut-off.
For each theoretical peak, the closest experimental peak inside the mass-tolerance window is picked, restricted to peaks whose charge and intensity also pass the per-peak filters.
Tolerance window half-width: theo_mz * fragment_mass_tolerance * 1e-6 when fragment_mass_tolerance_unit_ppm is true, else fragment_mass_tolerance Da.
Charge filter: a pair (theoretical charge tz, experimental charge ez) matches when tz == ez or either side is 0 (treated as "unknown"). If theo_charges or exp_charges is empty, the charge filter degrades to permissive.
Intensity filter: a pair (theoretical intensity ti, experimental intensity ei) matches when min(ti,ei) / max(ti,ei) > intensity_cutoff. Pass 0 to disable.
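The three per-peak rules above can be written as small predicates. This is a minimal standalone sketch, not the OpenMS implementation: the function names are hypothetical, and the handling of two zero-intensity peaks is an assumption made here to avoid a division by zero.

```cpp
#include <algorithm>
#include <cassert>

// Half-width of the tolerance window around a theoretical m/z.
// ppm mode scales with the m/z value; Da mode is a constant.
double toleranceHalfWidth(double theo_mz, double fragment_mass_tolerance, bool unit_ppm)
{
  return unit_ppm ? theo_mz * fragment_mass_tolerance * 1e-6 : fragment_mass_tolerance;
}

// Charges match when they are equal or when either side is 0 ("unknown").
bool chargesCompatible(int tz, int ez)
{
  return tz == ez || tz == 0 || ez == 0;
}

// The smaller-over-larger intensity ratio must be strictly above the cut-off.
bool intensityPasses(double ti, double ei, double intensity_cutoff)
{
  assert(ti >= 0.0 && ei >= 0.0);
  const double larger = std::max(ti, ei);
  if (larger == 0.0)
  {
    return false; // assumption of this sketch: two zero-intensity peaks never match
  }
  return std::min(ti, ei) / larger > intensity_cutoff;
}
```

At 20 ppm a theoretical peak at m/z 1000 gets a half-width of about 0.02, whereas a Da tolerance gives the same window at every m/z.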
Both spectra must be sorted by m/z; alignment and ppm_error_array must be empty on entry (precondition). When either spectrum is empty, the function returns with no output.
| [out] | alignment | Receives (theo-index, exp-index) match pairs. Must be empty on entry. |
| [in] | fragment_mass_tolerance | Tolerance window half-width. |
| [in] | fragment_mass_tolerance_unit_ppm | Interpret fragment_mass_tolerance as ppm (true) or Da (false). |
| [in] | theo_spectrum | Theoretical spectrum (sorted by m/z). |
| [in] | exp_spectrum | Experimental spectrum (sorted by m/z). |
| [in] | theo_charges | Per-peak charges for theo_spectrum; an empty array disables the charge filter. |
| [in] | exp_charges | Per-peak charges for exp_spectrum; an empty array disables the charge filter. |
| [out] | ppm_error_array | Receives per-match ppm errors (exp_mz - theo_mz) / theo_mz * 1e6. Must be empty on entry. |
| [in] | intensity_cutoff | Minimum smaller-over-larger intensity ratio for a match; 0 disables the intensity filter. |
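To illustrate the documented contract (closest in-window experimental peak per theoretical peak, both filters applied, ppm error recorded per match), here is a simplified standalone sketch. It is not the OpenMS implementation: `Peak` and `alignSketch` are hypothetical stand-ins, charges are stored in the peak struct rather than in separate DataArrays, and the sketch does not stop one experimental peak from being matched to several theoretical peaks, which the real function may handle differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in peak type for this sketch (not an OpenMS class).
struct Peak { double mz; double intensity; int charge; };

void alignSketch(std::vector<std::pair<std::size_t, std::size_t>>& alignment,
                 std::vector<double>& ppm_errors,
                 const std::vector<Peak>& theo_peaks,  // sorted by m/z
                 const std::vector<Peak>& exp_peaks,   // sorted by m/z
                 double tol, bool unit_ppm, double intensity_cutoff = 0.0)
{
  assert(alignment.empty() && ppm_errors.empty()); // documented precondition
  if (theo_peaks.empty() || exp_peaks.empty()) return;

  for (std::size_t t = 0; t < theo_peaks.size(); ++t)
  {
    const Peak& tp = theo_peaks[t];
    const double half = unit_ppm ? tp.mz * tol * 1e-6 : tol;
    // Binary search for the first experimental peak at or above the lower window edge.
    auto it = std::lower_bound(exp_peaks.begin(), exp_peaks.end(), tp.mz - half,
                               [](const Peak& p, double mz) { return p.mz < mz; });
    bool found = false;
    std::size_t best = 0;
    double best_dist = 0.0;
    for (; it != exp_peaks.end() && it->mz <= tp.mz + half; ++it)
    {
      const bool charge_ok = tp.charge == it->charge || tp.charge == 0 || it->charge == 0;
      const double larger = std::max(tp.intensity, it->intensity);
      const bool intensity_ok =
          larger > 0.0 && std::min(tp.intensity, it->intensity) / larger > intensity_cutoff;
      const double dist = std::fabs(it->mz - tp.mz);
      if (charge_ok && intensity_ok && (!found || dist < best_dist))
      {
        found = true;
        best = static_cast<std::size_t>(it - exp_peaks.begin());
        best_dist = dist;
      }
    }
    if (found)
    {
      alignment.emplace_back(t, best);
      ppm_errors.push_back((exp_peaks[best].mz - tp.mz) / tp.mz * 1e6);
    }
  }
}
```

The two output vectors grow in lockstep, so entry i of the error array always belongs to match i of the alignment.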
static void getSpectrumAlignmentSimple()
Align a SimplePeak-based theoretical spectrum to an experimental spectrum using charge annotations only.
Mirrors getSpectrumAlignmentFastCharge, but without the intensity-ratio filter and without ppm-error output. The theoretical side is a std::vector<SimpleTSGXLMS::SimplePeak>, so theoretical charges are carried per peak inside the SimplePeak struct rather than in a separate DataArray. alignment is cleared on entry, so it does not need to start empty.
Charge filter rule is identical to the fast-charge variant: (theo_charge == exp_charge or either side is 0) matches. An empty exp_charges array makes the charge filter permissive.
| [out] | alignment | Receives (theo-index, exp-index) match pairs. Cleared on entry. |
| [in] | fragment_mass_tolerance | Tolerance window half-width. |
| [in] | fragment_mass_tolerance_unit_ppm | Interpret fragment_mass_tolerance as ppm (true) or Da (false). |
| [in] | theo_spectrum | Theoretical spectrum (vector of SimplePeak; per-peak m/z and charge). |
| [in] | exp_spectrum | Experimental spectrum (sorted by m/z). |
| [in] | exp_charges | Per-peak charges for exp_spectrum; an empty array disables the charge filter. |
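The differences from the fast-charge variant (charges inside the theoretical peak struct, no intensity filter, output cleared on entry) can be sketched as follows. This is a standalone illustration, not the OpenMS code: `SimplePeakSketch` is a hypothetical stand-in for SimpleTSGXLMS::SimplePeak, the experimental spectrum is reduced to a vector of m/z values, and the inner loop is a plain linear scan chosen for brevity.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for SimpleTSGXLMS::SimplePeak: m/z and charge travel together.
struct SimplePeakSketch { double mz; int charge; };

void alignSimpleSketch(std::vector<std::pair<std::size_t, std::size_t>>& alignment,
                       double tol, bool unit_ppm,
                       const std::vector<SimplePeakSketch>& theo_peaks,
                       const std::vector<double>& exp_mz,
                       const std::vector<int>& exp_charges) // empty => permissive filter
{
  alignment.clear(); // cleared on entry: stale pairs from a previous call are dropped

  for (std::size_t t = 0; t < theo_peaks.size(); ++t)
  {
    const double half = unit_ppm ? theo_peaks[t].mz * tol * 1e-6 : tol;
    bool found = false;
    std::size_t best = 0;
    double best_dist = 0.0;
    for (std::size_t e = 0; e < exp_mz.size(); ++e)
    {
      const double dist = std::fabs(exp_mz[e] - theo_peaks[t].mz);
      if (dist > half) continue;
      // A missing charge array is treated like charge 0 ("unknown") for every peak.
      const int ez = exp_charges.empty() ? 0 : exp_charges[e];
      const int tz = theo_peaks[t].charge;
      const bool charge_ok = tz == ez || tz == 0 || ez == 0;
      if (charge_ok && (!found || dist < best_dist))
      {
        found = true;
        best = e;
        best_dist = dist;
      }
    }
    if (found) alignment.emplace_back(t, best);
  }
}
```

Because the output is cleared first, the same alignment vector can be reused across many candidate spectra without manual resets.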
static PeakSpectrum mergeAnnotatedSpectra()
Merge two annotated spectra into one peak list, preserving paired DataArrays.
Peaks of first_spectrum and second_spectrum are concatenated and the result is sorted by m/z. For each kind of DataArray (Float / String / Integer) the i-th array of first_spectrum is paired with the i-th array of second_spectrum and their contents are concatenated; the output array inherits its name from first_spectrum's i-th array. Extra arrays present only in second_spectrum are dropped — pairing is positional, not by name.
Despite the non-const references in the signature, neither input is modified.
| [in,out] | first_spectrum | Spectrum whose DataArray names and ordering define the output. |
| [in,out] | second_spectrum | Spectrum whose peaks and (positionally paired) DataArrays are appended. |
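The positional pairing rule can be illustrated with a small standalone sketch. The types `NamedArray` and `SpectrumSketch` and the function `mergeSketch` are hypothetical stand-ins, only one kind of array (integer) is modelled, and two behaviours are assumptions of this sketch rather than documented facts: arrays are reordered together with the peaks when sorting by m/z, and an array of the first spectrum without a positional partner is kept unchanged.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

struct NamedArray { std::string name; std::vector<int> values; };
struct SpectrumSketch { std::vector<double> mz; std::vector<NamedArray> arrays; };

SpectrumSketch mergeSketch(const SpectrumSketch& first, const SpectrumSketch& second)
{
  // 1. Concatenate the peak lists.
  std::vector<double> mz = first.mz;
  mz.insert(mz.end(), second.mz.begin(), second.mz.end());

  // 2. Pair arrays by position; the name comes from the first spectrum.
  //    Arrays that exist only in `second` are never visited, hence dropped.
  std::vector<NamedArray> arrays;
  for (std::size_t i = 0; i < first.arrays.size(); ++i)
  {
    NamedArray out = first.arrays[i];
    if (i < second.arrays.size())
    {
      const std::vector<int>& extra = second.arrays[i].values;
      out.values.insert(out.values.end(), extra.begin(), extra.end());
    }
    arrays.push_back(out);
  }

  // 3. Sort by m/z through an index permutation so annotations follow their peaks.
  std::vector<std::size_t> order(mz.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(),
                   [&mz](std::size_t a, std::size_t b) { return mz[a] < mz[b]; });

  SpectrumSketch merged;
  for (std::size_t idx : order) merged.mz.push_back(mz[idx]);
  for (const NamedArray& a : arrays)
  {
    NamedArray sorted{a.name, {}};
    if (a.values.size() == order.size())
    {
      for (std::size_t idx : order) sorted.values.push_back(a.values[idx]);
    }
    else
    {
      sorted.values = a.values; // length mismatch: left in concatenation order
    }
    merged.arrays.push_back(sorted);
  }
  return merged;
}
```

Since pairing ignores names, callers that rely on a specific annotation should make sure both inputs store it at the same array index.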
static PeakMap preprocessSpectra()
Preprocess an MSExperiment for cross-link search and return the surviving MS2 spectra.
Preprocessing runs in two phases. First, the input exp is modified in place: zero-intensity peaks are removed (ThresholdMower), intensities are normalised, and the spectra are sorted by retention time. Second, the MS2 spectra are iterated in an OpenMP-parallelised loop, and those that pass the per-spectrum filters are copied into a freshly built PeakMap, which is returned. MS1 spectra are not copied to the output.
For unlabeled data (labeled is false) a spectrum is retained only if it has a single precursor whose charge lies in [min_precursor_charge, max_precursor_charge] and at least 2 * peptide_min_size peaks. Such spectra are further reduced by a WindowMower with hardcoded settings (window size 100, keep 20 peaks per window, "jump" mode). The final peak count must again exceed 2 * peptide_min_size.
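The "jump" windowing idea (split the m/z axis into consecutive, non-overlapping windows and keep only the most intense peaks of each) can be sketched as below. This is a standalone illustration, not the WindowMower code: `Peak` and `jumpWindowTopN` are hypothetical names, and anchoring the first window at the m/z of the first peak is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Peak { double mz; double intensity; };

// Keep at most `keep` most intense peaks per consecutive m/z window.
// Input must be sorted by m/z; output is sorted by m/z as well.
std::vector<Peak> jumpWindowTopN(const std::vector<Peak>& sorted_peaks,
                                 double window_size = 100.0, std::size_t keep = 20)
{
  std::vector<Peak> kept;
  if (sorted_peaks.empty()) return kept;

  const double first_mz = sorted_peaks.front().mz; // assumption: windows start here
  auto bin = [&](const Peak& p) { return static_cast<long>((p.mz - first_mz) / window_size); };

  std::size_t begin = 0;
  while (begin < sorted_peaks.size())
  {
    // Collect the run of peaks that falls into the same window.
    std::size_t end = begin;
    while (end < sorted_peaks.size() && bin(sorted_peaks[end]) == bin(sorted_peaks[begin])) ++end;

    std::vector<Peak> window(sorted_peaks.begin() + static_cast<std::ptrdiff_t>(begin),
                             sorted_peaks.begin() + static_cast<std::ptrdiff_t>(end));
    if (window.size() > keep)
    {
      const auto cut = window.begin() + static_cast<std::ptrdiff_t>(keep);
      // Move the `keep` most intense peaks to the front, drop the rest.
      std::partial_sort(window.begin(), cut, window.end(),
                        [](const Peak& a, const Peak& b) { return a.intensity > b.intensity; });
      window.erase(cut, window.end());
      // Restore m/z order inside the window.
      std::sort(window.begin(), window.end(),
                [](const Peak& a, const Peak& b) { return a.mz < b.mz; });
    }
    kept.insert(kept.end(), window.begin(), window.end());
    begin = end;
  }
  return kept;
}
```

With the defaults (window size 100, 20 peaks per window) a dense spectrum is capped at roughly 20 peaks per 100 m/z units, which is why the peak-count check is repeated after this step.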
For labeled data (labeled is true) the precursor and peak-count filters are bypassed, the WindowMower is not applied, and every MS2 spectrum of the input is present in the output — keeping spectrum indices stable across the heavy/light pairing performed downstream via consensusXML.
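The retention rule for the initial filter, including the labeled bypass, can be condensed into one predicate. This is a hedged sketch with a hypothetical function name and a flattened parameter list; it covers only the first peak-count check described above ("at least 2 * peptide_min_size"), not the later check after WindowMower or deisotoping.

```cpp
#include <cstddef>

// Decide whether an MS2 spectrum survives the initial filter.
bool passesInitialFilter(bool labeled,
                         std::size_t n_precursors, int precursor_charge,
                         std::size_t n_peaks, std::size_t peptide_min_size,
                         int min_precursor_charge, int max_precursor_charge)
{
  if (labeled)
  {
    return true; // labeled data: every MS2 spectrum is kept, indices stay stable
  }
  if (n_precursors != 1)
  {
    return false; // unlabeled data needs exactly one precursor
  }
  if (precursor_charge < min_precursor_charge || precursor_charge > max_precursor_charge)
  {
    return false; // charge outside [min_precursor_charge, max_precursor_charge]
  }
  return n_peaks >= 2 * peptide_min_size;
}
```

For example, with peptide_min_size = 5 and a charge range of 3 to 7, an unlabeled spectrum with one 3+ precursor and 10 peaks passes, while the same spectrum with 9 peaks does not.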
When deisotope is true, each surviving spectrum is run through Deisotoper::deisotopeAndSingleCharge with a simple averagine model, charge range [1,7], and isotopic-peak counts in [3,10]; charge and isotopic-peak counts are annotated and the monoisotopic intensity is summed. The deisotoped result is kept only if its peak count still exceeds 2 * peptide_min_size, or if labeled is true.
| [in,out] | exp | Input data (MS1 + MS2). Modified in place. |
| [in] | fragment_mass_tolerance | Peak mass tolerance used by deisotoping (ignored if deisotope is false). |
| [in] | fragment_mass_tolerance_unit_ppm | Interpret fragment_mass_tolerance as ppm (true) or Da (false). |
| [in] | peptide_min_size | Lower bound on peak count: spectra must have at least 2 * peptide_min_size peaks both before and after the WindowMower / Deisotoper step. |
| [in] | min_precursor_charge | Minimum allowed precursor charge for unlabeled data. |
| [in] | max_precursor_charge | Maximum allowed precursor charge for unlabeled data. |
| [in] | deisotope | If true, deisotope each surviving spectrum. |
| [in] | labeled | If true, bypass precursor/peak-count filters and the WindowMower, keeping every MS2 spectrum. |
Returns the freshly built PeakMap holding the MS2 spectra that survived preprocessing. For labeled inputs this contains every input MS2 spectrum.