OpenMS  2.7.0
PeakPickerMaxima Class Reference

This class implements a fast peak-picking algorithm best suited for high resolution MS data (FT-ICR-MS, Orbitrap).

#include <OpenMS/TRANSFORMATIONS/RAW2PEAK/PeakPickerMaxima.h>

Classes

struct  PeakCandidate
 The PeakCandidate describes the output of the peak picker.
 

Public Member Functions

 PeakPickerMaxima (double signal_to_noise, double spacing_difference=1.5, double spacing_difference_gap=4.0, double sn_window_length=200, unsigned missing=2)
 Constructor.
 
virtual ~PeakPickerMaxima ()
 Destructor.
 
void findMaxima (const std::vector< double > &mz_array, const std::vector< double > &int_array, std::vector< PeakCandidate > &pc, bool check_spacings=true)
 Will find local maxima in raw data.
 
void pick (std::vector< double > &mz_array, std::vector< double > &int_array, std::vector< PeakCandidate > &pc, bool check_spacings=true)
 Will pick peaks in a spectrum.
 

Protected Attributes

double signal_to_noise_
 
double sn_window_length_
 
double spacing_difference_
 
double spacing_difference_gap_
 
unsigned missing_
 

Detailed Description

This class implements a fast peak-picking algorithm best suited for high resolution MS data (FT-ICR-MS, Orbitrap). In high resolution data, the signals of ions with similar mass-to-charge ratios (m/z) exhibit little or no overlapping and therefore allow for a clear separation. Furthermore, ion signals tend to show well-defined peak shapes with narrow peak width.

This peak-picking algorithm detects ion signals in raw data and reconstructs the corresponding peak shape by cubic spline interpolation. Signal detection depends on the signal-to-noise ratio, which is adjustable by the user (see parameter signal_to_noise). A picked peak's m/z and intensity values are given by the maximum of the underlying peak spline.

So far, this peak picker has mainly been tested on high resolution data. With appropriate preprocessing steps (e.g. noise reduction and baseline subtraction), it might also be applied to low resolution data.

Parameters of this class are:

Name | Type | Default | Restrictions | Description
signal_to_noise | float | 0.0 | min: 0.0 | Minimal signal-to-noise ratio for a peak to be picked (0.0 disables SNT estimation!)
spacing_difference_gap | float | 4.0 | min: 0.0 | The extension of a peak is stopped if the spacing between two subsequent data points exceeds 'spacing_difference_gap * min_spacing'. 'min_spacing' is the smaller of the two spacings from the peak apex to its two neighboring points. '0' to disable the constraint. Not applicable to chromatograms.
spacing_difference | float | 1.5 | min: 0.0 | Maximum allowed difference between points during peak extension, in multiples of the minimal difference between the peak apex and its two neighboring points. If this difference is exceeded a missing point is assumed (see parameter 'missing'). A higher value implies a less stringent peak definition, since individual signals within the peak are allowed to be further apart. '0' to disable the constraint. Not applicable to chromatograms.
missing | int | 1 | min: 0 | Maximum number of missing points allowed when extending a peak to the left or to the right. A missing data point occurs if the spacing between two subsequent data points exceeds 'spacing_difference * min_spacing'. 'min_spacing' is the smaller of the two spacings from the peak apex to its two neighboring points. Not applicable to chromatograms.
ms_levels | int list | [] | min: 1 | List of MS levels for which the peak picking is applied. If empty, auto mode is enabled, all peaks which aren't picked yet will get picked. Other scans are copied to the output without changes.
report_FWHM | string | false | true, false | Add metadata for FWHM (as floatDataArray named 'FWHM' or 'FWHM_ppm', depending on param 'report_FWHM_unit') for each picked peak.
report_FWHM_unit | string | relative | relative, absolute | Unit of FWHM. Either absolute in the unit of input, e.g. 'm/z' for spectra, or relative as ppm (only sensible for spectra, not chromatograms).
SignalToNoise:max_intensity | int | -1 | min: -1 | Maximal intensity considered for histogram construction. By default, it will be calculated automatically (see auto_mode). Only provide this parameter if you know what you are doing (and change 'auto_mode' to '-1')! All intensities EQUAL/ABOVE 'max_intensity' will be added to the LAST histogram bin. If you choose 'max_intensity' too small, the noise estimate might be too small as well. If chosen too big, the bins become quite large (which you could counter by increasing 'bin_count', which increases runtime). In general, the Median-S/N estimator is more robust to a manual max_intensity than the MeanIterative-S/N.
SignalToNoise:auto_max_stdev_factor | float | 3.0 | min: 0.0 max: 999.0 | Parameter for 'max_intensity' estimation (if 'auto_mode' == 0): mean + 'auto_max_stdev_factor' * stdev
SignalToNoise:auto_max_percentile | int | 95 | min: 0 max: 100 | Parameter for 'max_intensity' estimation (if 'auto_mode' == 1): auto_max_percentile th percentile
SignalToNoise:auto_mode | int | 0 | min: -1 max: 1 | Method to use to determine maximal intensity: -1 --> use 'max_intensity'; 0 --> 'auto_max_stdev_factor' method (default); 1 --> 'auto_max_percentile' method
SignalToNoise:win_len | float | 200.0 | min: 1.0 | Window length in Thomson
SignalToNoise:bin_count | int | 30 | min: 3 | Number of bins for intensity values
SignalToNoise:min_required_elements | int | 10 | min: 1 | Minimum number of elements required in a window (otherwise it is considered sparse)
SignalToNoise:noise_for_empty_window | float | 1.0e20 | | Noise value used for sparse windows
SignalToNoise:write_log_messages | string | true | true, false | Write out log messages in case of sparse windows or median in rightmost histogram bin
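
The interplay of 'spacing_difference', 'spacing_difference_gap' and 'missing' described in the table can be sketched as follows. This is a minimal illustration under stated assumptions, not the OpenMS implementation: the helper name extendRight is hypothetical, only the extension to the right of the apex is shown, missing points are counted cumulatively, and any intensity-based stopping criteria are deliberately left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of OpenMS): walks right from the apex and
// returns the index of the last data point that still satisfies the
// spacing rules described in the parameter table.
std::size_t extendRight(const std::vector<double>& mz, std::size_t apex,
                        double spacing_difference,
                        double spacing_difference_gap, unsigned missing)
{
  // The apex needs one neighbor on each side to define min_spacing.
  assert(apex > 0 && apex + 1 < mz.size());

  // min_spacing: the smaller of the two spacings from the apex to its neighbors.
  const double min_spacing =
      std::min(mz[apex] - mz[apex - 1], mz[apex + 1] - mz[apex]);

  unsigned missing_seen = 0;
  std::size_t last = apex;
  for (std::size_t i = apex + 1; i < mz.size(); ++i)
  {
    const double spacing = mz[i] - mz[i - 1];

    // A spacing above spacing_difference_gap * min_spacing ends the peak
    // (a value of 0 disables this constraint).
    if (spacing_difference_gap > 0.0 &&
        spacing > spacing_difference_gap * min_spacing)
    {
      break;
    }

    // A spacing above spacing_difference * min_spacing counts as one
    // missing point (a value of 0 disables this constraint).
    if (spacing_difference > 0.0 &&
        spacing > spacing_difference * min_spacing)
    {
      if (++missing_seen > missing) break;
    }
    last = i;
  }
  return last;
}
```

With the default-like values spacing_difference = 1.5 and spacing_difference_gap = 4.0, a larger 'missing' lets the peak extend across more irregularly spaced points, while a single spacing above the gap threshold always ends the extension.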

Note
The peaks must be sorted according to ascending m/z!
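
Because the input must be in ascending m/z order, it can help to verify this before calling the picker. The following is a small sketch using only the standard library; isAscending and sortByMz are hypothetical helper names, not OpenMS functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: true if the m/z values are in ascending order.
bool isAscending(const std::vector<double>& mz)
{
  return std::is_sorted(mz.begin(), mz.end());
}

// Hypothetical helper: reorders both arrays jointly so that m/z ascends
// and each intensity stays paired with its m/z value.
void sortByMz(std::vector<double>& mz, std::vector<double>& intensity)
{
  assert(mz.size() == intensity.size());

  // Sort a list of indices by m/z instead of sorting the values directly.
  std::vector<std::size_t> order(mz.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::sort(order.begin(), order.end(),
            [&mz](std::size_t a, std::size_t b) { return mz[a] < mz[b]; });

  // Apply the permutation to both arrays.
  std::vector<double> mz_sorted;
  std::vector<double> int_sorted;
  mz_sorted.reserve(mz.size());
  int_sorted.reserve(intensity.size());
  for (std::size_t idx : order)
  {
    mz_sorted.push_back(mz[idx]);
    int_sorted.push_back(intensity[idx]);
  }
  mz.swap(mz_sorted);
  intensity.swap(int_sorted);
}
```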

Class Documentation

◆ OpenMS::PeakPickerMaxima::PeakCandidate

struct OpenMS::PeakPickerMaxima::PeakCandidate

The PeakCandidate describes the output of the peak picker.

It contains the m/z and intensity value of the peak candidate.

It also contains the original index in the m/z axis where the peak was found as well as an estimate of its right and left boundary.

Class Members
Type | Name | Description
double | int_max | intensity value of the peak apex
int | left_boundary | index of the left boundary (relative to the input data)
double | mz_max | m/z value of the peak apex
int | pos | index of the peak apex (relative to the input data)
int | right_boundary | index of the right boundary (relative to the input data)

Constructor & Destructor Documentation

◆ PeakPickerMaxima()

PeakPickerMaxima ( double  signal_to_noise,
double  spacing_difference = 1.5,
double  spacing_difference_gap = 4.0,
double  sn_window_length = 200,
unsigned  missing = 2 
)

Constructor.

◆ ~PeakPickerMaxima()

virtual ~PeakPickerMaxima ( )
inline, virtual

Destructor.

Member Function Documentation

◆ findMaxima()

void findMaxima ( const std::vector< double > &  mz_array,
const std::vector< double > &  int_array,
std::vector< PeakCandidate > &  pc,
bool  check_spacings = true 
)

Will find local maxima in raw data.

Parameters
mz_array | The array containing m/z values
int_array | The array containing intensity values
pc | The resulting array containing the peak candidates
check_spacings | check spacing constraints? (recommended settings: yes for spectra, no for chromatograms)
Note
This function will directly report peak apices with right and left boundaries but will not use any fitting to estimate the true m/z and intensity of the peak. Note that the mz_max and int_max fields will be empty in the result (set to -1).
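
To illustrate what reporting apices with boundaries but without fitting can look like, here is a simplified sketch. It is an assumption-laden stand-in, not the OpenMS code: the Candidate struct only mirrors the documented PeakCandidate fields, findLocalMaxima is a hypothetical name, and signal-to-noise filtering and spacing checks are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in mirroring the documented PeakCandidate fields.
struct Candidate
{
  int pos;             // index of the peak apex
  int left_boundary;   // index of the left boundary
  int right_boundary;  // index of the right boundary
  double mz_max;       // left at -1: no fitting is performed here
  double int_max;      // left at -1: no fitting is performed here
};

// Reports every strict local maximum and walks downhill on both sides
// to estimate the peak boundaries.
std::vector<Candidate> findLocalMaxima(const std::vector<double>& intensity)
{
  std::vector<Candidate> out;
  const std::size_t n = intensity.size();
  for (std::size_t i = 1; i + 1 < n; ++i)
  {
    if (intensity[i] > intensity[i - 1] && intensity[i] > intensity[i + 1])
    {
      std::size_t left = i;
      while (left > 0 && intensity[left - 1] < intensity[left]) --left;

      std::size_t right = i;
      while (right + 1 < n && intensity[right + 1] < intensity[right]) ++right;

      out.push_back({static_cast<int>(i), static_cast<int>(left),
                     static_cast<int>(right), -1.0, -1.0});
    }
  }
  return out;
}
```

A caller that needs fitted apex values would pass such candidates on to a refinement step, which is what pick() adds on top of findMaxima().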

◆ pick()

void pick ( std::vector< double > &  mz_array,
std::vector< double > &  int_array,
std::vector< PeakCandidate > &  pc,
bool  check_spacings = true 
)

Will pick peaks in a spectrum.

Parameters
mz_array | The array containing m/z values
int_array | The array containing intensity values
pc | The resulting array containing the peak candidates
check_spacings | check spacing constraints? (recommended settings: yes for spectra, no for chromatograms)
Note
This function will first find maxima in the intensity domain and then use a spline function to estimate the best m/z and intensity for each peak candidate.
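
The idea of refining an apex between sampled data points can be shown with a much simpler technique than the cubic spline used by this class. The sketch below fits a parabola through the apex and its two neighbors (three-point quadratic interpolation) and returns the vertex. It is explicitly a swapped-in simplification for illustration; refineApex is a hypothetical name and the result will generally differ from what the spline-based pick() reports.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Hypothetical helper: fits a parabola through three points
// (x0,y0), (x1,y1), (x2,y2), where (x1,y1) is the sampled apex, and
// returns the (m/z, intensity) of the parabola's vertex.
std::pair<double, double> refineApex(double x0, double y0,
                                     double x1, double y1,
                                     double x2, double y2)
{
  assert(x0 < x1 && x1 < x2);  // m/z values must ascend

  const double a = x1 - x0;
  const double b = x1 - x2;
  const double fa = y1 - y0;
  const double fb = y1 - y2;

  // A vanishing denominator means the three points are collinear;
  // fall back to the sampled apex in that case.
  const double denom = a * fb - b * fa;
  if (std::abs(denom) < 1e-300) return {x1, y1};

  // Abscissa of the vertex of the interpolating parabola.
  const double xv = x1 - 0.5 * (a * a * fb - b * b * fa) / denom;

  // Evaluate the parabola at xv using the Lagrange form.
  const double l0 = (xv - x1) * (xv - x2) / ((x0 - x1) * (x0 - x2));
  const double l1 = (xv - x0) * (xv - x2) / ((x1 - x0) * (x1 - x2));
  const double l2 = (xv - x0) * (xv - x1) / ((x2 - x0) * (x2 - x1));
  const double yv = y0 * l0 + y1 * l1 + y2 * l2;

  return {xv, yv};
}
```

For well-sampled, narrow peaks the parabola vertex lies close to the sampled apex. A spline over more points, as used by this class, can follow asymmetric peak shapes that a single parabola cannot.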

Member Data Documentation

◆ missing_

unsigned missing_
protected

◆ signal_to_noise_

double signal_to_noise_
protected

◆ sn_window_length_

double sn_window_length_
protected

◆ spacing_difference_

double spacing_difference_
protected

◆ spacing_difference_gap_

double spacing_difference_gap_
protected