Fit peaks to an Exponentially Modified Gaussian (EMG) model using gradient descent.
Public Member Functions |
| | EmgGradientDescent () |
| | Constructor.
|
| |
| | ~EmgGradientDescent () override=default |
| | Destructor.
|
| |
| void | getDefaultParameters (Param &params) |
| |
| template<typename PeakContainerT > |
| void | fitEMGPeakModel (const PeakContainerT &input_peak, PeakContainerT &output_peak, const double left_pos=0.0, const double right_pos=0.0) const |
| | Fit the given peak (either MSChromatogram or MSSpectrum) to the EMG peak model.
|
| |
| UInt | estimateEmgParameters (const std::vector< double > &xs, const std::vector< double > &ys, double &best_h, double &best_mu, double &best_sigma, double &best_tau) const |
| | The implementation of the gradient descent algorithm for the EMG peak model.
|
| |
| void | applyEstimatedParameters (const std::vector< double > &xs, const double h, const double mu, const double sigma, const double tau, std::vector< double > &out_xs, std::vector< double > &out_ys) const |
| | Compute the EMG function on a set of points.
|
| |
Public Member Functions inherited from DefaultParamHandler |
| | DefaultParamHandler (const String &name) |
| | Constructor with name that is displayed in error messages.
|
| |
| | DefaultParamHandler (const DefaultParamHandler &rhs) |
| | Copy constructor.
|
| |
| virtual | ~DefaultParamHandler () |
| | Destructor.
|
| |
| DefaultParamHandler & | operator= (const DefaultParamHandler &rhs) |
| | Assignment operator.
|
| |
| virtual bool | operator== (const DefaultParamHandler &rhs) const |
| | Equality operator.
|
| |
| void | setParameters (const Param &param) |
| | Sets the parameters.
|
| |
| const Param & | getParameters () const |
| | Non-mutable access to the parameters.
|
| |
| const Param & | getDefaults () const |
| | Non-mutable access to the default parameters.
|
| |
| const String & | getName () const |
| | Non-mutable access to the name.
|
| |
| void | setName (const String &name) |
| | Mutable access to the name.
|
| |
| const std::vector< String > & | getSubsections () const |
| | Non-mutable access to the registered subsections.
|
| |
Protected Member Functions |
| void | updateMembers_ () override |
| | This method is used to update extra member variables at the end of the setParameters() method.
|
| |
| void | extractTrainingSet (const std::vector< double > &xs, const std::vector< double > &ys, std::vector< double > &TrX, std::vector< double > &TrY) const |
| | Given a peak, extract a training set to be used with the gradient descent algorithm.
|
| |
| double | computeMuMaxDistance (const std::vector< double > &xs) const |
| | Compute the boundary for the mean (mu) parameter in gradient descent.
|
| |
| double | computeInitialMean (const std::vector< double > &xs, const std::vector< double > &ys) const |
| | Compute an estimation of the mean of a peak.
|
| |
Protected Member Functions inherited from DefaultParamHandler |
| void | defaultsToParam_ () |
| | Updates the parameters after the defaults have been set in the constructor.
|
| |
Private Member Functions |
| void | iRpropPlus (const double prev_diff_E_param, double &diff_E_param, double &param_lr, double &param_update, double &param, const double current_E, const double previous_E) const |
| | Apply the iRprop+ algorithm for gradient descent.
|
| |
| double | Loss_function (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const |
| | Compute the cost given by loss function E.
|
| |
| double | E_wrt_h (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const |
| | Compute the cost given by the partial derivative of the loss function E, with respect to h (the amplitude)
|
| |
| double | E_wrt_mu (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const |
| | Compute the cost given by the partial derivative of the loss function E, with respect to mu (the mean)
|
| |
| double | E_wrt_sigma (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const |
| | Compute the cost given by the partial derivative of the loss function E, with respect to sigma (the standard deviation)
|
| |
| double | E_wrt_tau (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const |
| | Compute the cost given by the partial derivative of the loss function E, with respect to tau (the exponent relaxation time)
|
| |
| double | compute_z (const double x, const double mu, const double sigma, const double tau) const |
| | Compute EMG's z parameter.
|
| |
| double | emg_point (const double x, const double h, const double mu, const double sigma, const double tau) const |
| | Compute the EMG function on a single point.
|
| |
The exponentially modified Gaussian (EMG) function is a peak model used to accurately fit chromatographic and spectral peaks, especially those showing tailing behavior. This class provides methods to fit EMG parameters using gradient descent and to compute peak areas based on the fitted model.
The EMG Model
The EMG model combines a Gaussian distribution with exponential decay, characterized by four parameters:
- h (amplitude): Peak height
- mu (mean): Gaussian center position
- sigma (standard deviation): Gaussian width
- tau (exponential relaxation time): Controls the degree of tailing
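For orientation, the four parameters above enter the EMG function as follows. This is the form commonly given in the literature on EMG peak fitting; the symbol z is the same quantity that compute_z() returns, but the exact expression used inside this class is an assumption here and should be checked against the source.

```latex
f(x;\, h, \mu, \sigma, \tau)
  = \frac{h\,\sigma}{\tau}\,\sqrt{\frac{\pi}{2}}\;
    \exp\!\left(\frac{1}{2}\left(\frac{\sigma}{\tau}\right)^{2} - \frac{x-\mu}{\tau}\right)
    \operatorname{erfc}(z),
\qquad
z = \frac{1}{\sqrt{2}}\left(\frac{\sigma}{\tau} - \frac{x-\mu}{\sigma}\right)
```

As tau approaches zero the exponential tail vanishes and the curve approaches a plain Gaussian of height h centered at mu.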
Algorithm
The fitting is performed using the iRprop+ algorithm, a variant of resilient backpropagation that adapts step sizes based on gradient behavior:
- If gradient sign is consistent: increase step size (accelerate)
- If gradient sign changes: decrease step size and revert (avoid oscillation)
- Independent step sizes for each parameter
The algorithm automatically extracts a training set that avoids saturated points and handles three different z-value regimes to maintain numerical stability.
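The step-size rules listed above can be sketched for a single parameter as follows. This is a minimal illustration of iRprop+ as described in the literature, not the code of iRpropPlus(): the names RpropState, irpropPlusStep and minimize1D are hypothetical, and the constants (growth factor 1.2, shrink factor 0.5, step bounds) are assumed defaults rather than the hard-coded values of this class.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical per-parameter state for this sketch (not an OpenMS type).
struct RpropState
{
  double value;       // current parameter value
  double step;        // current step size, always positive
  double last_update; // update applied in the previous iteration
  double prev_grad;   // gradient seen in the previous iteration
};

inline double signOf(double v) { return (v > 0.0) - (v < 0.0); }

// One iRprop+ update of a single parameter, given dE/dparam and the loss
// at the current and previous iteration.
inline void irpropPlusStep(RpropState& s, double grad, double current_E, double previous_E)
{
  assert(s.step > 0.0);
  const double eta_plus = 1.2, eta_minus = 0.5;  // assumed growth/shrink factors
  const double step_min = 1e-9, step_max = 50.0; // assumed step bounds
  const double sign_product = s.prev_grad * grad;
  if (sign_product > 0.0)
  {
    // Same gradient sign as before: accelerate.
    s.step = std::min(s.step * eta_plus, step_max);
    s.last_update = -signOf(grad) * s.step;
    s.value += s.last_update;
  }
  else if (sign_product < 0.0)
  {
    // Sign change: a minimum was jumped over, so shrink the step.
    s.step = std::max(s.step * eta_minus, step_min);
    // The "+" in iRprop+: undo the last update only if the loss got worse.
    if (current_E > previous_E) s.value -= s.last_update;
    grad = 0.0; // forces the plain branch below on the next call
  }
  else
  {
    // First call, or the call right after a sign change.
    s.last_update = -signOf(grad) * s.step;
    s.value += s.last_update;
  }
  s.prev_grad = grad;
}

// Tiny driver: minimize a 1-D loss from `start` with a fixed iteration count.
template <typename LossFn, typename GradFn>
double minimize1D(LossFn loss, GradFn grad, double start, int iterations)
{
  RpropState s{start, 0.1, 0.0, 0.0};
  double previous_E = loss(s.value);
  for (int i = 0; i < iterations; ++i)
  {
    const double current_E = loss(s.value);
    irpropPlusStep(s, grad(s.value), current_E, previous_E);
    previous_E = current_E;
  }
  return s.value;
}
```

In an EMG fit, one such state would be kept for each of h, mu, sigma and tau, so each parameter adapts its own step size. Only the sign of the gradient drives the update, which makes the method insensitive to the very different scales of the four partial derivatives.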
Use Cases
EMG fitting is particularly useful for:
- Saturated peaks: Reconstruct true peak shape when detector saturates
- Cutoff peaks: Estimate full area when acquisition window is incomplete
- Tailing peaks: Model asymmetric peaks common in chromatography
- Quality assessment: Compare measured vs. ideal peak shape
Reference
Yuri Kalambet, Yuri Kozmin, Ksenia Mikhailova, Igor Nagaev, Pavel Tikhonov, "Reconstruction of chromatographic peaks using the exponentially modified Gaussian function," Journal of Chemometrics, 2011, 25, 352-356.
- See also
- PeakIntegrator for integration using EMG fitting
| double compute_z |
( |
const double |
x, |
|
|
const double |
mu, |
|
|
const double |
sigma, |
|
|
const double |
tau |
|
) |
| const |
|
private |
Compute EMG's z parameter.
The value of z determines which formula is used when evaluating the EMG function. Each of the following ranges of z uses a different formulation, to avoid numerical instability and overflow: (-inf, 0), [0, 6.71e7], (6.71e7, +inf).
Reference: Kalambet, Y.; Kozmin, Y.; Mikhailova, K.; Nagaev, I.; Tikhonov, P. (2011). "Reconstruction of chromatographic peaks using the exponentially modified Gaussian function". Journal of Chemometrics. 25 (7): 352.
- Parameters
-
| [in] | x | Position |
| [in] | mu | Mean |
| [in] | sigma | Standard deviation |
| [in] | tau | Exponent relaxation time |
- Returns
- The computed parameter z
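The three z ranges can be illustrated with a small standalone sketch. It follows the piecewise form of the EMG function commonly given in the literature and is not taken from this class: the names emgZSketch and emgPointSketch are hypothetical, and the definition of z used here is an assumption. The C++ standard library has no scaled complementary error function, so the middle branch uses exp(z*z) * erfc(z) as a stand-in.

```cpp
#include <cassert>
#include <cmath>

// Assumed definition of z: (sigma/tau - (x - mu)/sigma) / sqrt(2).
inline double emgZSketch(double x, double mu, double sigma, double tau)
{
  return (sigma / tau - (x - mu) / sigma) / std::sqrt(2.0);
}

// Evaluate the EMG function at one point, choosing the formula by z range.
inline double emgPointSketch(double x, double h, double mu, double sigma, double tau)
{
  assert(sigma > 0.0 && tau > 0.0);
  const double sqrt_half_pi = std::sqrt(std::acos(-1.0) / 2.0);
  const double u = (x - mu) / sigma; // standardized distance from the mean
  const double r = sigma / tau;
  const double z = emgZSketch(x, mu, sigma, tau);
  if (z < 0.0)
  {
    // Tail side: erfc(z) lies in (1, 2), the exponential carries the decay.
    return h * r * sqrt_half_pi * std::exp(0.5 * r * r - (x - mu) / tau) * std::erfc(z);
  }
  if (z <= 6.71e7)
  {
    // Gaussian factor times the scaled complementary error function.
    // Stand-in for erfcx(z); exp(z * z) overflows for z above roughly 26,
    // so a production version needs a real erfcx implementation.
    return h * std::exp(-0.5 * u * u) * r * sqrt_half_pi * std::exp(z * z) * std::erfc(z);
  }
  // Very large z: asymptotic form, no erfc evaluation needed.
  return h * std::exp(-0.5 * u * u) / (1.0 - (x - mu) * tau / (sigma * sigma));
}
```

For example, emgPointSketch(0.0, 1.0, 0.0, 1.0, 1.0) takes the middle branch (z is about 0.71), while emgPointSketch(3.0, 1.0, 0.0, 1.0, 1.0) takes the first branch (z is about -1.41). Both branches are algebraically the same function; only the grouping of the exponentials differs.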
Referenced by EmgGradientDescent_friend::compute_z().
| void extractTrainingSet |
( |
const std::vector< double > & |
xs, |
|
|
const std::vector< double > & |
ys, |
|
|
std::vector< double > & |
TrX, |
|
|
std::vector< double > & |
TrY |
|
) |
| const |
|
protected |
Given a peak, extract a training set to be used with the gradient descent algorithm.
The algorithm tries to select only those points that can help in finding the optimal parameters with gradient descent. The decision of which points to skip is based on the derivatives between consecutive points.
It first selects all those points whose intensity is below a certain value (intensity_threshold). Then, the derivatives of all the remaining points are computed. Based on the results, the algorithm selects those points that present a high enough derivative. Once a low value is found, the algorithm stops taking points from that side. It then repeats the same procedure on the other side of the peak. The goal is to limit the inclusion of saturated or spurious points near the peak apex during training.
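One plausible reading of this procedure is sketched below. It is an illustration only: the function name extractTrainingSetSketch, the parameters intensity_fraction and min_slope, and the exact stopping rule are assumptions made for the example, not the logic or thresholds of this class.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch: keep low-intensity points, then walk inward from each flank and
// keep points while the slope between consecutive points stays steep.
// Assumes xs is strictly increasing.
inline void extractTrainingSetSketch(const std::vector<double>& xs,
                                     const std::vector<double>& ys,
                                     double intensity_fraction, // hypothetical: fraction of max intensity
                                     double min_slope,          // hypothetical: minimum |dy/dx| to keep a point
                                     std::vector<double>& tr_x,
                                     std::vector<double>& tr_y)
{
  assert(xs.size() == ys.size());
  tr_x.clear();
  tr_y.clear();
  const std::size_t n = xs.size();
  if (n < 2) return;

  const double cutoff = intensity_fraction * *std::max_element(ys.begin(), ys.end());
  std::vector<bool> keep(n, false);

  // Step 1: every point below the intensity cutoff is kept.
  for (std::size_t i = 0; i < n; ++i) keep[i] = ys[i] < cutoff;

  // Step 2: left flank. Start at the first point at or above the cutoff and
  // stop at the first flat segment.
  std::size_t i = 0;
  while (i < n && ys[i] < cutoff) ++i;
  for (; i + 1 < n; ++i)
  {
    const double slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]);
    if (std::fabs(slope) < min_slope) break;
    keep[i] = true;
  }

  // Step 3: right flank, walking from the end toward the apex.
  std::size_t j = n;
  while (j > 0 && ys[j - 1] < cutoff) --j;
  for (; j > 1; --j)
  {
    const std::size_t k = j - 1;
    const double slope = (ys[k] - ys[k - 1]) / (xs[k] - xs[k - 1]);
    if (std::fabs(slope) < min_slope) break;
    keep[k] = true;
  }

  for (std::size_t p = 0; p < n; ++p)
  {
    if (keep[p])
    {
      tr_x.push_back(xs[p]);
      tr_y.push_back(ys[p]);
    }
  }
}
```

On a peak with a flat, saturated top, the plateau points are dropped because the slope between them is zero, while both flanks are retained for fitting.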
- Exceptions
-
- Parameters
-
| [in] | xs | Positions |
| [in] | ys | Intensities |
| [out] | TrX | Extracted training set positions |
| [out] | TrY | Extracted training set intensities |
Referenced by EmgGradientDescent_friend::extractTrainingSet().
template<typename PeakContainerT >
| void fitEMGPeakModel |
( |
const PeakContainerT & |
input_peak, |
|
|
PeakContainerT & |
output_peak, |
|
|
const double |
left_pos = 0.0, |
|
|
const double |
right_pos = 0.0 |
|
) |
| const |
Fit the given peak (either MSChromatogram or MSSpectrum) to the EMG peak model.
The method can recover the actual peak area of saturated or cutoff peaks, and it can also fine-tune the peak area of well-acquired peaks. The output is a reconstruction of the input peak. Additional points are often added so that the reconstructed peak has similar intensities at its boundary points.
Metadata containing the optimal parameters for the EMG peak model is added to the output peak. It is stored in a FloatDataArray named "emg_parameters", in the following order (index 0 to 3): amplitude h, mean mu, standard deviation sigma, exponent relaxation time tau.
If left_pos and right_pos are passed, then only that part of the peak is taken into consideration.
- Note
- All optimal gradient descent parameters are currently hard-coded to allow for a simplified user interface.
- Cutoff peak: the intensities of the left and right baselines are not equal.
- Saturated peak: the maximum intensity of the peak is lower than expected due to saturation of the detector.
Inspired by the results found in: Yuri Kalambet, Yuri Kozmin, Ksenia Mikhailova, Igor Nagaev, Pavel Tikhonov, "Reconstruction of chromatographic peaks using the exponentially modified Gaussian function," Journal of Chemometrics, 2011, 25, 352-356.
- Template Parameters
-
- Parameters
-
| [in] | input_peak | Input peak |
| [out] | output_peak | Output peak |
| [in] | left_pos | RT or MZ value of the first point of interest |
| [in] | right_pos | RT or MZ value of the last point of interest |
Referenced by PeakIntegrator::EMGPreProcess_().