Fit peaks to an Exponentially Modified Gaussian (EMG) model using gradient descent.

#include <OpenMS/MATH/MISC/EmgGradientDescent.h>

Public Member Functions
	EmgGradientDescent ()
	Constructor.

	~EmgGradientDescent () override=default
	Destructor.

void	getDefaultParameters (Param &params)

template<typename PeakContainerT >
void	fitEMGPeakModel (const PeakContainerT &input_peak, PeakContainerT &output_peak, const double left_pos=0.0, const double right_pos=0.0) const
	Fit the given peak (either MSChromatogram or MSSpectrum) to the EMG peak model.

UInt	estimateEmgParameters (const std::vector< double > &xs, const std::vector< double > &ys, double &best_h, double &best_mu, double &best_sigma, double &best_tau) const
	The implementation of the gradient descent algorithm for the EMG peak model.

void	applyEstimatedParameters (const std::vector< double > &xs, const double h, const double mu, const double sigma, const double tau, std::vector< double > &out_xs, std::vector< double > &out_ys) const
	Compute the EMG function on a set of points.

Public Member Functions inherited from DefaultParamHandler
	DefaultParamHandler (const String &name)
	Constructor with name that is displayed in error messages.

	DefaultParamHandler (const DefaultParamHandler &rhs)
	Copy constructor.

virtual	~DefaultParamHandler ()
	Destructor.

DefaultParamHandler &	operator= (const DefaultParamHandler &rhs)
	Assignment operator.

virtual bool	operator== (const DefaultParamHandler &rhs) const
	Equality operator.

void	setParameters (const Param &param)
	Sets the parameters.

const Param &	getParameters () const
	Non-mutable access to the parameters.

const Param &	getDefaults () const
	Non-mutable access to the default parameters.

const String &	getName () const
	Non-mutable access to the name.

void	setName (const String &name)
	Mutable access to the name.

const std::vector< String > &	getSubsections () const
	Non-mutable access to the registered subsections.

Protected Member Functions
void	updateMembers_ () override
	This method is used to update extra member variables at the end of the setParameters() method.

void	extractTrainingSet (const std::vector< double > &xs, const std::vector< double > &ys, std::vector< double > &TrX, std::vector< double > &TrY) const
	Given a peak, extract a training set to be used with the gradient descent algorithm.

double	computeMuMaxDistance (const std::vector< double > &xs) const
	Compute the boundary for the mean (`mu`) parameter in gradient descent.

double	computeInitialMean (const std::vector< double > &xs, const std::vector< double > &ys) const
	Compute an estimation of the mean of a peak.

Protected Member Functions inherited from DefaultParamHandler
void	defaultsToParam_ ()
	Updates the parameters after the defaults have been set in the constructor.

Private Member Functions
void	iRpropPlus (const double prev_diff_E_param, double &diff_E_param, double &param_lr, double &param_update, double &param, const double current_E, const double previous_E) const
	Apply the iRprop+ algorithm for gradient descent.

double	Loss_function (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const
	Compute the cost given by loss function E.

double	E_wrt_h (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const
	Compute the partial derivative of the loss function E with respect to `h` (the amplitude).

double	E_wrt_mu (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const
	Compute the partial derivative of the loss function E with respect to `mu` (the mean).

double	E_wrt_sigma (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const
	Compute the partial derivative of the loss function E with respect to `sigma` (the standard deviation).

double	E_wrt_tau (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const
	Compute the partial derivative of the loss function E with respect to `tau` (the exponent relaxation time).

double	compute_z (const double x, const double mu, const double sigma, const double tau) const
	Compute EMG's z parameter.

double	emg_point (const double x, const double h, const double mu, const double sigma, const double tau) const
	Compute the EMG function on a single point.

Private Attributes
const double	PI = OpenMS::Constants::PI
	Alias for OpenMS::Constants::PI.

UInt	print_debug_

UInt	max_gd_iter_
	Maximum number of gradient descent iterations in `fitEMGPeakModel()`

bool	compute_additional_points_

Friends
class	EmgGradientDescent_friend
	To test private and protected methods.

Additional Inherited Members
Static Public Member Functions inherited from DefaultParamHandler
static void	writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="")
	Writes all parameters to meta values.

Protected Attributes inherited from DefaultParamHandler
Param	param_
	Container for current parameters.

Param	defaults_
	Container for default parameters. This member should be filled in the constructor of derived classes!

std::vector< String >	subsections_
	Container for registered subsections. This member should be filled in the constructor of derived classes!

String	error_name_
	Name that is displayed in error messages during the parameter checking.

bool	check_defaults_
	If this member is set to false, no checking of parameters is done.

bool	warn_empty_defaults_
	If this member is set to false, no warning is emitted when defaults are empty.

Detailed Description

Fit peaks to an Exponentially Modified Gaussian (EMG) model using gradient descent.

The exponentially modified Gaussian (EMG) function is a peak model used to accurately fit chromatographic and spectral peaks, especially those showing tailing behavior. This class provides methods to fit EMG parameters using gradient descent and to compute peak areas based on the fitted model.

The EMG Model

The EMG model combines a Gaussian distribution with exponential decay, characterized by four parameters:

h (amplitude): Peak height
mu (mean): Gaussian center position
sigma (standard deviation): Gaussian width
tau (exponential relaxation time): Controls the degree of tailing
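
As an illustration of how the four parameters enter the model, the sketch below evaluates the EMG closed form given in the cited Kalambet et al. reference. `emgValue` and `emgArea` are hypothetical names, not part of this class. The sketch assumes sigma > 0 and tau > 0 and uses a single formula, so it omits the z-range switching described under compute_z() and may lose accuracy or overflow for extreme arguments.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not the OpenMS API): EMG closed form at position x.
// Assumes sigma > 0 and tau > 0; no protection against overflow for large z.
double emgValue(double x, double h, double mu, double sigma, double tau)
{
  assert(sigma > 0.0 && tau > 0.0);
  const double pi = std::acos(-1.0);
  const double ratio = sigma / tau;
  const double z = (ratio - (x - mu) / sigma) / std::sqrt(2.0);
  return h * ratio * std::sqrt(pi / 2.0)
         * std::exp(0.5 * ratio * ratio - (x - mu) / tau)
         * std::erfc(z);
}

// Hypothetical helper: trapezoidal area under the EMG curve on [a, b]
// using n equally sized intervals.
double emgArea(double a, double b, int n,
               double h, double mu, double sigma, double tau)
{
  assert(n > 0 && b > a);
  const double dx = (b - a) / n;
  double sum = 0.5 * (emgValue(a, h, mu, sigma, tau) + emgValue(b, h, mu, sigma, tau));
  for (int i = 1; i < n; ++i)
  {
    sum += emgValue(a + i * dx, h, mu, sigma, tau);
  }
  return sum * dx;
}
```

Under these assumptions the area under the curve is h * sigma * sqrt(2 * pi), independent of tau, so `emgArea(-10.0, 60.0, 7000, 1.0, 0.0, 1.0, 2.0)` should come out close to 2.5066. Tailing shows up as a larger value to the right of mu than at the mirrored position to the left.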

Algorithm

The fitting is performed using the iRprop+ algorithm, a variant of resilient backpropagation that adapts step sizes based on gradient behavior:

If gradient sign is consistent: increase step size (accelerate)
If gradient sign changes: decrease step size and revert (avoid oscillation)
Independent step sizes for each parameter
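
The three rules above can be sketched for a single parameter as follows. This is a minimal illustration of iRprop+ as described by Igel and Hüsken, not the private iRpropPlus() member: `RpropState`, `irpropPlusStep`, and `minimizeQuadratic` are hypothetical names, and the constants (growth 1.2, shrink 0.5, step bounds) are commonly used defaults assumed here, not values taken from this class.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical per-parameter state; names are illustrative only.
struct RpropState
{
  double prev_grad = 0.0;    // gradient seen in the previous iteration
  double step = 0.1;         // current step size for this parameter
  double prev_update = 0.0;  // last change applied to the parameter
};

// One iRprop+ update of a single parameter (assumed constants, see lead-in).
void irpropPlusStep(double grad, double current_E, double previous_E,
                    RpropState& s, double& param)
{
  const double eta_plus = 1.2, eta_minus = 0.5;
  const double step_min = 1e-9, step_max = 50.0;
  assert(s.step > 0.0);
  const double product = s.prev_grad * grad;
  if (product > 0.0)
  {
    // Same sign as before: accelerate.
    s.step = std::min(s.step * eta_plus, step_max);
  }
  else if (product < 0.0)
  {
    // Sign change: shrink the step, undo the last move only if the error grew,
    // and zero the gradient so the next iteration takes a plain step.
    s.step = std::max(s.step * eta_minus, step_min);
    if (current_E > previous_E) param -= s.prev_update;
    s.prev_grad = 0.0;
    return;
  }
  const double sign = (grad > 0.0) - (grad < 0.0);
  s.prev_update = -sign * s.step;
  param += s.prev_update;
  s.prev_grad = grad;
}

// Toy driver: minimize E(w) = (w - 3)^2 starting from w = 0.
double minimizeQuadratic(int iterations)
{
  double w = 0.0, previous_E = 0.0;
  RpropState state;
  for (int i = 0; i < iterations; ++i)
  {
    const double E = (w - 3.0) * (w - 3.0);
    const double grad = 2.0 * (w - 3.0);
    irpropPlusStep(grad, E, previous_E, state, w);
    previous_E = E;
  }
  return w;
}
```

Only the sign of the gradient is used, never its magnitude, which is why each parameter can keep its own step size. In an EMG fit there would be one such state for each of h, mu, sigma, and tau.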

The algorithm automatically extracts a training set that avoids saturated points and handles three different z-value regimes to maintain numerical stability.

Use Cases

EMG fitting is particularly useful for:

Saturated peaks: Reconstruct true peak shape when detector saturates
Cutoff peaks: Estimate full area when acquisition window is incomplete
Tailing peaks: Model asymmetric peaks common in chromatography
Quality assessment: Compare measured vs. ideal peak shape

Reference

Yuri Kalambet, Yuri Kozmin, Ksenia Mikhailova, Igor Nagaev, Pavel Tikhonov, "Reconstruction of chromatographic peaks using the exponentially modified Gaussian function," Journal of Chemometrics, 2011, 25, 352-356.

See also: PeakIntegrator for integration using EMG fitting

Constructor & Destructor Documentation

◆ EmgGradientDescent()

EmgGradientDescent ( )

Constructor.

◆ ~EmgGradientDescent()

~EmgGradientDescent ( )

override default

Destructor.

Member Function Documentation

◆ applyEstimatedParameters()

void applyEstimatedParameters	(	const std::vector< double > &	xs,
		const double	h,
		const double	mu,
		const double	sigma,
		const double	tau,
		std::vector< double > &	out_xs,
		std::vector< double > &	out_ys
	)		const

Compute the EMG function on a set of points.

If the class parameter compute_additional_points is "true", the algorithm detects which side of the peak is cut off and adds points to it.
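
One way to picture the extension step is sketched below. It is an assumption-laden illustration, not the member's actual logic: the function name `evaluateAndExtend`, the `max_extra` cap, the rule "the side with the higher boundary intensity is the cut-off side", and the stopping rule "extend until the intensity no longer exceeds the opposite boundary" are all hypothetical. `model` stands for any callable that returns the fitted intensity at a position.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: evaluate a fitted model on xs, then add points on the
// cut-off side, reusing the spacing of the two outermost input positions.
template <typename Model>
void evaluateAndExtend(const std::vector<double>& xs, Model model,
                       std::vector<double>& out_xs, std::vector<double>& out_ys,
                       std::size_t max_extra = 1000)
{
  assert(xs.size() >= 2);
  out_xs = xs;
  out_ys.clear();
  for (double x : xs) out_ys.push_back(model(x));

  const double front = out_ys.front();
  const double back = out_ys.back();
  if (back > front)
  {
    // Right side is higher, so treat it as cut off and append points.
    const double dx = xs[xs.size() - 1] - xs[xs.size() - 2];
    double x = xs.back();
    for (std::size_t k = 0; k < max_extra && out_ys.back() > front; ++k)
    {
      x += dx;
      out_xs.push_back(x);
      out_ys.push_back(model(x));
    }
  }
  else if (front > back)
  {
    // Left side is higher, so prepend points instead.
    const double dx = xs[1] - xs[0];
    double x = xs.front();
    for (std::size_t k = 0; k < max_extra && out_ys.front() > back; ++k)
    {
      x -= dx;
      out_xs.insert(out_xs.begin(), x);
      out_ys.insert(out_ys.begin(), model(x));
    }
  }
}
```

The cap on added points guards against a model that never decays below the opposite boundary intensity.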

Parameters

[in]	xs	Positions
[in]	h	Amplitude
[in]	mu	Mean
[in]	sigma	Standard deviation
[in]	tau	Exponent relaxation time
[out]	out_xs	The output positions
[out]	out_ys	The output intensities

Referenced by EmgGradientDescent_friend::applyEstimatedParameters().

◆ compute_z()

double compute_z	(	const double	x,
		const double	mu,
		const double	sigma,
		const double	tau
	)		const

private

Compute EMG's z parameter.

The value of z determines which formula is used to compute the EMG function. Each of the following ranges of z uses a different EMG formula to avoid numerical instability and potential overflow: (-inf, 0), [0, 6.71e7], (6.71e7, +inf)
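
For orientation, the three ranges can be written out as below. This is a sketch based on the forms commonly quoted from the cited reference, not a transcription of this class's source. It assumes sigma > 0 and tau > 0, and erfcx(z) = exp(z^2) erfc(z) denotes the scaled complementary error function. The first two forms are algebraically identical; the third follows from the large-z asymptote erfcx(z) ≈ 1/(z sqrt(pi)).

```latex
\[
z = \frac{1}{\sqrt{2}}\left(\frac{\sigma}{\tau} - \frac{x-\mu}{\sigma}\right)
\]
\[
f(x) =
\begin{cases}
\dfrac{h\sigma}{\tau}\sqrt{\dfrac{\pi}{2}}\,
  \exp\!\left(\dfrac{1}{2}\left(\dfrac{\sigma}{\tau}\right)^{2} - \dfrac{x-\mu}{\tau}\right)
  \operatorname{erfc}(z), & z < 0 \\[2ex]
\dfrac{h\sigma}{\tau}\sqrt{\dfrac{\pi}{2}}\,
  \exp\!\left(-\dfrac{1}{2}\left(\dfrac{x-\mu}{\sigma}\right)^{2}\right)
  \operatorname{erfcx}(z), & 0 \le z \le 6.71\times 10^{7} \\[2ex]
\dfrac{h\,\exp\!\left(-\dfrac{1}{2}\left(\dfrac{x-\mu}{\sigma}\right)^{2}\right)}
      {1 - \dfrac{(x-\mu)\,\tau}{\sigma^{2}}}, & z > 6.71\times 10^{7}
\end{cases}
\]
```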

Reference: Kalambet, Y.; Kozmin, Y.; Mikhailova, K.; Nagaev, I.; Tikhonov, P. (2011). "Reconstruction of chromatographic peaks using the exponentially modified Gaussian function". Journal of Chemometrics. 25 (7): 352.

Parameters

[in]	x	Position
[in]	mu	Mean
[in]	sigma	Standard deviation
[in]	tau	Exponent relaxation time

Returns: The computed parameter z

Referenced by EmgGradientDescent_friend::compute_z().

◆ computeInitialMean()

double computeInitialMean	(	const std::vector< double > &	xs,
		const std::vector< double > &	ys
	)		const

protected

Compute an estimation of the mean of a peak.

The method computes the middle point on different levels of intensity of the peak. The returned mean is the average of these middle points.
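
A minimal sketch of this idea follows. The function name `estimateInitialMean` and the chosen intensity levels (60% to 90% of the maximum) are assumptions for illustration; the levels actually used by this class are not stated here, and the sketch uses the outermost sampled positions at each level without interpolation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: average the midpoints of the peak at several
// intensity levels. Assumes non-negative intensities and matching sizes.
double estimateInitialMean(const std::vector<double>& xs,
                           const std::vector<double>& ys)
{
  assert(!xs.empty() && xs.size() == ys.size());
  const double max_y = *std::max_element(ys.begin(), ys.end());
  assert(max_y >= 0.0);
  const double levels[] = {0.6, 0.7, 0.8, 0.9};  // assumed, not from OpenMS
  const std::size_t n_levels = sizeof(levels) / sizeof(levels[0]);

  double sum = 0.0;
  for (double level : levels)
  {
    const double threshold = level * max_y;
    // Outermost points at or above the level; the apex always qualifies,
    // so both scans stop inside the vector.
    std::size_t left = 0;
    while (ys[left] < threshold) ++left;
    std::size_t right = ys.size() - 1;
    while (ys[right] < threshold) --right;
    sum += 0.5 * (xs[left] + xs[right]);
  }
  return sum / n_levels;
}
```

For a symmetric peak every midpoint coincides with the apex position. For a tailing peak the lower levels pull the estimate toward the tail.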

Exceptions

Exception::SizeUnderflow if the input is empty

Parameters

[in]	xs	Positions
[in]	ys	Intensities

Returns: The peak's estimated mean

Referenced by EmgGradientDescent_friend::computeInitialMean().

◆ computeMuMaxDistance()

double computeMuMaxDistance ( const std::vector< double > & xs ) const

protected

Compute the boundary for the mean (mu) parameter in gradient descent.

Together with the value returned by computeInitialMean(), this method decides the minimum and maximum value that mu can assume during iterations of the gradient descent algorithm. The value is based on the width of the peak.

Parameters

[in] xs Positions

Returns: The maximum distance from the precomputed initial mean in the gradient descent algorithm

Referenced by EmgGradientDescent_friend::computeMuMaxDistance().

◆ E_wrt_h()

double E_wrt_h	(	const std::vector< double > &	xs,
		const std::vector< double > &	ys,
		const double	h,
		const double	mu,
		const double	sigma,
		const double	tau
	)		const

private

Compute the partial derivative of the loss function E with respect to h (the amplitude).

Needed by the gradient descent algorithm.

Parameters

[in]	xs	Positions
[in]	ys	Intensities
[in]	h	Amplitude
[in]	mu	Mean
[in]	sigma	Standard deviation
[in]	tau	Exponent relaxation time

Returns: The computed cost

◆ E_wrt_mu()

double E_wrt_mu	(	const std::vector< double > &	xs,
		const std::vector< double > &	ys,
		const double	h,
		const double	mu,
		const double	sigma,
		const double	tau
	)		const

private

Compute the partial derivative of the loss function E with respect to mu (the mean).

Needed by the gradient descent algorithm.

Parameters

[in]	xs	Positions
[in]	ys	Intensities
[in]	h	Amplitude
[in]	mu	Mean
[in]	sigma	Standard deviation
[in]	tau	Exponent relaxation time

Returns: The computed cost

◆ E_wrt_sigma()

double E_wrt_sigma	(	const std::vector< double > &	xs,
		const std::vector< double > &	ys,
		const double	h,
		const double	mu,
		const double	sigma,
		const double	tau
	)		const

private

Compute the partial derivative of the loss function E with respect to sigma (the standard deviation).

Needed by the gradient descent algorithm.

Parameters

[in]	xs	Positions
[in]	ys	Intensities
[in]	h	Amplitude
[in]	mu	Mean
[in]	sigma	Standard deviation
[in]	tau	Exponent relaxation time

Returns: The computed cost

◆ E_wrt_tau()

double E_wrt_tau	(	const std::vector< double > &	xs,
		const std::vector< double > &	ys,
		const double	h,
		const double	mu,
		const double	sigma,
		const double	tau
	)		const

private

Compute the partial derivative of the loss function E with respect to tau (the exponent relaxation time).

Needed by the gradient descent algorithm.

Parameters

[in]	xs	Positions
[in]	ys	Intensities
[in]	h	Amplitude
[in]	mu	Mean
[in]	sigma	Standard deviation
[in]	tau	Exponent relaxation time

Returns: The computed cost

◆ emg_point()

double emg_point	(	const double	x,
		const double	h,
		const double	mu,
		const double	sigma,
		const double	tau
	)		const

private

Compute the EMG function on a single point.

Parameters

[in]	x	Position
[in]	h	Amplitude
[in]	mu	Mean
[in]	sigma	Standard deviation
[in]	tau	Exponent relaxation time

Returns: The estimated intensity for the given input point

Referenced by EmgGradientDescent_friend::emg_point().

◆ estimateEmgParameters()

UInt estimateEmgParameters	(	const std::vector< double > &	xs,
		const std::vector< double > &	ys,
		double &	best_h,
		double &	best_mu,
		double &	best_sigma,
		double &	best_tau
	)		const

The implementation of the gradient descent algorithm for the EMG peak model.

Parameters

[in]	xs	Positions
[in]	ys	Intensities
[out]	best_h	`h` (amplitude) parameter
[out]	best_mu	`mu` (mean) parameter
[out]	best_sigma	`sigma` (standard deviation) parameter
[out]	best_tau	`tau` (exponent relaxation time) parameter

Returns: The number of iterations necessary to reach the best values for the parameters

◆ extractTrainingSet()

void extractTrainingSet	(	const std::vector< double > &	xs,
		const std::vector< double > &	ys,
		std::vector< double > &	TrX,
		std::vector< double > &	TrY
	)		const

protected

Given a peak, extract a training set to be used with the gradient descent algorithm.

The algorithm tries to select only those points that can help in finding the optimal parameters with gradient descent. The decision of which points to skip is based on the derivatives between consecutive points.

It first selects all those points whose intensity is below a certain value (intensity_threshold). Then, the derivatives of all the remaining points are computed. Based on the results, the algorithm selects those points that present a high enough derivative. Once a low value is found, the algorithm stops taking points from that side. It then repeats the same procedure on the other side of the peak. The goal is to limit the inclusion of saturated or spurious points near the peak apex during training.
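
The two-stage selection can be sketched roughly as follows. This is a simplified reading of the description above, not the member's implementation: `selectTrainingPoints`, the `intensity_fraction` parameter (threshold expressed as a fraction of the maximum intensity), and the absolute `min_slope` cutoff are hypothetical, and positions are assumed to be strictly increasing.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical sketch of the training-set extraction described above.
void selectTrainingPoints(const std::vector<double>& xs,
                          const std::vector<double>& ys,
                          double intensity_fraction, double min_slope,
                          std::vector<double>& out_x, std::vector<double>& out_y)
{
  assert(xs.size() == ys.size() && xs.size() >= 2);
  const int n = static_cast<int>(xs.size());
  const double threshold =
      intensity_fraction * *std::max_element(ys.begin(), ys.end());

  // Stage 1: keep every point below the intensity threshold and remember
  // the first and last point at or above it.
  std::vector<bool> keep(n, false);
  int left = n, right = -1;
  for (int i = 0; i < n; ++i)
  {
    if (ys[i] < threshold) keep[i] = true;
    else { left = std::min(left, i); right = std::max(right, i); }
  }

  // Stage 2a: climb the left flank while the slope stays high enough.
  for (int i = left; i <= right; ++i)
  {
    if (i > 0 && (ys[i] - ys[i - 1]) / (xs[i] - xs[i - 1]) < min_slope) break;
    keep[i] = true;
  }
  // Stage 2b: same procedure on the right flank, walking leftwards.
  for (int j = right; j >= left; --j)
  {
    if (j < n - 1 && (ys[j] - ys[j + 1]) / (xs[j + 1] - xs[j]) < min_slope) break;
    keep[j] = true;
  }

  out_x.clear();
  out_y.clear();
  for (int i = 0; i < n; ++i)
  {
    if (keep[i]) { out_x.push_back(xs[i]); out_y.push_back(ys[i]); }
  }
}
```

On a flat-topped peak the walk from each side stops as soon as the slope drops to zero, so the interior of the plateau is left out of the training set.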

Exceptions

Exception::SizeUnderflow if the input has fewer than 2 elements

Parameters

[in]	xs	Positions
[in]	ys	Intensities
[out]	TrX	Extracted training set positions
[out]	TrY	Extracted training set intensities

Referenced by EmgGradientDescent_friend::extractTrainingSet().

◆ fitEMGPeakModel()

template<typename PeakContainerT >

void fitEMGPeakModel	(	const PeakContainerT &	input_peak,
		PeakContainerT &	output_peak,
		const double	left_pos = `0.0`,
		const double	right_pos = `0.0`
	)		const

Fit the given peak (either MSChromatogram or MSSpectrum) to the EMG peak model.

The method can reconstruct the actual peak area of saturated or cutoff peaks. It can also fine-tune the peak area of well-acquired peaks. The output is a reconstruction of the input peak. Additional points are often added so that the output peak has similar intensities at its two boundary points.

Metadata will be added to the output peak, containing the optimal parameters for the EMG peak model. This information will be found in a FloatDataArray of name "emg_parameters", with the parameters being saved in the following order (from index 0 to 3): amplitude h, mean mu, standard deviation sigma, exponent relaxation time tau.

If left_pos and right_pos are passed, then only that part of the peak is taken into consideration.

Note: All optimal gradient descent parameters are currently hard-coded to allow for a simplified user interface.

Cutoff peak: The intensities of the left and right baselines are not equal.
Saturated peak: The maximum intensity of the peak is lower than expected due to saturation of the detector.

Inspired by the results found in: Yuri Kalambet, Yuri Kozmin, Ksenia Mikhailova, Igor Nagaev, Pavel Tikhonov, "Reconstruction of chromatographic peaks using the exponentially modified Gaussian function," Journal of Chemometrics, 2011, 25, 352-356.

Template Parameters

PeakContainerT Either an MSChromatogram or an MSSpectrum

Parameters

[in]	input_peak	Input peak
[out]	output_peak	Output peak
[in]	left_pos	RT or MZ value of the first point of interest
[in]	right_pos	RT or MZ value of the last point of interest

Referenced by PeakIntegrator::EMGPreProcess_().

◆ getDefaultParameters()

void getDefaultParameters ( Param & params )

◆ iRpropPlus()

void iRpropPlus	(	const double	prev_diff_E_param,
		double &	diff_E_param,
		double &	param_lr,
		double &	param_update,
		double &	param,
		const double	current_E,
		const double	previous_E
	)		const

private

Apply the iRprop+ algorithm for gradient descent.

Reference: Christian Igel and Michael Hüsken. Improving the Rprop Learning Algorithm. Second International Symposium on Neural Computation (NC 2000), pp. 115-121, ICSC Academic Press, 2000

Parameters

[in]	prev_diff_E_param	The cost of the partial derivative of E with respect to the given parameter, at the previous iteration of gradient descent
[in,out]	diff_E_param	The cost of the partial derivative of E with respect to the given parameter, at the current iteration
[in,out]	param_lr	The learning rate for the given parameter
[in,out]	param_update	The amount to add/remove to/from `param`
[in,out]	param	The parameter whose convergence to a minimum the algorithm tries to speed up
[in]	current_E	The current cost E
[in]	previous_E	The previous cost E

Referenced by EmgGradientDescent_friend::iRpropPlus().

◆ Loss_function()

double Loss_function	(	const std::vector< double > &	xs,
		const std::vector< double > &	ys,
		const double	h,
		const double	mu,
		const double	sigma,
		const double	tau
	)		const

private

Compute the cost given by loss function E.

Needed by the gradient descent algorithm. The mean squared error is used as the loss function E.
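
A sketch of such a loss, together with its partial derivative with respect to h, is given below. The names `EmgParams`, `emgValue`, `meanSquaredError`, and `lossGradientWrtH` are hypothetical, and the 1/N normalisation is an assumption; the exact scaling used by Loss_function() is not stated here. Because the EMG model is linear in h, the derivative of the model with respect to h is simply f / h.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical parameter bundle for the sketch.
struct EmgParams { double h; double mu; double sigma; double tau; };

// Single-formula EMG (no z-range switching); assumes sigma > 0 and tau > 0.
double emgValue(double x, const EmgParams& p)
{
  const double pi = std::acos(-1.0);
  const double ratio = p.sigma / p.tau;
  const double z = (ratio - (x - p.mu) / p.sigma) / std::sqrt(2.0);
  return p.h * ratio * std::sqrt(pi / 2.0)
         * std::exp(0.5 * ratio * ratio - (x - p.mu) / p.tau) * std::erfc(z);
}

// E = (1/N) * sum_i (f(x_i) - y_i)^2  (assumed normalisation)
double meanSquaredError(const std::vector<double>& xs,
                        const std::vector<double>& ys, const EmgParams& p)
{
  assert(!xs.empty() && xs.size() == ys.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < xs.size(); ++i)
  {
    const double residual = emgValue(xs[i], p) - ys[i];
    sum += residual * residual;
  }
  return sum / xs.size();
}

// dE/dh = (2/N) * sum_i (f(x_i) - y_i) * f(x_i) / h
double lossGradientWrtH(const std::vector<double>& xs,
                        const std::vector<double>& ys, const EmgParams& p)
{
  assert(!xs.empty() && xs.size() == ys.size() && p.h != 0.0);
  double sum = 0.0;
  for (std::size_t i = 0; i < xs.size(); ++i)
  {
    const double f = emgValue(xs[i], p);
    sum += (f - ys[i]) * f / p.h;
  }
  return 2.0 * sum / xs.size();
}
```

A quick sanity check for any hand-derived gradient of this kind is to compare it against a central finite difference of the loss at the same parameters.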

Parameters

[in]	xs	Positions
[in]	ys	Intensities
[in]	h	Amplitude
[in]	mu	Mean
[in]	sigma	Standard deviation
[in]	tau	Exponent relaxation time

Returns: The computed cost

Referenced by EmgGradientDescent_friend::Loss_function().

◆ updateMembers_()

void updateMembers_ ( )

override protected virtual

This method is used to update extra member variables at the end of the setParameters() method.

Also call it at the end of the derived classes' copy constructor and assignment operator.

The default implementation is empty.

Reimplemented from DefaultParamHandler.

Friends And Related Symbol Documentation

◆ EmgGradientDescent_friend

friend class EmgGradientDescent_friend

friend

To test private and protected methods.

Member Data Documentation

◆ compute_additional_points_

bool compute_additional_points_

private

Whether additional points should be added when fitting the EMG peak model; particularly useful with cutoff peaks.

◆ max_gd_iter_

UInt max_gd_iter_

private

Maximum number of gradient descent iterations in fitEMGPeakModel()

◆ PI

const double PI = OpenMS::Constants::PI

private

Alias for OpenMS::Constants::PI.

◆ print_debug_

UInt print_debug_

private

Level of debug information to print to the terminal. Valid values are 0, 1, and 2; higher values mean more information.
