OpenMS
StablePairFinder Class Reference

This class implements a pair finding algorithm for consensus features.

#include <OpenMS/ANALYSIS/MAPMATCHING/StablePairFinder.h>


Public Types

typedef BaseGroupFinder Base
 Base class.
 
- Public Types inherited from ProgressLogger
enum  LogType { CMD , GUI , NONE }
 Possible log types.
 

Public Member Functions

 StablePairFinder ()
 Constructor.
 
 ~StablePairFinder () override
 Destructor.
 
void run (const std::vector< ConsensusMap > &input_maps, ConsensusMap &result_map) override
 Run the algorithm.
 
- Public Member Functions inherited from BaseGroupFinder
 BaseGroupFinder ()
 Default constructor.
 
 ~BaseGroupFinder () override
 Destructor.
 
- Public Member Functions inherited from DefaultParamHandler
 DefaultParamHandler (const String &name)
 Constructor with name that is displayed in error messages.
 
 DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor.
 
virtual ~DefaultParamHandler ()
 Destructor.
 
DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator.
 
virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator.
 
void setParameters (const Param &param)
 Sets the parameters.
 
const Param & getParameters () const
 Non-mutable access to the parameters.
 
const Param & getDefaults () const
 Non-mutable access to the default parameters.
 
const String & getName () const
 Non-mutable access to the name.
 
void setName (const String &name)
 Mutable access to the name.
 
const std::vector< String > & getSubsections () const
 Non-mutable access to the registered subsections.
 
- Public Member Functions inherited from ProgressLogger
 ProgressLogger ()
 Constructor.
 
virtual ~ProgressLogger ()
 Destructor.
 
 ProgressLogger (const ProgressLogger &other)
 Copy constructor.
 
ProgressLogger & operator= (const ProgressLogger &other)
 Assignment Operator.
 
void setLogType (LogType type) const
 Sets the progress log that should be used. The default type is NONE!
 
LogType getLogType () const
 Returns the type of progress log being used.
 
void setLogger (ProgressLoggerImpl *logger)
 Sets the logger to be used for progress logging.
 
void startProgress (SignedSize begin, SignedSize end, const String &label) const
 Initializes the progress display.
 
void setProgress (SignedSize value) const
 Sets the current progress.
 
void endProgress (UInt64 bytes_processed=0) const
 
void nextProgress () const
 Increments the progress by 1 (according to range begin-end).
 

Internal helper classes and enums

enum  { RT = Peak2D::RT , MZ = Peak2D::MZ }
 
double second_nearest_gap_
 The distance to the second-nearest neighbors must be larger by this factor than the distance to the matched element itself.
 
bool use_IDs_
 Only match if peptide IDs are compatible?
 
void updateMembers_ () override
 This method is used to update extra member variables at the end of the setParameters() method.
 
bool compatibleIDs_ (const ConsensusFeature &feat1, const ConsensusFeature &feat2) const
 Checks if the peptide IDs of two features are compatible.
 
const AASequence & getBestHitSequence_ (const PeptideIdentification &peptideIdentification) const
 Returns the sequence of the highest scoring peptide hit in the given peptide identification.
 

Additional Inherited Members

- Static Public Member Functions inherited from DefaultParamHandler
static void writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="")
 Writes all parameters to meta values.
 
- Protected Member Functions inherited from BaseGroupFinder
void checkIds_ (const std::vector< ConsensusMap > &maps) const
 Checks if all file descriptions have disjoint map identifiers.
 
- Protected Member Functions inherited from DefaultParamHandler
void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor.
 
- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters.
 
Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes!
 
std::vector< String > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes!
 
String error_name_
 Name that is displayed in error messages during the parameter checking.
 
bool check_defaults_
 If this member is set to false, no checking of parameters is done.
 
bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when defaults are empty.
 
- Protected Attributes inherited from ProgressLogger
LogType type_
 
time_t last_invoke_
 
ProgressLoggerImpl * current_logger_
 
- Static Protected Attributes inherited from ProgressLogger
static int recursion_depth_
 

Detailed Description

This class implements a pair finding algorithm for consensus features.

It offers a method to determine pairs across two consensus maps. The corresponding consensus features must be aligned, but may have small position deviations.

The distance measure is implemented in class FeatureDistance - see there for details.

Additional criteria for pairing

Depending on parameter use_identifications, peptide identifications annotated to the features may have to be compatible (i.e. no annotation or the same annotation) for a pairing to occur.

Stability criterion: The distance to the nearest neighbor must be smaller than the distance to the second-nearest neighbor by a certain factor, see parameter second_nearest_gap. There is a non-trivial relation between this parameter and the maximum allowed difference (in RT or m/z) of the distance measure: If second_nearest_gap is greater than one, lowering max_difference may in fact lead to more - rather than fewer - pairings, because it increases the distance difference between the nearest and the second-nearest neighbor, so that the constraint imposed by second_nearest_gap may be fulfilled more often.
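
The stability criterion can be sketched as a small standalone check. This is only an illustration of the inequality described above, not the OpenMS implementation; the function name `isStablePair` and its parameter names are hypothetical, and the distances are assumed to have been computed already by the distance measure.

```cpp
#include <cassert>

// Hypothetical helper (not part of OpenMS): returns true if a candidate pair
// passes the stability criterion for both partners.
//   d_ij : distance between the candidate partners i and j
//   d2_i : distance from i to its second-nearest neighbor
//   d2_j : distance from j to its second-nearest neighbor
//   gap  : value of parameter second_nearest_gap
bool isStablePair(double d_ij, double d2_i, double d2_j, double gap)
{
  assert(gap >= 1.0); // the parameter is restricted to values >= 1.0
  // The second-nearest neighbors must be at least 'gap' times farther away
  // than the matched partner, seen from both sides.
  return gap * d_ij <= d2_i && gap * d_ij <= d2_j;
}
```

With `gap = 2.0` and `d_ij = 0.1`, second-nearest distances of 0.3 and 0.25 pass the check, while a second-nearest distance of 0.15 on either side rejects the pair.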

Quality calculation

The quality of a pairing is computed from the distance between the paired elements (nearest neighbors) and the distances to the second-nearest neighbors of both elements, according to the formula:

\[ q_{i,j} = \big( 1 - d_{i,j} \big) \cdot \big( 1 - \frac{g \cdot d_{i,j}}{d_{2,i}} \big) \cdot \big( 1 - \frac{g \cdot d_{i,j}}{d_{2,j}} \big) \]

\( q_{i,j} \) is the quality of the pairing of elements i and j, \( d_{i,j} \) is the distance between the two, \( d_{2,i} \) and \(d_{2,j} \) are the distances to the second-nearest neighbors of i and j, respectively, and g is the factor defined by parameter second_nearest_gap.

Note that by the definition of the distance measure, \( 0 \leq d_{i,j} \leq 1 \) if i and j are to form a pair. The criteria for pairing further require that \( g \cdot d_{i,j} \leq d_{2,i} \) and \( g \cdot d_{i,j} \leq d_{2,j} \). This ensures that the resulting quality is always between one (best) and zero (worst).
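
As a worked illustration of the quality formula, the following sketch evaluates \( q_{i,j} \) directly from the three distances and the gap factor. It is a transcription of the formula under the stated preconditions, not code taken from OpenMS; the name `pairingQuality` is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper (not part of OpenMS): evaluates the pairing quality
// q_ij = (1 - d_ij) * (1 - g * d_ij / d2_i) * (1 - g * d_ij / d2_j).
double pairingQuality(double d_ij, double d2_i, double d2_j, double gap)
{
  // Preconditions stated in the text: 0 <= d_ij <= 1 and the pairing
  // criteria g * d_ij <= d2_i, g * d_ij <= d2_j.
  assert(d_ij >= 0.0 && d_ij <= 1.0);
  assert(d2_i > 0.0 && d2_j > 0.0);
  assert(gap * d_ij <= d2_i && gap * d_ij <= d2_j);
  return (1.0 - d_ij) * (1.0 - gap * d_ij / d2_i) * (1.0 - gap * d_ij / d2_j);
}
```

For example, with \( d_{i,j} = 0.1 \), \( d_{2,i} = 0.4 \), \( d_{2,j} = 0.5 \) and \( g = 2 \), the three factors are 0.9, 0.5 and 0.6, giving a quality of 0.27. A pair at distance zero has quality one, and a pair that only just meets the criterion on one side (\( g \cdot d_{i,j} = d_{2,i} \)) has quality zero.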

For the final quality q of the consensus feature produced by merging two paired elements (i and j), the existing quality values of the two elements are taken into account. The final quality is a weighted average of the existing qualities ( \( q_i \) and \( q_j \)) and the quality of the pairing ( \( q_{i,j} \), see above):

\[ q = \frac{q_{i,j} + (s_i - 1) \cdot q_i + (s_j - 1) \cdot q_j}{s_i + s_j - 1} \]

The weighting factors \( s_i \) and \( s_j \) are the sizes (i.e. numbers of subelements) of the two consensus features i and j. That way, it is possible to link several feature maps to a growing consensus map in a stepwise fashion (as done by FeatureGroupingAlgorithmUnlabeled), and in the end obtain quality values that incorporate the qualities of all pairings that occurred during the generation of a consensus feature. Note that "missing" elements (if a consensus feature does not contain sub-features from all input maps) are not punished in this definition of quality.
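
The weighted average can be sketched in the same way. The function below is a direct transcription of the formula for \( q \); the name `mergedQuality` is hypothetical and the code is not taken from OpenMS.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of OpenMS): final quality of the consensus
// feature obtained by merging elements i and j.
//   q_ij     : quality of the pairing itself
//   q_i, q_j : existing qualities of the two consensus features
//   s_i, s_j : sizes (numbers of subelements) of the two consensus features
double mergedQuality(double q_ij, double q_i, std::size_t s_i, double q_j, std::size_t s_j)
{
  assert(s_i >= 1 && s_j >= 1);
  const double w_i = static_cast<double>(s_i - 1);
  const double w_j = static_cast<double>(s_j - 1);
  // The denominator s_i + s_j - 1 equals w_i + w_j + 1.
  return (q_ij + w_i * q_i + w_j * q_j) / (w_i + w_j + 1.0);
}
```

When both inputs are single features (\( s_i = s_j = 1 \)), the existing qualities receive zero weight and the result equals \( q_{i,j} \). Merging a consensus feature of size 3 and quality 0.5 with a single feature at pairing quality 0.8 gives (0.8 + 2 · 0.5) / 3 = 0.6.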

Parameters of this class are:

| Name | Type | Default | Restrictions | Description |
|---|---|---|---|---|
| second_nearest_gap | float | 2.0 | min: 1.0 | Only link features whose distance to the second-nearest neighbors (on both sides) is larger by a factor of 'second_nearest_gap' than the distance between the matched pair itself. |
| use_identifications | string | false | true, false | Never link features that are annotated with different peptides (features without IDs always match; only the best hit per peptide identification is considered). |
| ignore_charge | string | false | true, false | false [default]: pairing requires equal charge state (or at least one unknown charge '0'); true: pairing irrespective of charge state. |
| ignore_adduct | string | true | true, false | false: pairing requires equal adducts (or at least one without adduct annotation); true [default]: pairing irrespective of adducts. |
| distance_RT:max_difference | float | 100.0 | min: 0.0 | Never pair features with a larger RT distance (in seconds). |
| distance_RT:exponent | float | 1.0 | min: 0.0 | Normalized RT differences ([0-1], relative to 'max_difference') are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow). |
| distance_RT:weight | float | 1.0 | min: 0.0 | Final RT distances are weighted by this factor. |
| distance_MZ:max_difference | float | 0.3 | min: 0.0 | Never pair features with a larger m/z distance (unit defined by 'unit'). |
| distance_MZ:unit | string | Da | Da, ppm | Unit of the 'max_difference' parameter. |
| distance_MZ:exponent | float | 2.0 | min: 0.0 | Normalized m/z differences ([0-1], relative to 'max_difference') are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow). |
| distance_MZ:weight | float | 1.0 | min: 0.0 | Final m/z distances are weighted by this factor. |
| distance_intensity:exponent | float | 1.0 | min: 0.0 | Differences in relative intensity ([0-1]) are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow). |
| distance_intensity:weight | float | 0.0 | min: 0.0 | Final intensity distances are weighted by this factor. |
| distance_intensity:log_transform | string | disabled | enabled, disabled | Log-transform intensities? If disabled, d = abs(int_f2 - int_f1) / int_max. If enabled, d = abs(log(int_f2 + 1) - log(int_f1 + 1)) / log(int_max + 1). |
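
The parameter descriptions for the distance sections can be read as the following per-dimension calculations. This is a sketch of one possible reading of the descriptions only: the function names are hypothetical, `int_max` is taken as given (the parameter description does not say how it is determined), and how the individual terms are combined into one distance is defined in FeatureDistance and is not reproduced here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: contribution of one dimension (RT or m/z, in the same
// unit as max_difference). The difference is normalized to [0, 1] relative to
// max_difference, raised to 'exponent' and multiplied by 'weight'.
double weightedTerm(double difference, double max_difference, double exponent, double weight)
{
  assert(max_difference > 0.0);
  // Larger differences are never paired, so they are excluded here.
  assert(std::fabs(difference) <= max_difference);
  const double normalized = std::fabs(difference) / max_difference;
  return weight * std::pow(normalized, exponent);
}

// Hypothetical helper: intensity difference as given in the description of
// distance_intensity:log_transform, with and without the log transform.
double intensityDistance(double int_f1, double int_f2, double int_max, bool log_transform)
{
  assert(int_max > 0.0);
  if (log_transform)
  {
    return std::fabs(std::log(int_f2 + 1.0) - std::log(int_f1 + 1.0)) / std::log(int_max + 1.0);
  }
  return std::fabs(int_f2 - int_f1) / int_max;
}
```

With the defaults, an RT difference of 50 s gives a term of 0.5 (50 / 100, exponent 1), and an m/z difference of 0.15 Da gives 0.25 ((0.15 / 0.3) squared).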

Member Typedef Documentation

◆ Base

Base class.

Member Enumeration Documentation

◆ anonymous enum

anonymous enum
protected
Enumerator
RT 
MZ 

Constructor & Destructor Documentation

◆ StablePairFinder()

Constructor.

◆ ~StablePairFinder()

~StablePairFinder ( )
inline override

Destructor.

Member Function Documentation

◆ compatibleIDs_()

bool compatibleIDs_ ( const ConsensusFeature & feat1,
const ConsensusFeature & feat2 
) const
protected

Checks if the peptide IDs of two features are compatible.

A feature without identification is always compatible. Otherwise, two features are compatible if the best peptide hits of their identifications have the same sequences.
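
The compatibility rule can be illustrated with a simplified stand-in for the real types. In this sketch, `ToyFeature` and `compatibleIDs` are hypothetical: a feature is reduced to the set of best-hit sequences of its identifications (as plain strings), and "the same sequences" is interpreted as equality of those sets, which is an assumption about the comparison.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a consensus feature: only the best-hit sequences
// of its peptide identifications are kept. An empty set means "no ID".
struct ToyFeature
{
  std::set<std::string> best_hit_sequences;
};

// Hypothetical helper mirroring the rule described above.
bool compatibleIDs(const ToyFeature& a, const ToyFeature& b)
{
  // A feature without identification is always compatible.
  if (a.best_hit_sequences.empty() || b.best_hit_sequences.empty())
  {
    return true;
  }
  // Otherwise the best-hit sequences must agree.
  return a.best_hit_sequences == b.best_hit_sequences;
}
```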

◆ getBestHitSequence_()

const AASequence& getBestHitSequence_ ( const PeptideIdentification & peptideIdentification) const
protected

Returns the sequence of the highest scoring peptide hit in the given peptide identification.

Parameters
peptideIdentification  The peptide identification to scan.
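
Selecting the best hit can be sketched with toy types. `ToyHit` and `bestHitSequence` are hypothetical, sequences are plain strings, and the sketch assumes that a higher score is better, which may not hold for every score type used in practice.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a peptide hit.
struct ToyHit
{
  double score;
  std::string sequence;
};

// Hypothetical helper: returns a reference to the sequence of the hit with
// the highest score (assumption: higher is better). The reference stays
// valid only as long as 'hits' is alive and unmodified.
const std::string& bestHitSequence(const std::vector<ToyHit>& hits)
{
  assert(!hits.empty());
  return std::max_element(hits.begin(), hits.end(),
                          [](const ToyHit& a, const ToyHit& b) { return a.score < b.score; })
      ->sequence;
}
```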

◆ run()

void run ( const std::vector< ConsensusMap > &  input_maps,
ConsensusMap &  result_map 
)
override virtual

Run the algorithm.

Note
Exactly two input maps must be provided.
Exceptions
Exception::IllegalArgument is thrown if the input data is not valid.

Implements BaseGroupFinder.
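
The precondition on the number of input maps can be sketched as a guard like the one below. This is an illustration only: `requireTwoMaps` is a hypothetical name, and `std::invalid_argument` stands in for Exception::IllegalArgument so that the snippet compiles without OpenMS.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical guard (not part of OpenMS): rejects any input that does not
// consist of exactly two maps. MapT stands in for ConsensusMap.
template <typename MapT>
void requireTwoMaps(const std::vector<MapT>& input_maps)
{
  if (input_maps.size() != 2)
  {
    throw std::invalid_argument("exactly two input maps are required");
  }
}
```

A caller that links more than two maps would therefore have to do so stepwise, pairing the growing result with one additional map at a time.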

◆ updateMembers_()

void updateMembers_ ( )
override protected virtual

This method is used to update extra member variables at the end of the setParameters() method.

Also call it at the end of the derived classes' copy constructor and assignment operator.

The default implementation is empty.

Reimplemented from DefaultParamHandler.

Member Data Documentation

◆ second_nearest_gap_

double second_nearest_gap_
protected

The distance to the second-nearest neighbors must be larger by this factor than the distance to the matched element itself.

◆ use_IDs_

bool use_IDs_
protected

Only match if peptide IDs are compatible?