OpenMS
MapAlignmentAlgorithmTreeGuided Class Reference

A map alignment algorithm based on peptide identifications from MS2 spectra.

#include <OpenMS/ANALYSIS/MAPMATCHING/MapAlignmentAlgorithmTreeGuided.h>


Public Member Functions

 MapAlignmentAlgorithmTreeGuided ()
 Default constructor.

 ~MapAlignmentAlgorithmTreeGuided () override
 Destructor.

void treeGuidedAlignment (const std::vector< BinaryTreeNode > &tree, std::vector< FeatureMap > &feature_maps_transformed, std::vector< std::vector< double >> &maps_ranges, FeatureMap &map_transformed, std::vector< Size > &trafo_order)
 Aligns feature maps along the guide tree using align() of OpenMS::MapAlignmentAlgorithmIdentification; in each step, the tree node with the larger 10/90 percentile RT range serves as reference.

void align (std::vector< FeatureMap > &data, std::vector< TransformationDescription > &transformations)
 Aligns feature maps along the guide tree using align() of OpenMS::MapAlignmentAlgorithmIdentification; in each step, the tree node with the larger 10/90 percentile RT range serves as reference.

void computeTrafosByOriginalRT (std::vector< FeatureMap > &feature_maps, FeatureMap &map_transformed, std::vector< TransformationDescription > &transformations, const std::vector< Size > &trafo_order)
 Extract original RT ("original_RT" MetaInfo) and transformed RT for each feature to compute RT transformations.

- Public Member Functions inherited from DefaultParamHandler
 DefaultParamHandler (const String &name)
 Constructor with name that is displayed in error messages.

 DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor.

virtual ~DefaultParamHandler ()
 Destructor.

DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator.

virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator.

void setParameters (const Param &param)
 Sets the parameters.

const Param & getParameters () const
 Non-mutable access to the parameters.

const Param & getDefaults () const
 Non-mutable access to the default parameters.

const String & getName () const
 Non-mutable access to the name.

void setName (const String &name)
 Mutable access to the name.

const std::vector< String > & getSubsections () const
 Non-mutable access to the registered subsections.
 
- Public Member Functions inherited from ProgressLogger
 ProgressLogger ()
 Constructor.

virtual ~ProgressLogger ()
 Destructor.

 ProgressLogger (const ProgressLogger &other)
 Copy constructor.

ProgressLogger & operator= (const ProgressLogger &other)
 Assignment operator.

void setLogType (LogType type) const
 Sets the progress log that should be used. The default type is NONE!

LogType getLogType () const
 Returns the type of progress log being used.

void setLogger (ProgressLoggerImpl *logger)
 Sets the logger to be used for progress logging.

void startProgress (SignedSize begin, SignedSize end, const String &label) const
 Initializes the progress display.

void setProgress (SignedSize value) const
 Sets the current progress.

void endProgress (UInt64 bytes_processed=0) const

void nextProgress () const
 Increments the progress by 1 (according to the range begin-end).
 

Static Public Member Functions

static void buildTree (std::vector< FeatureMap > &feature_maps, std::vector< BinaryTreeNode > &tree, std::vector< std::vector< double >> &maps_ranges)
 Extract the RTs given for individual features of each map, calculate the distance for each pair of maps and cluster hierarchically using average linkage.

static void computeTransformedFeatureMaps (std::vector< FeatureMap > &feature_maps, const std::vector< TransformationDescription > &transformations)
 Apply transformations on input maps.

- Static Public Member Functions inherited from DefaultParamHandler
static void writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="")
 Writes all parameters to meta values.
 

Protected Types

typedef std::map< String, DoubleList > SeqAndRTList
 Type to store feature retention times given for individual peptide sequence.
 

Protected Member Functions

void updateMembers_ () override
 This method is used to update extra member variables at the end of the setParameters() method.

- Protected Member Functions inherited from DefaultParamHandler
void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor.
 

Static Protected Member Functions

static void addPeptideSequences_ (const std::vector< PeptideIdentification > &peptides, SeqAndRTList &peptide_rts, std::vector< double > &map_range, double feature_rt)
 For the given peptide identifications, extract the sequences and store them with the associated feature RT.

static void extractSeqAndRt_ (const std::vector< FeatureMap > &feature_maps, std::vector< SeqAndRTList > &maps_seq_and_rt, std::vector< std::vector< double >> &maps_ranges)
 For each input map, extract peptide identifications (sequences) of existing features with associated feature RT.
 

Protected Attributes

String model_type_
 Type of transformation model.

Param model_param_
 Default params of transformation models linear, b_spline, lowess and interpolated.

MapAlignmentAlgorithmIdentification align_algorithm_
 Instantiation of alignment algorithm.

- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters.

Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes!

std::vector< String > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes!

String error_name_
 Name that is displayed in error messages during the parameter checking.

bool check_defaults_
 If this member is set to false, no parameter checking is done.

bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when defaults are empty.

- Protected Attributes inherited from ProgressLogger
LogType type_

time_t last_invoke_

ProgressLoggerImpl * current_logger_
 

Private Member Functions

 MapAlignmentAlgorithmTreeGuided (const MapAlignmentAlgorithmTreeGuided &)
 Copy constructor intentionally not implemented -> private.

MapAlignmentAlgorithmTreeGuided & operator= (const MapAlignmentAlgorithmTreeGuided &)
 Assignment operator intentionally not implemented -> private.
 

Additional Inherited Members

- Public Types inherited from ProgressLogger
enum LogType { CMD, GUI, NONE }
 Possible log types.
 
- Static Protected Attributes inherited from ProgressLogger
static int recursion_depth_
 

Detailed Description

A map alignment algorithm based on peptide identifications from MS2 spectra.

ID groups with the same sequence in different maps represent points of correspondence in RT between the maps. They are used to evaluate the distances between the maps for hierarchical clustering and form the basis for the alignment. Only the best PSM per spectrum is considered as the correct identification.

For each pair of maps, the similarity is determined from the intersection of the contained identifications using Pearson correlation. For small intersections, the Pearson value is reduced by multiplying it by the ratio of the intersection size to the union size:

\(\texttt{PearsonValue}(\texttt{map1}\cap\texttt{map2})\cdot\frac{N(\texttt{map1}\cap\texttt{map2})}{N(\texttt{map1}\cup\texttt{map2})}\)

Hierarchical clustering with average linkage then produces a binary tree. Following the tree, the maps are aligned, resulting in a transformed feature map that contains both the original and the transformed retention times. As long as there are at least two clusters, the alignment is done as follows:

  • Of every pair of clusters, the one with the larger 10/90 percentile retention time range is selected as reference for the align() method of OpenMS::MapAlignmentAlgorithmIdentification.
  • align() aligns the median retention time of each ID group in the second cluster to the reference retention time of this group.
  • Cubic spline smoothing is used to convert this mapping to a smooth function.
  • Retention times in the second cluster are transformed to the reference scale by applying this function. Additionally, the original retention times are stored in the meta information of each feature.
  • The reference is combined with the transformed cluster.
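The pairwise similarity used for clustering (Pearson value scaled by intersection size over union size) can be sketched in self-contained C++. This is a minimal sketch under stated assumptions, not the OpenMS implementation: it uses the median RT per sequence and map as the representative value, returns 0 when the correlation is undefined, and the names medianRT, pearson and mapSimilarity are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Same shape as the SeqAndRTList typedef: peptide sequence -> feature RTs.
using SeqAndRTList = std::map<std::string, std::vector<double>>;

// Median of a non-empty RT list (taking the median is an assumption).
double medianRT(std::vector<double> rts)
{
  assert(!rts.empty());
  std::sort(rts.begin(), rts.end());
  const std::size_t n = rts.size();
  return (n % 2 == 1) ? rts[n / 2] : 0.5 * (rts[n / 2 - 1] + rts[n / 2]);
}

// Pearson correlation of paired values; 0 if fewer than two pairs or zero variance.
double pearson(const std::vector<double>& x, const std::vector<double>& y)
{
  assert(x.size() == y.size());
  const std::size_t n = x.size();
  if (n < 2) return 0.0;
  double mean_x = 0.0, mean_y = 0.0;
  for (std::size_t i = 0; i < n; ++i) { mean_x += x[i]; mean_y += y[i]; }
  mean_x /= static_cast<double>(n);
  mean_y /= static_cast<double>(n);
  double sxy = 0.0, sxx = 0.0, syy = 0.0;
  for (std::size_t i = 0; i < n; ++i)
  {
    const double dx = x[i] - mean_x;
    const double dy = y[i] - mean_y;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  if (sxx == 0.0 || syy == 0.0) return 0.0;
  return sxy / std::sqrt(sxx * syy);
}

// Pearson value over the shared sequences, scaled by |intersection| / |union|.
double mapSimilarity(const SeqAndRTList& map1, const SeqAndRTList& map2)
{
  std::vector<double> rt1, rt2;
  for (const auto& entry : map1)
  {
    const auto match = map2.find(entry.first);
    if (match == map2.end()) continue;  // sequence not shared
    rt1.push_back(medianRT(entry.second));
    rt2.push_back(medianRT(match->second));
  }
  const std::size_t n_intersection = rt1.size();
  const std::size_t n_union = map1.size() + map2.size() - n_intersection;
  if (n_union == 0) return 0.0;
  return pearson(rt1, rt2) * static_cast<double>(n_intersection) / static_cast<double>(n_union);
}
```

Two maps that share three of four distinct sequences with perfectly correlated RTs get a similarity of 0.75 here, so a small overlap lowers the score even when the shared RTs agree.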

The resulting map is used to extract transformation descriptions for each input map. For each map cubic spline smoothing is used to convert the mapping to a smooth function. Retention times of each map are transformed by applying the smoothed function.

Parameters of this class are:

Name | Type | Default | Restrictions | Description
model_type | string | b_spline | linear, b_spline, lowess, interpolated | Options to control the modeling of retention time transformations from data
model:type | string | b_spline | linear, b_spline, lowess, interpolated | Type of model
model:linear:symmetric_regression | string | false | true, false | Perform linear regression on 'y - x' vs. 'y + x', instead of on 'y' vs. 'x'.
model:linear:x_weight | string | x | 1/x, 1/x2, ln(x), x | Weight x values
model:linear:y_weight | string | y | 1/y, 1/y2, ln(y), y | Weight y values
model:linear:x_datum_min | float | 1.0e-15 | | Minimum x value
model:linear:x_datum_max | float | 1.0e15 | | Maximum x value
model:linear:y_datum_min | float | 1.0e-15 | | Minimum y value
model:linear:y_datum_max | float | 1.0e15 | | Maximum y value
model:b_spline:wavelength | float | 0.0 | min: 0.0 | Determines the amount of smoothing by setting the number of nodes for the B-spline. The number is chosen so that the spline approximates a low-pass filter with this cutoff wavelength. The wavelength is given in the same units as the data; a higher value means more smoothing. '0' sets the number of nodes to twice the number of input points.
model:b_spline:num_nodes | int | 5 | min: 0 | Number of nodes for B-spline fitting. Overrides 'wavelength' if set (to two or greater). A lower value means more smoothing.
model:b_spline:extrapolate | string | linear | linear, b_spline, constant, global_linear | Method to use for extrapolation beyond the original data range. 'linear': Linear extrapolation using the slope of the B-spline at the corresponding endpoint. 'b_spline': Use the B-spline (as for interpolation). 'constant': Use the constant value of the B-spline at the corresponding endpoint. 'global_linear': Use a linear fit through the data (which will most probably introduce discontinuities at the ends of the data range).
model:b_spline:boundary_condition | int | 2 | min: 0 max: 2 | Boundary condition at B-spline endpoints: 0 (value zero), 1 (first derivative zero) or 2 (second derivative zero)
model:lowess:span | float | 0.666666666666667 | min: 0.0 max: 1.0 | Fraction of datapoints (f) to use for each local regression (determines the amount of smoothing). Choosing this parameter in the range .2 to .8 usually results in a good fit.
model:lowess:num_iterations | int | 3 | min: 0 | Number of robustifying iterations for lowess fitting.
model:lowess:delta | float | -1.0 | | Nonnegative parameter which may be used to save computations (recommended value is 0.01 of the range of the input, e.g. for data ranging from 1000 seconds to 2000 seconds, it could be set to 10). Setting a negative value will automatically do this.
model:lowess:interpolation_type | string | cspline | linear, cspline, akima | Method to use for interpolation between datapoints computed by lowess. 'linear': Linear interpolation. 'cspline': Use the cubic spline for interpolation. 'akima': Use an akima spline for interpolation
model:lowess:extrapolation_type | string | four-point-linear | two-point-linear, four-point-linear, global-linear | Method to use for extrapolation outside the data range. 'two-point-linear': Uses a line through the first and last point to extrapolate. 'four-point-linear': Uses a line through the first and second point to extrapolate in front and a line through the last and second-to-last point in the end. 'global-linear': Uses a linear regression to fit a line through all data points and uses it for extrapolation.
model:interpolated:interpolation_type | string | cspline | linear, cspline, akima | Type of interpolation to apply.
model:interpolated:extrapolation_type | string | two-point-linear | two-point-linear, four-point-linear, global-linear | Type of extrapolation to apply: two-point-linear: use the first and last data point to build a single linear model, four-point-linear: build two linear models on both ends using the first two / last two points, global-linear: use all points to build a single linear model. Note that global-linear may not be continuous at the border.
align_algorithm:score_type | string | | | Name of the score type to use for ranking and filtering (.oms input only). If left empty, a score type is picked automatically.
align_algorithm:score_cutoff | string | false | true, false | Use only IDs above a score cut-off (parameter 'min_score') for alignment?
align_algorithm:min_score | float | 0.05 | | If 'score_cutoff' is 'true': Minimum score for an ID to be considered. Unless you have very few runs or identifications, increase this value to focus on more informative peptides.
align_algorithm:min_run_occur | int | 2 | min: 2 | Minimum number of runs (incl. reference, if any) in which a peptide must occur to be used for the alignment. Unless you have very few runs or identifications, increase this value to focus on more informative peptides.
align_algorithm:max_rt_shift | float | 0.5 | min: 0.0 | Maximum realistic RT difference for a peptide (median per run vs. reference). Peptides with higher shifts (outliers) are not used to compute the alignment. If 0, no limit (disable filter); if > 1, the final value in seconds; if <= 1, taken as a fraction of the range of the reference RT scale.
align_algorithm:use_unassigned_peptides | string | true | true, false | Should unassigned peptide identifications be used when computing an alignment of feature or consensus maps? If 'false', only peptide IDs assigned to features will be used.
align_algorithm:use_feature_rt | string | true | true, false | When aligning feature or consensus maps, don't use the retention time of a peptide identification directly; instead, use the retention time of the centroid of the feature (apex of the elution profile) that the peptide was matched to. If different identifications are matched to one feature, only the peptide closest to the centroid in RT is used. Precludes 'use_unassigned_peptides'.
align_algorithm:use_adducts | string | true | true, false | If IDs contain adducts, treat differently adducted variants of the same molecule as different.

Note:
  • If a section name is documented, the documentation is displayed as tooltip.
  • Advanced parameter names are italic.
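The three-way interpretation of align_algorithm:max_rt_shift in the parameter list (0 disables the filter, values above 1 are seconds, values up to 1 are a fraction of the reference RT range) can be made concrete with a small helper. This is an illustrative sketch only: the function name effectiveMaxRTShift and the convention of returning a negative value for "no limit" are assumptions, not part of the OpenMS API.

```cpp
#include <cassert>

// Resolves a max_rt_shift setting to a tolerance in seconds.
// A negative return value means "no limit" (hypothetical convention).
double effectiveMaxRTShift(double max_rt_shift, double ref_rt_min, double ref_rt_max)
{
  assert(max_rt_shift >= 0.0);      // the parameter is restricted to min: 0.0
  assert(ref_rt_max >= ref_rt_min);
  if (max_rt_shift == 0.0) return -1.0;           // 0: filter disabled
  if (max_rt_shift > 1.0) return max_rt_shift;    // > 1: already a value in seconds
  return max_rt_shift * (ref_rt_max - ref_rt_min);  // <= 1: fraction of the reference RT range
}
```

With the default of 0.5 and a reference RT scale from 1000 to 3000 seconds, this yields a tolerance of 1000 seconds.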

Member Typedef Documentation

◆ SeqAndRTList

typedef std::map<String, DoubleList> SeqAndRTList
protected

Type to store feature retention times given for individual peptide sequence.

Constructor & Destructor Documentation

◆ MapAlignmentAlgorithmTreeGuided() [1/2]

Default constructor.

◆ ~MapAlignmentAlgorithmTreeGuided()

Destructor.

◆ MapAlignmentAlgorithmTreeGuided() [2/2]

Copy constructor intentionally not implemented -> private.

Member Function Documentation

◆ addPeptideSequences_()

static void addPeptideSequences_ ( const std::vector< PeptideIdentification > &  peptides,
SeqAndRTList &  peptide_rts,
std::vector< double > &  map_range,
double  feature_rt
)
static protected

For the given peptide identifications, extract the sequences and store them with the associated feature RT.

Parameters
peptides  Vector of peptide identifications from which sequences are extracted.
peptide_rts  Map to store a list of feature RTs for each peptide sequence as key.
map_range  Vector in which all feature RTs are stored for given peptide identifications.
feature_rt  RT value of the feature to which the peptide identifications to be analysed belong.

◆ align()

void align ( std::vector< FeatureMap > &  data,
std::vector< TransformationDescription > &  transformations 
)

Aligns feature maps along the guide tree using align() of OpenMS::MapAlignmentAlgorithmIdentification; in each step, the tree node with the larger 10/90 percentile RT range serves as reference.

◆ buildTree()

static void buildTree ( std::vector< FeatureMap > &  feature_maps,
std::vector< BinaryTreeNode > &  tree,
std::vector< std::vector< double >> &  maps_ranges 
)
static

Extract the RTs given for individual features of each map, calculate the distance for each pair of maps and cluster hierarchically using average linkage.

Parameters
feature_maps  Vector of input maps (FeatureMap) whose distance is to be calculated.
tree  Vector of BinaryTreeNodes that will be computed.
maps_ranges  Vector to store all sorted RTs of extracted identifications for each map in feature_maps; needed to determine the 10/90 percentiles.
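Average-linkage clustering over a pairwise distance matrix can be sketched as follows. This is a naive illustration and not the OpenMS clustering code: MergeStep is a hypothetical stand-in for a tree node, the function name averageLinkage is invented, and the distance matrix is assumed to be symmetric and already computed (for example from the map similarities).

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for one node of the binary guide tree.
struct MergeStep
{
  std::size_t left;   // smallest map index in the first cluster
  std::size_t right;  // smallest map index in the second cluster
  double distance;    // average-linkage distance at which the two clusters merge
};

// Repeatedly merges the two clusters with the smallest average pairwise distance.
std::vector<MergeStep> averageLinkage(const std::vector<std::vector<double>>& dist)
{
  for (const auto& row : dist) assert(row.size() == dist.size());  // square matrix

  std::vector<std::vector<std::size_t>> clusters;
  for (std::size_t i = 0; i < dist.size(); ++i) clusters.push_back({i});

  std::vector<MergeStep> tree;
  while (clusters.size() > 1)
  {
    std::size_t best_a = 0, best_b = 1;
    double best = std::numeric_limits<double>::max();
    for (std::size_t a = 0; a < clusters.size(); ++a)
    {
      for (std::size_t b = a + 1; b < clusters.size(); ++b)
      {
        // Average of all original distances between members of a and b.
        double sum = 0.0;
        for (std::size_t i : clusters[a])
          for (std::size_t j : clusters[b]) sum += dist[i][j];
        const double avg = sum / static_cast<double>(clusters[a].size() * clusters[b].size());
        if (avg < best) { best = avg; best_a = a; best_b = b; }
      }
    }
    tree.push_back(MergeStep{clusters[best_a].front(), clusters[best_b].front(), best});
    // Merge b into a; a keeps its position, so its first element stays the smallest index.
    clusters[best_a].insert(clusters[best_a].end(), clusters[best_b].begin(), clusters[best_b].end());
    clusters.erase(clusters.begin() + static_cast<std::ptrdiff_t>(best_b));
  }
  return tree;
}
```

For n maps the result holds n - 1 merge steps in the order in which the clusters would be aligned.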

◆ computeTrafosByOriginalRT()

void computeTrafosByOriginalRT ( std::vector< FeatureMap > &  feature_maps,
FeatureMap &  map_transformed,
std::vector< TransformationDescription > &  transformations,
const std::vector< Size > &  trafo_order
)

Extract original RT ("original_RT" MetaInfo) and transformed RT for each feature to compute RT transformations.

Parameters
feature_maps  Vector of input maps for size information.
map_transformed  FeatureMap that contains all features of combined maps with original and transformed RTs in order of alignment.
transformations  Vector to store transformation descriptions for each map. (output)
trafo_order  Vector that contains the indices of aligned maps in order of alignment.
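How per-map (original RT, transformed RT) pairs might be recovered from the combined map can be sketched as below. The layout is an assumption: the combined map is taken to hold the features of each input map as one contiguous block, with blocks ordered by trafo_order. AlignedFeature, RTPairs and collectRTPairs are hypothetical names, and plain size values stand in for the input maps.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a feature carrying both RT values.
struct AlignedFeature
{
  double original_rt;     // value of the "original_RT" meta information
  double transformed_rt;  // RT after alignment
};

using RTPairs = std::vector<std::pair<double, double>>;

// Splits the combined map back into per-map lists of (original, transformed) RT pairs.
std::vector<RTPairs> collectRTPairs(const std::vector<std::size_t>& map_sizes,
                                    const std::vector<AlignedFeature>& combined,
                                    const std::vector<std::size_t>& trafo_order)
{
  std::vector<RTPairs> pairs(map_sizes.size());
  std::size_t offset = 0;
  for (std::size_t map_index : trafo_order)
  {
    assert(map_index < map_sizes.size());
    const std::size_t end = offset + map_sizes[map_index];
    for (std::size_t i = offset; i < end && i < combined.size(); ++i)
    {
      pairs[map_index].emplace_back(combined[i].original_rt, combined[i].transformed_rt);
    }
    offset = end;  // the next map's block starts where this one ends
  }
  return pairs;
}
```

Each resulting list of pairs is the kind of data from which a transformation model (for example a B-spline) could then be fitted for the corresponding map.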

◆ computeTransformedFeatureMaps()

static void computeTransformedFeatureMaps ( std::vector< FeatureMap > &  feature_maps,
const std::vector< TransformationDescription > &  transformations 
)
static

Apply transformations on input maps.

Parameters
feature_maps  Vector of maps to be transformed (output)
transformations  Vector that contains TransformationDescriptions that are applied to input maps

◆ extractSeqAndRt_()

static void extractSeqAndRt_ ( const std::vector< FeatureMap > &  feature_maps,
std::vector< SeqAndRTList > &  maps_seq_and_rt,
std::vector< std::vector< double >> &  maps_ranges 
)
static protected

For each input map, extract peptide identifications (sequences) of existing features with associated feature RT.

Parameters
feature_maps  Vector of original maps containing peptide identifications.
maps_seq_and_rt  Vector of maps to store feature RTs given for individual peptide sequences for each feature map.
maps_ranges  Vector to store all feature RTs of extracted identifications for each map; needed to determine the 10/90 percentiles.
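The bookkeeping described for extractSeqAndRt_() and addPeptideSequences_() can be illustrated without OpenMS types. In this sketch, SketchFeature and extractSeqAndRt are hypothetical; storing the feature RT once per identification, pre-sizing the output vectors and sorting each RT list at the end are assumptions, not confirmed behavior of the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Same shape as the SeqAndRTList typedef: peptide sequence -> feature RTs.
using SeqAndRTList = std::map<std::string, std::vector<double>>;

// Hypothetical stand-in for a feature with attached identifications.
struct SketchFeature
{
  double rt;                                    // feature RT
  std::vector<std::string> best_hit_sequences;  // best-hit sequence of each attached ID
};

// For each map, records the feature RT under every identified sequence
// and collects all such RTs for the later 10/90 percentile computation.
void extractSeqAndRt(const std::vector<std::vector<SketchFeature>>& maps,
                     std::vector<SeqAndRTList>& maps_seq_and_rt,
                     std::vector<std::vector<double>>& maps_ranges)
{
  // Output vectors are expected to be pre-sized, one slot per input map.
  assert(maps_seq_and_rt.size() == maps.size());
  assert(maps_ranges.size() == maps.size());
  for (std::size_t m = 0; m < maps.size(); ++m)
  {
    for (const SketchFeature& feature : maps[m])
    {
      for (const std::string& sequence : feature.best_hit_sequences)
      {
        maps_seq_and_rt[m][sequence].push_back(feature.rt);
        maps_ranges[m].push_back(feature.rt);
      }
    }
    std::sort(maps_ranges[m].begin(), maps_ranges[m].end());
  }
}
```

Features without identifications contribute nothing, so only identified features influence the distances and percentile ranges.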

◆ operator=()

Assignment operator intentionally not implemented -> private.

◆ treeGuidedAlignment()

void treeGuidedAlignment ( const std::vector< BinaryTreeNode > &  tree,
std::vector< FeatureMap > &  feature_maps_transformed,
std::vector< std::vector< double >> &  maps_ranges,
FeatureMap &  map_transformed,
std::vector< Size > &  trafo_order
)

Aligns feature maps along the guide tree using align() of OpenMS::MapAlignmentAlgorithmIdentification; in each step, the tree node with the larger 10/90 percentile RT range serves as reference.

Parameters
tree  Vector of BinaryTreeNodes that contains order for alignment.
feature_maps_transformed  Vector with input maps for transformation process. Because the transformed maps are stored within this vector it's not const.
maps_ranges  Vector that contains all sorted RTs of extracted identifications for each map; needed to determine the 10/90 percentiles.
map_transformed  FeatureMap to store all features of combined maps with original and transformed RTs in order of alignment.
trafo_order  Vector to store indices of maps in order of alignment.
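The reference choice by 10/90 percentile range can be sketched on plain sorted RT vectors such as those held in maps_ranges. The nearest-rank percentile definition, the tie-breaking toward the first cluster, and the names percentile, range1090 and chooseReference are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Nearest-rank percentile of a sorted, non-empty vector; p is a fraction in [0, 1].
double percentile(const std::vector<double>& sorted_rts, double p)
{
  assert(!sorted_rts.empty());
  const std::size_t n = sorted_rts.size();
  std::size_t rank = static_cast<std::size_t>(std::ceil(p * static_cast<double>(n)));
  if (rank < 1) rank = 1;
  if (rank > n) rank = n;
  return sorted_rts[rank - 1];
}

// Width of the RT window between the 10th and the 90th percentile.
double range1090(const std::vector<double>& sorted_rts)
{
  if (sorted_rts.empty()) return 0.0;
  return percentile(sorted_rts, 0.9) - percentile(sorted_rts, 0.1);
}

// Returns 0 if the first cluster should be the reference, 1 otherwise.
std::size_t chooseReference(const std::vector<double>& first_rts,
                            const std::vector<double>& second_rts)
{
  return range1090(first_rts) >= range1090(second_rts) ? 0 : 1;
}
```

Using the 10/90 percentile range rather than the full range makes the choice less sensitive to a few identifications at extreme retention times.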

◆ updateMembers_()

void updateMembers_ ( )
override protected virtual

This method is used to update extra member variables at the end of the setParameters() method.

Also call it at the end of the derived classes' copy constructor and assignment operator.

The default implementation is empty.

Reimplemented from DefaultParamHandler.

Member Data Documentation

◆ align_algorithm_

MapAlignmentAlgorithmIdentification align_algorithm_
protected

Instantiation of alignment algorithm.

◆ model_param_

Param model_param_
protected

Default params of transformation models linear, b_spline, lowess and interpolated.

◆ model_type_

String model_type_
protected

Type of transformation model.