OpenMS

A map alignment algorithm based on peptide identifications from MS2 spectra.
#include <OpenMS/ANALYSIS/MAPMATCHING/MapAlignmentAlgorithmTreeGuided.h>
Public Member Functions  
MapAlignmentAlgorithmTreeGuided ()  
Default constructor.
~MapAlignmentAlgorithmTreeGuided () override  
Destructor.
void  treeGuidedAlignment (const std::vector< BinaryTreeNode > &tree, std::vector< FeatureMap > &feature_maps_transformed, std::vector< std::vector< double >> &maps_ranges, FeatureMap &map_transformed, std::vector< Size > &trafo_order) 
Align feature maps guided by the tree using align() of OpenMS::MapAlignmentAlgorithmIdentification, using the TreeNode with the larger 10/90 percentile range as reference.
void  align (std::vector< FeatureMap > &data, std::vector< TransformationDescription > &transformations) 
Align feature maps guided by the tree using align() of OpenMS::MapAlignmentAlgorithmIdentification, using the TreeNode with the larger 10/90 percentile range as reference.
void  computeTrafosByOriginalRT (std::vector< FeatureMap > &feature_maps, FeatureMap &map_transformed, std::vector< TransformationDescription > &transformations, const std::vector< Size > &trafo_order) 
Extract original RT ("original_RT" MetaInfo) and transformed RT for each feature to compute RT transformations.
Public Member Functions inherited from DefaultParamHandler  
DefaultParamHandler (const String &name)  
Constructor with name that is displayed in error messages.
DefaultParamHandler (const DefaultParamHandler &rhs)  
Copy constructor.
virtual  ~DefaultParamHandler () 
Destructor.
DefaultParamHandler &  operator= (const DefaultParamHandler &rhs) 
Assignment operator.
virtual bool  operator== (const DefaultParamHandler &rhs) const 
Equality operator.
void  setParameters (const Param &param) 
Sets the parameters.
const Param &  getParameters () const 
Nonmutable access to the parameters.
const Param &  getDefaults () const 
Nonmutable access to the default parameters.
const String &  getName () const 
Nonmutable access to the name.
void  setName (const String &name) 
Mutable access to the name.
const std::vector< String > &  getSubsections () const 
Nonmutable access to the registered subsections.
Public Member Functions inherited from ProgressLogger  
ProgressLogger ()  
Constructor.
virtual  ~ProgressLogger () 
Destructor.
ProgressLogger (const ProgressLogger &other)  
Copy constructor.
ProgressLogger &  operator= (const ProgressLogger &other) 
Assignment operator.
void  setLogType (LogType type) const 
Sets the progress log that should be used. The default type is NONE!
LogType  getLogType () const 
Returns the type of progress log being used.
void  setLogger (ProgressLoggerImpl *logger) 
Sets the logger to be used for progress logging.
void  startProgress (SignedSize begin, SignedSize end, const String &label) const 
Initializes the progress display.
void  setProgress (SignedSize value) const 
Sets the current progress.
void  endProgress (UInt64 bytes_processed=0) const 
void  nextProgress () const 
Increment progress by 1 (according to range begin-end).
Static Public Member Functions  
static void  buildTree (std::vector< FeatureMap > &feature_maps, std::vector< BinaryTreeNode > &tree, std::vector< std::vector< double >> &maps_ranges) 
Extract RTs given for individual features of each map, calculate distances for each pair of maps and cluster hierarchically using average linkage.
static void  computeTransformedFeatureMaps (std::vector< FeatureMap > &feature_maps, const std::vector< TransformationDescription > &transformations) 
Apply transformations on input maps.
Static Public Member Functions inherited from DefaultParamHandler  
static void  writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="") 
Writes all parameters to meta values.
Protected Types  
typedef std::map< String, DoubleList >  SeqAndRTList 
Type to store feature retention times given for individual peptide sequence.
Protected Member Functions  
void  updateMembers_ () override 
This method is used to update extra member variables at the end of the setParameters() method.
Protected Member Functions inherited from DefaultParamHandler  
void  defaultsToParam_ () 
Updates the parameters after the defaults have been set in the constructor.
Static Protected Member Functions  
static void  addPeptideSequences_ (const std::vector< PeptideIdentification > &peptides, SeqAndRTList &peptide_rts, std::vector< double > &map_range, double feature_rt) 
For given peptide identifications extract sequences and store with associated feature RT.
static void  extractSeqAndRt_ (const std::vector< FeatureMap > &feature_maps, std::vector< SeqAndRTList > &maps_seq_and_rt, std::vector< std::vector< double >> &maps_ranges) 
For each input map, extract peptide identifications (sequences) of existing features with associated feature RT.
Protected Attributes  
String  model_type_ 
Type of transformation model.
Param  model_param_ 
Default params of transformation models linear, b_spline, lowess and interpolated.
MapAlignmentAlgorithmIdentification  align_algorithm_ 
Instantiation of alignment algorithm.
Protected Attributes inherited from DefaultParamHandler  
Param  param_ 
Container for current parameters.
Param  defaults_ 
Container for default parameters. This member should be filled in the constructor of derived classes!
std::vector< String >  subsections_ 
Container for registered subsections. This member should be filled in the constructor of derived classes!
String  error_name_ 
Name that is displayed in error messages during the parameter checking.
bool  check_defaults_ 
If this member is set to false, no checking of parameters is done.
bool  warn_empty_defaults_ 
If this member is set to false, no warning is emitted when defaults are empty.
Protected Attributes inherited from ProgressLogger  
LogType  type_ 
time_t  last_invoke_ 
ProgressLoggerImpl *  current_logger_ 
Private Member Functions  
MapAlignmentAlgorithmTreeGuided (const MapAlignmentAlgorithmTreeGuided &)  
Copy constructor intentionally not implemented -> private.
MapAlignmentAlgorithmTreeGuided &  operator= (const MapAlignmentAlgorithmTreeGuided &) 
Assignment operator intentionally not implemented -> private.
Additional Inherited Members  
Public Types inherited from ProgressLogger  
enum  LogType { CMD , GUI , NONE } 
Possible log types.
Static Protected Attributes inherited from ProgressLogger  
static int  recursion_depth_ 
A map alignment algorithm based on peptide identifications from MS2 spectra.
ID groups with the same sequence in different maps represent points of correspondence in RT between the maps. They are used to evaluate the distances between the maps for hierarchical clustering and form the basis for the alignment. Only the best PSM per spectrum is considered as the correct identification.
For each pair of maps, the similarity is determined based on the intersection of the contained identifications using Pearson correlation. For small intersections, the Pearson value is reduced by multiplying it by the ratio of the intersection size to the union size:
\(\texttt{PearsonValue}(\texttt{map1}\cap\texttt{map2})\cdot\Bigl(\frac{\texttt{N}(\texttt{map1}\cap\texttt{map2})}{\texttt{N}(\texttt{map1}\cup\texttt{map2})}\Bigr)\)
Using hierarchical clustering together with average linkage, a binary tree is produced. Following the tree, the maps are aligned, resulting in a transformed feature map that contains both the original and the transformed retention times.
As long as there are at least two clusters, the alignment is done as follows: Of every pair of clusters, the one with the larger 10/90 percentile retention time range is selected as reference for the align() method of OpenMS::MapAlignmentAlgorithmIdentification. align() aligns the median retention time of each ID group in the second cluster to the reference retention time of this group. Cubic spline smoothing is used to convert this mapping to a smooth function. Retention times in the second cluster are transformed to the reference scale by applying this function. Additionally, the original retention times are stored in the meta information of each feature. The reference is combined with the transformed cluster.
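The similarity score described above can be illustrated with a minimal, standalone sketch. It does not use OpenMS types: `SeqToRT`, `pearson` and `weightedSimilarity` are hypothetical names, and each map is assumed to be reduced to one representative RT per peptide sequence. The actual implementation may differ in how RTs per sequence are summarized.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in: peptide sequence -> one representative RT in a map.
using SeqToRT = std::map<std::string, double>;

// Pearson correlation of two equally long vectors; returns 0 if undefined.
double pearson(const std::vector<double>& x, const std::vector<double>& y)
{
  const std::size_t n = x.size();
  if (n < 2 || y.size() != n) return 0.0;
  double mean_x = 0.0, mean_y = 0.0;
  for (std::size_t i = 0; i < n; ++i) { mean_x += x[i]; mean_y += y[i]; }
  mean_x /= static_cast<double>(n);
  mean_y /= static_cast<double>(n);
  double sxy = 0.0, sxx = 0.0, syy = 0.0;
  for (std::size_t i = 0; i < n; ++i)
  {
    const double dx = x[i] - mean_x;
    const double dy = y[i] - mean_y;
    sxy += dx * dy; sxx += dx * dx; syy += dy * dy;
  }
  if (sxx == 0.0 || syy == 0.0) return 0.0;
  return sxy / std::sqrt(sxx * syy);
}

// Pearson value on the shared sequences, scaled by |intersection| / |union|.
double weightedSimilarity(const SeqToRT& map1, const SeqToRT& map2)
{
  std::vector<double> rt1, rt2;
  for (const auto& entry : map1)
  {
    const auto it = map2.find(entry.first);
    if (it != map2.end()) { rt1.push_back(entry.second); rt2.push_back(it->second); }
  }
  const std::size_t n_intersection = rt1.size();
  const std::size_t n_union = map1.size() + map2.size() - n_intersection;
  if (n_union == 0) return 0.0;
  return pearson(rt1, rt2) * static_cast<double>(n_intersection) / static_cast<double>(n_union);
}
```

For two maps that share three peptides with perfectly correlated RTs and have four distinct peptides in total, the Pearson value of 1 is scaled by 3/4.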
The resulting map is used to extract transformation descriptions for each input map. For each map, cubic spline smoothing is used to convert the mapping to a smooth function. Retention times of each map are transformed by applying the smoothed function.
Parameters of this class are:

| Name | Type | Default | Restrictions | Description |
|---|---|---|---|---|
| model_type | string | b_spline | linear, b_spline, lowess, interpolated | Options to control the modeling of retention time transformations from data |
| model:type | string | b_spline | linear, b_spline, lowess, interpolated | Type of model |
| model:linear:symmetric_regression | string | false | true, false | Perform linear regression on 'y - x' vs. 'y + x', instead of on 'y' vs. 'x'. |
| model:linear:x_weight | string | x | 1/x, 1/x2, ln(x), x | Weight x values |
| model:linear:y_weight | string | y | 1/y, 1/y2, ln(y), y | Weight y values |
| model:linear:x_datum_min | float | 1.0e-15 | | Minimum x value |
| model:linear:x_datum_max | float | 1.0e15 | | Maximum x value |
| model:linear:y_datum_min | float | 1.0e-15 | | Minimum y value |
| model:linear:y_datum_max | float | 1.0e15 | | Maximum y value |
| model:b_spline:wavelength | float | 0.0 | min: 0.0 | Determines the amount of smoothing by setting the number of nodes for the B-spline. The number is chosen so that the spline approximates a low-pass filter with this cutoff wavelength. The wavelength is given in the same units as the data; a higher value means more smoothing. '0' sets the number of nodes to twice the number of input points. |
| model:b_spline:num_nodes | int | 5 | min: 0 | Number of nodes for B-spline fitting. Overrides 'wavelength' if set (to two or greater). A lower value means more smoothing. |
| model:b_spline:extrapolate | string | linear | linear, b_spline, constant, global_linear | Method to use for extrapolation beyond the original data range. 'linear': Linear extrapolation using the slope of the B-spline at the corresponding endpoint. 'b_spline': Use the B-spline (as for interpolation). 'constant': Use the constant value of the B-spline at the corresponding endpoint. 'global_linear': Use a linear fit through the data (which will most probably introduce discontinuities at the ends of the data range). |
| model:b_spline:boundary_condition | int | 2 | min: 0 max: 2 | Boundary condition at B-spline endpoints: 0 (value zero), 1 (first derivative zero) or 2 (second derivative zero) |
| model:lowess:span | float | 0.666666666666667 | min: 0.0 max: 1.0 | Fraction of datapoints (f) to use for each local regression (determines the amount of smoothing). Choosing this parameter in the range .2 to .8 usually results in a good fit. |
| model:lowess:num_iterations | int | 3 | min: 0 | Number of robustifying iterations for lowess fitting. |
| model:lowess:delta | float | -1.0 | | Nonnegative parameter which may be used to save computations (recommended value is 0.01 of the range of the input, e.g. for data ranging from 1000 seconds to 2000 seconds, it could be set to 10). Setting a negative value will automatically do this. |
| model:lowess:interpolation_type | string | cspline | linear, cspline, akima | Method to use for interpolation between datapoints computed by lowess. 'linear': Linear interpolation. 'cspline': Use the cubic spline for interpolation. 'akima': Use an akima spline for interpolation |
| model:lowess:extrapolation_type | string | four-point-linear | two-point-linear, four-point-linear, global-linear | Method to use for extrapolation outside the data range. 'two-point-linear': Uses a line through the first and last point to extrapolate. 'four-point-linear': Uses a line through the first and second point to extrapolate in front and a line through the last and second-to-last point in the end. 'global-linear': Uses a linear regression to fit a line through all data points and uses it for interpolation. |
| model:interpolated:interpolation_type | string | cspline | linear, cspline, akima | Type of interpolation to apply. |
| model:interpolated:extrapolation_type | string | two-point-linear | two-point-linear, four-point-linear, global-linear | Type of extrapolation to apply: two-point-linear: use the first and last data point to build a single linear model, four-point-linear: build two linear models on both ends using the first two / last two points, global-linear: use all points to build a single linear model. Note that global-linear may not be continuous at the border. |
| align_algorithm:score_type | string | | | Name of the score type to use for ranking and filtering (.oms input only). If left empty, a score type is picked automatically. |
| align_algorithm:score_cutoff | string | false | true, false | Use only IDs above a score cutoff (parameter 'min_score') for alignment? |
| align_algorithm:min_score | float | 0.05 | | If 'score_cutoff' is 'true': Minimum score for an ID to be considered. Unless you have very few runs or identifications, increase this value to focus on more informative peptides. |
| align_algorithm:min_run_occur | int | 2 | min: 2 | Minimum number of runs (incl. reference, if any) in which a peptide must occur to be used for the alignment. Unless you have very few runs or identifications, increase this value to focus on more informative peptides. |
| align_algorithm:max_rt_shift | float | 0.5 | min: 0.0 | Maximum realistic RT difference for a peptide (median per run vs. reference). Peptides with higher shifts (outliers) are not used to compute the alignment. If 0, no limit (disable filter); if > 1, the final value in seconds; if <= 1, taken as a fraction of the range of the reference RT scale. |
| align_algorithm:use_unassigned_peptides | string | true | true, false | Should unassigned peptide identifications be used when computing an alignment of feature or consensus maps? If 'false', only peptide IDs assigned to features will be used. |
| align_algorithm:use_feature_rt | string | true | true, false | When aligning feature or consensus maps, don't use the retention time of a peptide identification directly; instead, use the retention time of the centroid of the feature (apex of the elution profile) that the peptide was matched to. If different identifications are matched to one feature, only the peptide closest to the centroid in RT is used. Precludes 'use_unassigned_peptides'. |
| align_algorithm:use_adducts | string | true | true, false | If IDs contain adducts, treat differently adducted variants of the same molecule as different. |
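The three-way interpretation of 'align_algorithm:max_rt_shift' in the table above can be spelled out as a small helper. This is only a sketch of the documented rule: `resolveMaxRtShift` and its arguments are hypothetical names, not part of the OpenMS API.

```cpp
#include <limits>

// Turn the 'max_rt_shift' parameter into an absolute limit in seconds.
//   0      -> no limit (filter disabled)
//   > 1    -> value is already given in seconds
//   <= 1   -> fraction of the RT range of the reference
double resolveMaxRtShift(double max_rt_shift, double ref_rt_min, double ref_rt_max)
{
  if (max_rt_shift == 0.0) return std::numeric_limits<double>::infinity();
  if (max_rt_shift > 1.0) return max_rt_shift;
  return max_rt_shift * (ref_rt_max - ref_rt_min);
}
```

With the default of 0.5 and a reference spanning 1000 to 2000 seconds, peptides shifted by more than 500 seconds would be treated as outliers under this reading.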

typedef std::map< String, DoubleList > SeqAndRTList
protected
Type to store feature retention times given for individual peptide sequence.
MapAlignmentAlgorithmTreeGuided ()
Default constructor.

~MapAlignmentAlgorithmTreeGuided ()
override
Destructor.

MapAlignmentAlgorithmTreeGuided (const MapAlignmentAlgorithmTreeGuided &)
private
Copy constructor intentionally not implemented -> private.

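static void addPeptideSequences_ (const std::vector< PeptideIdentification > &peptides, SeqAndRTList &peptide_rts, std::vector< double > &map_range, double feature_rt)
static protected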
For given peptide identifications extract sequences and store with associated feature RT.
peptides  Vector of peptide identifications to extract sequences. 
peptide_rts  Map to store a list of feature RTs for each peptide sequence as key. 
map_range  Vector in which all feature RTs are stored for given peptide identifications. 
feature_rt  RT value of the feature to which the peptide identifications to be analysed belong. 
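A minimal sketch of the bookkeeping this function is documented to perform, using standard-library stand-ins rather than OpenMS types. `PeptideId`, `SeqAndRTs` and `addPeptideSequences` are hypothetical; taking only the first hit of each identification reflects the statement above that only the best PSM per spectrum is considered, and it assumes hits are sorted best first.

```cpp
#include <map>
#include <string>
#include <vector>

// Sequence -> list of feature RTs (mirrors the role of SeqAndRTList).
using SeqAndRTs = std::map<std::string, std::vector<double>>;

// Hypothetical stand-in for a peptide identification; hits sorted best first.
struct PeptideId
{
  std::vector<std::string> hit_sequences;
};

void addPeptideSequences(const std::vector<PeptideId>& peptides,
                         SeqAndRTs& peptide_rts,
                         std::vector<double>& map_range,
                         double feature_rt)
{
  for (const PeptideId& pep : peptides)
  {
    if (pep.hit_sequences.empty()) continue;              // nothing identified
    const std::string& best = pep.hit_sequences.front();  // best PSM only
    peptide_rts[best].push_back(feature_rt);              // RT list per sequence
    map_range.push_back(feature_rt);                      // all RTs of this map
  }
}
```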
void align  (  std::vector< FeatureMap > &  data, 
std::vector< TransformationDescription > &  transformations  
) 
Align feature maps guided by the tree using align() of OpenMS::MapAlignmentAlgorithmIdentification, using the TreeNode with the larger 10/90 percentile range as reference.

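static void buildTree (std::vector< FeatureMap > &feature_maps, std::vector< BinaryTreeNode > &tree, std::vector< std::vector< double >> &maps_ranges)
static
Extract RTs given for individual features of each map, calculate distances for each pair of maps and cluster hierarchically using average linkage.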
feature_maps  Vector of input maps (FeatureMap) whose distance is to be calculated. 
tree  Vector of BinaryTreeNodes that will be computed 
maps_ranges  Vector to store all sorted RTs of extracted identifications for each map in feature_maps ; needed to determine the 10/90 percentiles 
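Average-linkage clustering over a pairwise distance matrix can be sketched as follows. This is a naive standalone illustration, not the OpenMS clustering code: `Merge` and `averageLinkage` are hypothetical names, each cluster is labelled by its smallest member index, and the distances are assumed to be precomputed (for example as one minus the similarity score, which is an assumption here).

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// One merge step of the binary tree: the two merged clusters and their distance.
struct Merge
{
  std::size_t left;
  std::size_t right;
  double distance;
};

std::vector<Merge> averageLinkage(const std::vector<std::vector<double>>& dist)
{
  // Start with one cluster per map.
  std::vector<std::vector<std::size_t>> clusters;
  for (std::size_t i = 0; i < dist.size(); ++i)
  {
    clusters.push_back(std::vector<std::size_t>(1, i));
  }
  std::vector<Merge> tree;
  while (clusters.size() > 1)
  {
    std::size_t best_a = 0, best_b = 1;
    double best = std::numeric_limits<double>::max();
    for (std::size_t a = 0; a < clusters.size(); ++a)
    {
      for (std::size_t b = a + 1; b < clusters.size(); ++b)
      {
        // Average linkage: mean distance over all cross-cluster pairs.
        double sum = 0.0;
        for (std::size_t i : clusters[a])
          for (std::size_t j : clusters[b]) sum += dist[i][j];
        const double avg = sum / static_cast<double>(clusters[a].size() * clusters[b].size());
        if (avg < best) { best = avg; best_a = a; best_b = b; }
      }
    }
    tree.push_back({clusters[best_a].front(), clusters[best_b].front(), best});
    // Merge cluster b into cluster a and drop b.
    clusters[best_a].insert(clusters[best_a].end(), clusters[best_b].begin(), clusters[best_b].end());
    clusters.erase(clusters.begin() + static_cast<std::ptrdiff_t>(best_b));
  }
  return tree;
}
```

The returned merges are in the order in which a tree-guided alignment would process them: closest maps first, larger clusters later.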
void computeTrafosByOriginalRT  (  std::vector< FeatureMap > &  feature_maps, 
FeatureMap &  map_transformed,  
std::vector< TransformationDescription > &  transformations,  
const std::vector< Size > &  trafo_order  
) 
Extract original RT ("original_RT" MetaInfo) and transformed RT for each feature to compute RT transformations.
feature_maps  Vector of input maps for size information. 
map_transformed  FeatureMap that contains all features of combined maps with original and transformed RTs in order of alignment. 
transformations  Vector to store transformation descriptions for each map. (output) 
trafo_order  Vector that contains the indices of aligned maps in order of alignment. 
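The way the combined map, the map sizes and trafo_order fit together can be sketched like this. The code assumes that the features of each input map form one contiguous block in the combined map, in the order given by trafo_order; that layout is an assumption for illustration. `Feature`, `DataPoints` and `collectRtPairs` are hypothetical names.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a feature carrying both RT values.
struct Feature
{
  double rt;           // transformed RT
  double original_rt;  // value of the "original_RT" meta entry
};

// (original RT, transformed RT) pairs for one input map.
using DataPoints = std::vector<std::pair<double, double>>;

std::vector<DataPoints> collectRtPairs(const std::vector<std::size_t>& map_sizes,
                                       const std::vector<Feature>& combined,
                                       const std::vector<std::size_t>& trafo_order)
{
  std::vector<DataPoints> pairs(map_sizes.size());
  std::size_t pos = 0;  // read position in the combined map
  for (std::size_t map_idx : trafo_order)
  {
    // Assumed layout: the next map_sizes[map_idx] features belong to map_idx.
    for (std::size_t i = 0; i < map_sizes[map_idx] && pos < combined.size(); ++i, ++pos)
    {
      pairs[map_idx].emplace_back(combined[pos].original_rt, combined[pos].rt);
    }
  }
  return pairs;
}
```

Each resulting list of pairs is the kind of data a transformation model (for example the default b_spline) would then be fitted to.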

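static void computeTransformedFeatureMaps (std::vector< FeatureMap > &feature_maps, const std::vector< TransformationDescription > &transformations)
static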
Apply transformations on input maps.
feature_maps  Vector of maps to be transformed (output) 
transformations  Vector that contains TransformationDescriptions that are applied to input maps 
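Applying a fitted transformation to the RTs of a map can be illustrated with a deliberately simplified model. The sketch below swaps in piecewise-linear interpolation instead of the B-spline used by default, and extends the first and last segment beyond the data range. `DataPoints`, `transformRt` and `transformMap` are hypothetical names, and the data points are assumed to be sorted by strictly increasing original RT.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// (original RT, aligned RT), sorted by strictly increasing original RT.
using DataPoints = std::vector<std::pair<double, double>>;

// Piecewise-linear stand-in for a fitted RT transformation.
double transformRt(const DataPoints& pts, double rt)
{
  assert(pts.size() >= 2);
  // First data point whose original RT is greater than 'rt'.
  auto upper = std::upper_bound(pts.begin(), pts.end(), rt,
                                [](double value, const std::pair<double, double>& p)
                                { return value < p.first; });
  if (upper == pts.begin()) upper = pts.begin() + 1;  // below range: first segment
  if (upper == pts.end()) upper = pts.end() - 1;      // above range: last segment
  const auto lower = upper - 1;
  const double slope = (upper->second - lower->second) / (upper->first - lower->first);
  return lower->second + slope * (rt - lower->first);
}

// Overwrite the RTs of one map in place, mirroring the "(output)" role of feature_maps.
void transformMap(std::vector<double>& feature_rts, const DataPoints& pts)
{
  for (double& rt : feature_rts) rt = transformRt(pts, rt);
}
```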

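static void extractSeqAndRt_ (const std::vector< FeatureMap > &feature_maps, std::vector< SeqAndRTList > &maps_seq_and_rt, std::vector< std::vector< double >> &maps_ranges)
static protected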
For each input map, extract peptide identifications (sequences) of existing features with associated feature RT.
feature_maps  Vector of original maps containing peptide identifications. 
maps_seq_and_rt  Vector of maps to store feature RTs given for individual peptide sequences for each feature map. 
maps_ranges  Vector to store all feature RTs of extracted identifications for each map; needed to determine the 10/90 percentiles. 

MapAlignmentAlgorithmTreeGuided & operator= (const MapAlignmentAlgorithmTreeGuided &)
private
Assignment operator intentionally not implemented -> private.
void treeGuidedAlignment  (  const std::vector< BinaryTreeNode > &  tree, 
std::vector< FeatureMap > &  feature_maps_transformed,  
std::vector< std::vector< double >> &  maps_ranges,  
FeatureMap &  map_transformed,  
std::vector< Size > &  trafo_order  
) 
Align feature maps guided by the tree using align() of OpenMS::MapAlignmentAlgorithmIdentification, using the TreeNode with the larger 10/90 percentile range as reference.
tree  Vector of BinaryTreeNodes that contains order for alignment. 
feature_maps_transformed  Vector with input maps for transformation process. Because the transformed maps are stored within this vector it's not const. 
maps_ranges  Vector that contains all sorted RTs of extracted identifications for each map; needed to determine the 10/90 percentiles. 
map_transformed  FeatureMap to store all features of combined maps with original and transformed RTs in order of alignment. 
trafo_order  Vector to store indices of maps in order of alignment. 
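Choosing the reference from the sorted RT lists in maps_ranges can be sketched as below. The exact percentile index convention used by OpenMS is not stated here, so the integer-index rule is an assumption, as are the names `percentileRange` and `chooseReference`.

```cpp
#include <cstddef>
#include <vector>

// Difference between the 90th and 10th percentile of sorted RTs.
// Assumed index convention: floor(p * (n - 1)), done in integer arithmetic.
double percentileRange(const std::vector<double>& sorted_rts)
{
  if (sorted_rts.empty()) return 0.0;
  const std::size_t n = sorted_rts.size();
  const std::size_t lo = (n - 1) / 10;
  const std::size_t hi = (9 * (n - 1)) / 10;
  return sorted_rts[hi] - sorted_rts[lo];
}

// Returns 0 if the first cluster should be the reference, 1 otherwise.
// Ties go to the first cluster (an arbitrary choice in this sketch).
std::size_t chooseReference(const std::vector<double>& first_sorted_rts,
                            const std::vector<double>& second_sorted_rts)
{
  return percentileRange(first_sorted_rts) >= percentileRange(second_sorted_rts) ? 0 : 1;
}
```

Using the 10/90 percentile range rather than the full min/max range makes the choice less sensitive to a few identifications at the extremes of the gradient.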

void updateMembers_ ()
override protected virtual
This method is used to update extra member variables at the end of the setParameters() method.
Also call it at the end of the derived classes' copy constructor and assignment operator.
The default implementation is empty.
Reimplemented from DefaultParamHandler.

MapAlignmentAlgorithmIdentification align_algorithm_
protected
Instantiation of alignment algorithm.

Param model_param_
protected
Default params of transformation models linear, b_spline, lowess and interpolated.

String model_type_
protected
Type of transformation model.