OpenMS

Corrects retention time distortions between maps, using information from peptides identified in different maps.
Potential predecessor tools: XTandemAdapter (or another search engine adapter), IDFileConverter, IDMapper
Potential successor tools: IDMerger, FeatureLinkerUnlabeled or FeatureLinkerUnlabeledQT
Reference:
Weisser et al.: An automated pipeline for high-throughput label-free quantitative proteomics (J. Proteome Res., 2013, PMID: 23391308).
This tool provides an algorithm to align the retention time scales of multiple input files, correcting shifts and distortions between them. Retention time adjustment may be necessary to correct for chromatography differences, e.g. before data from multiple LC-MS runs can be combined (feature grouping), or when one run should be annotated with peptide identifications obtained in a different run.
All map alignment tools (MapAligner...) collect retention time data from the input files and, by fitting a model to this data, compute transformations that map all runs to a common retention time scale. They can apply the transformations right away and return output files with aligned time scales (parameter out), and/or return descriptions of the transformations in trafoXML format (parameter trafo_out). Transformations stored as trafoXML can be applied to arbitrary files with the MapRTTransformer tool.
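To make the idea of "fitting a model and applying the transformation" concrete, here is a minimal sketch that fits a straight line to retention time pairs by ordinary least squares and then maps further retention times onto the reference scale. The function names and example values are hypothetical, and this is not the OpenMS implementation; the tool supports several model types, of which a plain linear fit is only the simplest.

```python
from statistics import mean


def fit_linear(pairs):
    """Least-squares fit of reference_rt = slope * run_rt + intercept."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_mean, y_mean = mean(xs), mean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct retention times")
    slope = sum((x - x_mean) * (y - y_mean) for x, y in pairs) / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def apply_transformation(rts, slope, intercept):
    """Map retention times of one run onto the common (reference) scale."""
    return [slope * rt + intercept for rt in rts]


# Hypothetical (run RT, reference RT) pairs in seconds.
pairs = [(100.0, 110.0), (200.0, 215.0), (300.0, 320.0)]
slope, intercept = fit_linear(pairs)
aligned = apply_transformation([150.0, 250.0], slope, intercept)
print(slope, intercept, aligned)
```

In the tool itself, the fitted transformation is what ends up in the trafoXML output, and applying it corresponds to producing the aligned output files.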
The map alignment tools differ in how they obtain retention time data for the modeling of transformations, and consequently what types of data they can be applied to. The alignment algorithm implemented here is based on peptide identifications, and thus applicable to files containing peptide IDs (idXML, annotated featureXML/consensusXML). It finds peptide sequences that different input files have in common and uses them as points of correspondence between the inputs. For more details and algorithm-specific parameters (set in the INI file) see "Detailed Description" in the algorithm documentation.
Note that alignment is based on the sequence including modifications, thus an exact match is required. I.e., a peptide with oxidised methionine will not be matched to its unmodified version. This behavior is generally desired since (some) modifications can cause retention time shifts.
Since OpenMS 1.8, the extraction of data for the alignment has been separate from the modeling of RT transformations based on that data. It is now possible to use different models independently of the chosen algorithm. This algorithm has been tested mostly with the "b_spline" model. The available models are "linear", "b_spline", "lowess" and "interpolated".
The following parameters control the modeling of RT transformations (they can be set in the "model" section of the INI file):
| Name | Type | Default | Restrictions | Description |
|---|---|---|---|---|
| type | string | interpolated | linear, b_spline, lowess, interpolated | Type of model |
| linear:symmetric_regression | string | false | true, false | Perform linear regression on 'y - x' vs. 'y + x', instead of on 'y' vs. 'x'. |
| linear:x_weight | string | x | 1/x, 1/x2, ln(x), x | Weight x values |
| linear:y_weight | string | y | 1/y, 1/y2, ln(y), y | Weight y values |
| linear:x_datum_min | float | 1.0e-15 | | Minimum x value |
| linear:x_datum_max | float | 1.0e15 | | Maximum x value |
| linear:y_datum_min | float | 1.0e-15 | | Minimum y value |
| linear:y_datum_max | float | 1.0e15 | | Maximum y value |
| b_spline:wavelength | float | 0.0 | min: 0.0 | Determines the amount of smoothing by setting the number of nodes for the B-spline. The number is chosen so that the spline approximates a low-pass filter with this cutoff wavelength. The wavelength is given in the same units as the data; a higher value means more smoothing. '0' sets the number of nodes to twice the number of input points. |
| b_spline:num_nodes | int | 5 | min: 0 | Number of nodes for B-spline fitting. Overrides 'wavelength' if set (to two or greater). A lower value means more smoothing. |
| b_spline:extrapolate | string | linear | linear, b_spline, constant, global_linear | Method to use for extrapolation beyond the original data range. 'linear': Linear extrapolation using the slope of the B-spline at the corresponding endpoint. 'b_spline': Use the B-spline (as for interpolation). 'constant': Use the constant value of the B-spline at the corresponding endpoint. 'global_linear': Use a linear fit through the data (which will most probably introduce discontinuities at the ends of the data range). |
| b_spline:boundary_condition | int | 2 | min: 0 max: 2 | Boundary condition at B-spline endpoints: 0 (value zero), 1 (first derivative zero) or 2 (second derivative zero) |
| lowess:span | float | 0.666666666666667 | min: 0.0 max: 1.0 | Fraction of datapoints (f) to use for each local regression (determines the amount of smoothing). Choosing this parameter in the range 0.2 to 0.8 usually results in a good fit. |
| lowess:num_iterations | int | 3 | min: 0 | Number of robustifying iterations for lowess fitting. |
| lowess:delta | float | -1.0 | | Nonnegative parameter which may be used to save computations (recommended value is 0.01 of the range of the input, e.g. for data ranging from 1000 seconds to 2000 seconds, it could be set to 10). Setting a negative value will automatically do this. |
| lowess:interpolation_type | string | cspline | linear, cspline, akima | Method to use for interpolation between datapoints computed by lowess. 'linear': Linear interpolation. 'cspline': Use the cubic spline for interpolation. 'akima': Use an akima spline for interpolation. |
| lowess:extrapolation_type | string | four-point-linear | two-point-linear, four-point-linear, global-linear | Method to use for extrapolation outside the data range. 'two-point-linear': Uses a line through the first and last point to extrapolate. 'four-point-linear': Uses a line through the first and second point to extrapolate in front and a line through the last and second-to-last point in the end. 'global-linear': Uses a linear regression to fit a line through all data points and uses it for extrapolation. |
| interpolated:interpolation_type | string | cspline | linear, cspline, akima | Type of interpolation to apply. |
| interpolated:extrapolation_type | string | two-point-linear | two-point-linear, four-point-linear, global-linear | Type of extrapolation to apply. 'two-point-linear': use the first and last data point to build a single linear model. 'four-point-linear': build two linear models on both ends using the first two / last two points. 'global-linear': use all points to build a single linear model. Note that 'global-linear' may not be continuous at the border. |
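The difference between the two-point and four-point linear extrapolation options in the table can be sketched as follows. This follows only the descriptions given in the table; the function names and data are hypothetical, and the code is not taken from OpenMS.

```python
def line_through(p, q):
    """Return a function for the straight line through points p and q."""
    (x0, y0), (x1, y1) = p, q
    slope = (y1 - y0) / (x1 - x0)
    return lambda x: y0 + slope * (x - x0)


def two_point_linear(points, x):
    """One line through the first and the last data point."""
    return line_through(points[0], points[-1])(x)


def four_point_linear(points, x):
    """Separate lines at each end, built from the first two / last two points."""
    if x < points[0][0]:
        return line_through(points[0], points[1])(x)
    return line_through(points[-2], points[-1])(x)


# Hypothetical (x, y) data, sorted by x; extrapolate on both sides of the range.
points = [(0.0, 0.0), (10.0, 10.0), (20.0, 25.0), (30.0, 40.0)]
for x in (-10.0, 40.0):
    print(x, two_point_linear(points, x), four_point_linear(points, x))
```

With these values the four-point variant follows the local slope at each end of the data, while the two-point variant uses a single overall slope, so the two give different results on both sides.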
The command line parameters of this tool are:
MapAlignerIdentification -- Corrects retention time distortions between maps based on common peptide identifications.
Full documentation: http://www.openms.de/doxygen/nightly/html/TOPP_MapAlignerIdentification.html
Version: 3.2.0-pre-nightly-2024-07-19 Jul 20 2024, 02:06:43, Revision: f10e72e
To cite OpenMS:
 + Pfeuffer, J., Bielow, C., Wein, S. et al. OpenMS 3 enables reproducible analysis of large-scale mass spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.

Usage:
  MapAlignerIdentification <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed description or use the --helphelp option

Options (mandatory options marked with '*'):
  -in <files>*               Input files to align (all must have the same file type) (valid formats: 'featureXML', 'consensusXML', 'idXML', 'oms')
  -out <files>               Output files (same file type as 'in'). This option or 'trafo_out' has to be provided; they can be used together. (valid formats: 'featureXML', 'consensusXML', 'idXML', 'oms')
  -trafo_out <files>         Transformation output files. This option or 'out' has to be provided; they can be used together. (valid formats: 'trafoXML')

Options to define a reference file (use either 'file' or 'index', not both):
  -reference:file <file>     File to use as reference (valid formats: 'featureXML', 'consensusXML', 'idXML', 'oms')
  -reference:index <number>  Use one of the input files as reference ('1' for the first file, etc.). If '0', no explicit reference is set - the algorithm will select a reference. (default: '0') (min: '0')

  -design <file>             Input file containing the experimental design (valid formats: 'tsv')
  -store_original_rt         Store the original retention times (before transformation) as meta data in the output?

Common TOPP options:
  -ini <file>                Use the given TOPP INI file
  -threads <n>               Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>          Writes the default configuration file
  --help                     Shows options
  --helphelp                 Shows all options (including advanced)

The following configuration subsections are valid:
 - algorithm   Algorithm parameters section
 - model       Options to control the modeling of retention time transformations from data

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
For more information, please consult the online documentation for this tool:
  - http://www.openms.de/doxygen/nightly/html/TOPP_MapAlignerIdentification.html
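As a usage illustration, the sketch below assembles a command line from the options listed in the help text, written with the single leading dash that TOPP tools use. The helper function, its argument names and the file names are hypothetical; the check that each output list has one entry per input file is an assumption based on the outputs corresponding to the inputs. The command is only built and printed here, not executed.

```python
def build_command(inputs, outputs=None, trafo_outputs=None,
                  reference_index=0, threads=1):
    """Assemble an argument list for MapAlignerIdentification (hypothetical helper)."""
    if not inputs:
        raise ValueError("at least one input file is required")
    # The help text states that 'out' or 'trafo_out' has to be provided.
    if not outputs and not trafo_outputs:
        raise ValueError("provide 'out', 'trafo_out', or both")
    # Assumption: one output file per input file.
    for name, files in (("out", outputs), ("trafo_out", trafo_outputs)):
        if files and len(files) != len(inputs):
            raise ValueError(f"'{name}' needs one file per input file")

    cmd = ["MapAlignerIdentification", "-in", *inputs]
    if outputs:
        cmd += ["-out", *outputs]
    if trafo_outputs:
        cmd += ["-trafo_out", *trafo_outputs]
    # '0' lets the algorithm select a reference; '1' is the first input file.
    cmd += ["-reference:index", str(reference_index)]
    cmd += ["-threads", str(threads)]
    return cmd


# Hypothetical file names.
command = build_command(
    ["run1.idXML", "run2.idXML"],
    trafo_outputs=["run1.trafoXML", "run2.trafoXML"],
    reference_index=1,
)
print(" ".join(command))
```

The resulting list could be passed to `subprocess.run` on a system where OpenMS is installed and the tool is on the search path.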
INI file documentation of this tool: