Corrects retention time distortions between maps, using information from peptides identified in different maps.

potential predecessor tools	→ MapAlignerIdentification →	potential successor tools
CometAdapter (or another search engine adapter)		IDMerger
IDFileConverter		FeatureLinkerUnlabeled or FeatureLinkerUnlabeledQT
IDMapper		FeatureLinkerUnlabeled or FeatureLinkerUnlabeledQT

Reference:
Weisser et al.: An automated pipeline for high-throughput label-free quantitative proteomics (J. Proteome Res., 2013, PMID: 23391308).

This tool provides an algorithm to align the retention time scales of multiple input files, correcting shifts and distortions between them. Retention time adjustment may be necessary to correct for chromatography differences, e.g. before data from multiple LC-MS runs can be combined (feature grouping), or when one run should be annotated with peptide identifications obtained in a different run.

All map alignment tools (MapAligner...) collect retention time data from the input files and - by fitting a model to this data - compute transformations that map all runs to a common retention time scale. They can apply the transformations right away and return output files with aligned time scales (parameter out), and/or return descriptions of the transformations in trafoXML format (parameter trafo_out). Transformations stored as trafoXML can be applied to arbitrary files with the MapRTTransformer tool.
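
To make the idea of "fitting a model and applying the transformation" concrete, here is a minimal sketch that fits a straight line to pairs of (observed RT, reference RT) and uses it to map a retention time onto the reference scale. The function names and the example data are hypothetical, and an ordinary least-squares fit is assumed; this is not the OpenMS implementation.

```python
def fit_linear(pairs):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apply_linear(model, rt):
    """Map a retention time onto the common scale using a fitted model."""
    slope, intercept = model
    return slope * rt + intercept


# Hypothetical data: this run elutes 2% slower and 30 s later than the reference.
pairs = [(rt * 1.02 + 30.0, rt) for rt in (600.0, 1200.0, 1800.0, 2400.0)]
model = fit_linear(pairs)
aligned = apply_linear(model, 1200.0 * 1.02 + 30.0)
print(round(aligned, 3))
```

In the tool itself the fitted transformation is either applied directly (parameter out) or written to trafoXML (parameter trafo_out).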

The map alignment tools differ in how they obtain retention time data for the modeling of transformations, and consequently what types of data they can be applied to. The alignment algorithm implemented here is based on peptide identifications, and thus applicable to files containing peptide IDs (idXML/idparquet, annotated featureXML/featureparquet/consensusXML/consensusparquet). It finds peptide sequences that different input files have in common and uses them as points of correspondence between the inputs. For more details and algorithm-specific parameters (set in the INI file) see "Detailed Description" in the algorithm documentation.
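
The notion of shared peptide sequences acting as points of correspondence can be sketched as follows. This is an illustration only: the dictionaries, the function name, and the use of the median RT per sequence are assumptions made for the example, not a description of the actual algorithm code.

```python
from statistics import median


def correspondence_points(run, reference):
    """Pair the median RT of every peptide sequence found in both inputs.

    Both arguments map a peptide sequence string to a list of observed RTs.
    Sequences are compared as exact strings.
    Returns a sorted list of (run_rt, reference_rt) tuples.
    """
    shared = set(run) & set(reference)
    return sorted((median(run[seq]), median(reference[seq])) for seq in shared)


# Hypothetical identifications (sequence -> retention times in seconds)
run_a = {"PEPTIDEK": [1010.0, 1014.0], "ELVISK": [1530.0], "ONLYINA": [400.0]}
reference = {"PEPTIDEK": [1000.0], "ELVISK": [1500.0, 1502.0], "ONLYINREF": [900.0]}

points = correspondence_points(run_a, reference)
print(points)
```

Sequences that occur in only one of the two inputs contribute no point, which is why the number of shared identifications limits how well the transformation can be modeled.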

See also: MapAlignerPoseClustering, MapRTTransformer

Note that alignment is based on the sequence including modifications, thus an exact match is required. I.e., a peptide with oxidised methionine will not be matched to its unmodified version. This behavior is generally desired since (some) modifications can cause retention time shifts.

Since OpenMS 1.8, the extraction of data for the alignment has been separate from the modeling of RT transformations based on that data. It is now possible to use different models independently of the chosen algorithm. This algorithm has been tested mostly with the "b_spline" model. The different available models are:

linear: Linear model.
b_spline: Smoothing spline (non-linear).
lowess: Local regression (non-linear).
interpolated: Different types of interpolation.

The following parameters control the modeling of RT transformations (they can be set in the "model" section of the INI file):

Name	Type	Default	Restrictions	Description
type	string	interpolated	linear, b_spline, lowess, interpolated	Type of model
linear:symmetric_regression	string	false	true, false	Perform linear regression on 'y - x' vs. 'y + x', instead of on 'y' vs. 'x'.
linear:x_weight	string	x	1/x, 1/x2, ln(x), x	Weight x values
linear:y_weight	string	y	1/y, 1/y2, ln(y), y	Weight y values
linear:x_datum_min	float	1.0e-15		Minimum x value
linear:x_datum_max	float	1.0e15		Maximum x value
linear:y_datum_min	float	1.0e-15		Minimum y value
linear:y_datum_max	float	1.0e15		Maximum y value
b_spline:wavelength	float	0.0	min: 0.0	Determines the amount of smoothing by setting the number of nodes for the B-spline. The number is chosen so that the spline approximates a low-pass filter with this cutoff wavelength. The wavelength is given in the same units as the data; a higher value means more smoothing. '0' sets the number of nodes to twice the number of input points.
b_spline:num_nodes	int	5	min: 0	Number of nodes for B-spline fitting. Overrides 'wavelength' if set (to two or greater). A lower value means more smoothing.
b_spline:extrapolate	string	linear	linear, b_spline, constant, global_linear	Method to use for extrapolation beyond the original data range. 'linear': Linear extrapolation using the slope of the B-spline at the corresponding endpoint. 'b_spline': Use the B-spline (as for interpolation). 'constant': Use the constant value of the B-spline at the corresponding endpoint. 'global_linear': Use a linear fit through the data (which will most probably introduce discontinuities at the ends of the data range).
b_spline:boundary_condition	int	2	min: 0 max: 2	Boundary condition at B-spline endpoints: 0 (value zero), 1 (first derivative zero) or 2 (second derivative zero)
lowess:span	float	0.666666666666667	min: 0.0 max: 1.0	Fraction of datapoints (f) to use for each local regression (determines the amount of smoothing). Choosing this parameter in the range .2 to .8 usually results in a good fit.
lowess:auto_span	string	false	true, false	If true, or if 'span' is 0, automatically select LOWESS span by cross-validation.
lowess:auto_span_min	float	0.15	min: 1.0e-03	Lower bound for auto-selected span.
lowess:auto_span_max	float	0.8	max: 0.99	Upper bound for auto-selected span.
lowess:auto_min_neighbors	int	5	min: 3	Minimum number of neighbors (span*n) enforced in auto mode.
lowess:auto_k_folds	int	5	min: 2	K-folds for CV when n>50 (else LOO is used).
lowess:auto_metric	string	mae	p90, p95, p99, rmse, mae	Metric for CV selection: one of {'p90','p95','p99','rmse','mae'}.
lowess:auto_span_grid	string			Optional explicit grid of span candidates in (0,1]. Comma-separated list, e.g. '0.2,0.3,0.5'. If empty, a default grid is used.
lowess:num_iterations	int	3	min: 0	Number of robustifying iterations for lowess fitting.
lowess:delta	float	-1.0		Nonnegative parameter which may be used to save computations (recommended value is 0.01 of the range of the input, e.g. for data ranging from 1000 seconds to 2000 seconds, it could be set to 10). Setting a negative value will automatically do this.
lowess:interpolation_type	string	cspline	linear, cspline, akima	Method to use for interpolation between datapoints computed by lowess. 'linear': Linear interpolation. 'cspline': Use the cubic spline for interpolation. 'akima': Use an akima spline for interpolation
lowess:extrapolation_type	string	four-point-linear	two-point-linear, four-point-linear, global-linear	Method to use for extrapolation outside the data range. 'two-point-linear': Uses a line through the first and last point to extrapolate. 'four-point-linear': Uses a line through the first and second point to extrapolate in front and a line through the last and second-to-last point at the end. 'global-linear': Uses a linear regression to fit a line through all data points and uses it for extrapolation.
interpolated:interpolation_type	string	cspline	linear, cspline, akima	Type of interpolation to apply.
interpolated:extrapolation_type	string	two-point-linear	two-point-linear, four-point-linear, global-linear	Type of extrapolation to apply: two-point-linear: use the first and last data point to build a single linear model, four-point-linear: build two linear models on both ends using the first two / last two points, global-linear: use all points to build a single linear model. Note that global-linear may not be continuous at the border.
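
Two of the lowess settings in the table involve simple arithmetic that is easy to check by hand. The sketch below works through the description of 'lowess:delta' (a negative value selects 0.01 of the input range) and the relation between 'lowess:span' and the number of neighbors (span*n). The helper names are hypothetical and the code only mirrors the parameter descriptions; it does not reproduce the internals of the tool.

```python
def effective_delta(delta, x_values):
    """Return delta, or 1% of the input range if delta is negative."""
    if delta < 0:
        return 0.01 * (max(x_values) - min(x_values))
    return delta


def smallest_span_for_neighbors(min_neighbors, n_points):
    """Smallest span for which span * n_points reaches min_neighbors (capped at 1)."""
    if n_points <= 0:
        raise ValueError("n_points must be positive")
    return min(1.0, min_neighbors / n_points)


# Example from the description: data ranging from 1000 s to 2000 s
delta = effective_delta(-1.0, [1000.0, 1500.0, 2000.0])
# With 20 data points and the default auto_min_neighbors of 5
min_span = smallest_span_for_neighbors(5, 20)
print(delta, min_span)
```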

Note: Currently mzIdentML (mzid) is not directly supported as an input/output format of this tool. Convert mzid files to/from idXML/idparquet using IDFileConverter if necessary.

The command line parameters of this tool are:

MapAlignerIdentification -- Corrects retention time distortions between maps based on common peptide
identifications.
Full documentation: http://www.openms.de/doxygen/nightly/html/TOPP_MapAlignerIdentification.html
Version: 3.6.0-pre-nightly-2026-06-11 Jun 11 2026, 09:57:53, Revision: fc66052
To cite OpenMS:
 + Pfeuffer, J., Bielow, C., Wein, S. et al. OpenMS 3 enables reproducible analysis of large-scale mass
   spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.

Usage:
  MapAlignerIdentification <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed
description or use the --helphelp option

Options (mandatory options marked with '*'):
  -in <files>*                Input files to align (all must have the same file type) (valid formats:
                              'featureXML', 'featureparquet', 'consensusXML', 'consensusparquet', 'idXML',
                              'idparquet', 'oms')
  -out <files>                Output files (same file type as 'in'). This option or 'trafo_out' has to be
                              provided; they can be used together. (valid formats: 'featureXML',
                              'featureparquet', 'consensusXML', 'consensusparquet', 'idXML', 'idparquet', 'oms')
  -trafo_out <files>          Transformation output files. This option or 'out' has to be provided; they can 
                              be used together. (valid formats: 'trafoXML')
  -in_spectra_files <files>   Optional input spectra files (mzML) that will be transformed along with the 
                              alignment. Size must match the number of input files. (valid formats: 'mzML')
  -out_spectra_files <files>  Optional output spectra files (mzML) corresponding to transformed
                              in_spectra_files. Size must match in_spectra_files. (valid formats: 'mzML')

Options to define a reference file (use either 'file' or 'index', not both):
  -reference:file <file>      File to use as reference (valid formats: 'featureXML', 'featureparquet',
                              'consensusXML', 'consensusparquet', 'idXML', 'idparquet', 'oms')
  -reference:index <number>   Use one of the input files as reference ('1' for the first file, etc.).
                              If '0', no explicit reference is set - the algorithm will select a reference. 
                              (default: '0') (min: '0')

  -design <file>              Input file containing the experimental design (valid formats: 'tsv')
  -store_original_rt          Store the original retention times (before transformation) as meta data in the 
                              output?
                              
Common TOPP options:
  -ini <file>                 Use the given TOPP INI file
  -threads <n>                Sets the number of threads allowed to be used by the TOPP tool (0 = all
                              available cores) (default: '1')
  -write_ini <file>           Writes the default configuration file
  --help                      Shows options
  --helphelp                  Shows all options (including advanced)

The following configuration subsections are valid:
 - algorithm   Algorithm parameters section
 - model       Options to control the modeling of retention time transformations from data

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
For more information, please consult the online documentation for this tool:
  - http://www.openms.de/doxygen/nightly/html/TOPP_MapAlignerIdentification.html
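
The constraints stated in the help text above (either 'out' or 'trafo_out' must be given, and 'reference:file' and 'reference:index' are mutually exclusive) can be checked before a call is assembled. The following sketch builds an argument list from the documented options only. The helper name and all file names are hypothetical, and the command is not executed here.

```python
def build_command(inputs, out=None, trafo_out=None,
                  reference_file=None, reference_index=0):
    """Assemble an argument list for MapAlignerIdentification (not executed)."""
    if not inputs:
        raise ValueError("at least one input file is required")
    if not out and not trafo_out:
        raise ValueError("provide 'out', 'trafo_out', or both")
    if reference_index < 0:
        raise ValueError("reference:index must be >= 0")
    if reference_file and reference_index != 0:
        raise ValueError("use either reference:file or reference:index, not both")

    cmd = ["MapAlignerIdentification", "-in", *inputs]
    if out:
        cmd += ["-out", *out]
    if trafo_out:
        cmd += ["-trafo_out", *trafo_out]
    if reference_file:
        cmd += ["-reference:file", reference_file]
    elif reference_index:
        cmd += ["-reference:index", str(reference_index)]
    return cmd


# Hypothetical file names, used only for illustration
cmd = build_command(
    ["run1.featureXML", "run2.featureXML"],
    out=["run1_aligned.featureXML", "run2_aligned.featureXML"],
    trafo_out=["run1.trafoXML", "run2.trafoXML"],
    reference_index=1,
)
print(" ".join(cmd))
```

On a system where OpenMS is installed, a list like this could be passed to `subprocess.run(cmd, check=True)`.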

INI file documentation of this tool:

Legend:

required parameter

advanced parameter

This section lists all parameters supported by the tool. Parameters are organized into hierarchical subsections that group related settings together. Subsections may contain further subsections or individual parameters.

Each parameter entry contains the following information:

Name The identifier used in configuration files and on the command line.
Default value The value used if the parameter is not explicitly specified.
Description A short explanation describing the purpose and behavior of the parameter.
Tags Additional metadata associated with the parameter.
Restrictions Allowed value ranges for numeric parameters or valid options for string parameters.

Parameter tags provide additional information about how a parameter is used. Some tags indicate whether a parameter is required or intended for advanced configuration, while others may be used internally by OpenMS or workflow tools.

Parameters highlighted as required must be specified for the tool to run successfully. Parameters marked as advanced allow fine-tuning of algorithm behavior and are typically not needed for standard workflows.

+MapAlignerIdentification	Corrects retention time distortions between maps based on common peptide identifications.

version	3.6.0-pre-nightly-2026-06-11	Version of the tool that generated this parameters file.

++1	Instance '1' section for 'MapAlignerIdentification'

Name	Default value	Description	Tags	Restrictions
in	[]	Input files to align (all must have the same file type)	input file	*.featureXML, *.featureparquet, *.consensusXML, *.consensusparquet, *.idXML, *.idparquet, *.oms
out	[]	Output files (same file type as 'in'). This option or 'trafo_out' has to be provided; they can be used together.	output file	*.featureXML, *.featureparquet, *.consensusXML, *.consensusparquet, *.idXML, *.idparquet, *.oms
trafo_out	[]	Transformation output files. This option or 'out' has to be provided; they can be used together.	output file	*.trafoXML
in_spectra_files	[]	Optional input spectra files (mzML) that will be transformed along with the alignment. Size must match the number of input files.	input file	*.mzML
out_spectra_files	[]	Optional output spectra files (mzML) corresponding to transformed in_spectra_files. Size must match in_spectra_files.	output file	*.mzML
design		Input file containing the experimental design	input file	*.tsv
store_original_rt	false	Store the original retention times (before transformation) as meta data in the output?		true, false
log		Name of log file (created only when specified)		
debug	0	Sets the debug level		
threads	1	Sets the number of threads allowed to be used by the TOPP tool (0 = all available cores)		
no_progress	false	Disables progress logging to command line		true, false
force	false	Overrides tool-specific checks		true, false
test	false	Enables the test mode (needed for internal use only)		true, false

+++reference	Options to define a reference file (use either 'file' or 'index', not both)

Name	Default value	Description	Tags	Restrictions
file		File to use as reference	input file	*.featureXML, *.featureparquet, *.consensusXML, *.consensusparquet, *.idXML, *.idparquet, *.oms
index	0	Use one of the input files as reference ('1' for the first file, etc.). If '0', no explicit reference is set - the algorithm will select a reference.		0:∞

+++algorithm	Algorithm parameters section

Name	Default value	Description	Tags	Restrictions
score_type		Name of the score type to use for ranking and filtering (.oms input only). If left empty, a score type is picked automatically.		
score_cutoff	false	Use only IDs above a score cut-off (parameter 'min_score') for alignment?		true, false
min_score	0.05	If 'score_cutoff' is 'true': Minimum score for an ID to be considered. Unless you have very few runs or identifications, increase this value to focus on more informative peptides.		
min_run_occur	2	Minimum number of runs (incl. reference, if any) in which a peptide must occur to be used for the alignment. Unless you have very few runs or identifications, increase this value to focus on more informative peptides.		2:∞
max_rt_shift	0.5	Maximum realistic RT difference for a peptide (median per run vs. reference). Peptides with higher shifts (outliers) are not used to compute the alignment. If 0, no limit (disable filter); if > 1, the final value in seconds; if <= 1, taken as a fraction of the range of the reference RT scale.		0.0:∞
use_unassigned_peptides	true	Should unassigned peptide identifications be used when computing an alignment of feature or consensus maps? If 'false', only peptide IDs assigned to features will be used.		true, false
use_feature_rt	false	When aligning feature or consensus maps, don't use the retention time of a peptide identification directly; instead, use the retention time of the centroid of the feature (apex of the elution profile) that the peptide was matched to. If different identifications are matched to one feature, only the peptide closest to the centroid in RT is used. Precludes 'use_unassigned_peptides'.		true, false
use_adducts	true	If IDs contain adducts, treat differently adducted variants of the same molecule as different.		true, false
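
The descriptions of 'max_rt_shift' and 'min_run_occur' in the algorithm section translate into two small rules, sketched below. The function names and the example data are hypothetical; the code follows the wording of the parameter descriptions and is not taken from the OpenMS sources.

```python
def rt_shift_limit(max_rt_shift, reference_rts):
    """Return the allowed RT shift in seconds, or None if the filter is disabled."""
    if max_rt_shift < 0:
        raise ValueError("max_rt_shift must be >= 0")
    if max_rt_shift == 0:
        return None                      # 0 disables the filter
    if max_rt_shift > 1:
        return max_rt_shift              # values above 1 are taken as seconds
    # values up to 1 are a fraction of the reference RT range
    return max_rt_shift * (max(reference_rts) - min(reference_rts))


def peptides_in_enough_runs(runs, min_run_occur=2):
    """Return the sequences that occur in at least min_run_occur runs."""
    counts = {}
    for run in runs:
        for seq in set(run):             # count each sequence once per run
            counts[seq] = counts.get(seq, 0) + 1
    return {seq for seq, c in counts.items() if c >= min_run_occur}


# Default max_rt_shift of 0.5 with a hypothetical reference spanning 0 to 3600 s
limit = rt_shift_limit(0.5, [0.0, 1200.0, 3600.0])
kept = peptides_in_enough_runs(
    [["PEPTIDEK", "ELVISK"], ["PEPTIDEK"], ["PEPTIDEK", "ELVISK", "RAREK"]],
    min_run_occur=3,
)
print(limit, kept)
```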

+++model	Options to control the modeling of retention time transformations from data

Name	Default value	Description	Tags	Restrictions
type	b_spline	Type of model		linear, b_spline, lowess, interpolated

++++linear	Parameters for 'linear' model

Name	Default value	Description	Tags	Restrictions
symmetric_regression	false	Perform linear regression on 'y - x' vs. 'y + x', instead of on 'y' vs. 'x'.		true, false
x_weight	x	Weight x values		1/x, 1/x2, ln(x), x
y_weight	y	Weight y values		1/y, 1/y2, ln(y), y
x_datum_min	1.0e-15	Minimum x value		
x_datum_max	1.0e15	Maximum x value		
y_datum_min	1.0e-15	Minimum y value		
y_datum_max	1.0e15	Maximum y value		

++++b_spline	Parameters for 'b_spline' model

Name	Default value	Description	Tags	Restrictions
wavelength	0.0	Determines the amount of smoothing by setting the number of nodes for the B-spline. The number is chosen so that the spline approximates a low-pass filter with this cutoff wavelength. The wavelength is given in the same units as the data; a higher value means more smoothing. '0' sets the number of nodes to twice the number of input points.		0.0:∞
num_nodes	5	Number of nodes for B-spline fitting. Overrides 'wavelength' if set (to two or greater). A lower value means more smoothing.		0:∞
extrapolate	linear	Method to use for extrapolation beyond the original data range. 'linear': Linear extrapolation using the slope of the B-spline at the corresponding endpoint. 'b_spline': Use the B-spline (as for interpolation). 'constant': Use the constant value of the B-spline at the corresponding endpoint. 'global_linear': Use a linear fit through the data (which will most probably introduce discontinuities at the ends of the data range).		linear, b_spline, constant, global_linear
boundary_condition	2	Boundary condition at B-spline endpoints: 0 (value zero), 1 (first derivative zero) or 2 (second derivative zero)		0:2

++++lowess	Parameters for 'lowess' model

Name	Default value	Description	Tags	Restrictions
span	0.666666666666667	Fraction of datapoints (f) to use for each local regression (determines the amount of smoothing). Choosing this parameter in the range .2 to .8 usually results in a good fit.		0.0:1.0
auto_span	false	If true, or if 'span' is 0, automatically select LOWESS span by cross-validation.		true, false
auto_span_min	0.15	Lower bound for auto-selected span.		1.0e-03:∞
auto_span_max	0.8	Upper bound for auto-selected span.		-∞:0.99
auto_min_neighbors	5	Minimum number of neighbors (span*n) enforced in auto mode.		3:∞
auto_k_folds	5	K-folds for CV when n>50 (else LOO is used).		2:∞
auto_metric	mae	Metric for CV selection: one of {'p90','p95','p99','rmse','mae'}.		p90, p95, p99, rmse, mae
auto_span_grid		Optional explicit grid of span candidates in (0,1]. Comma-separated list, e.g. '0.2,0.3,0.5'. If empty, a default grid is used.		
num_iterations	3	Number of robustifying iterations for lowess fitting.		0:∞
delta	-1.0	Nonnegative parameter which may be used to save computations (recommended value is 0.01 of the range of the input, e.g. for data ranging from 1000 seconds to 2000 seconds, it could be set to 10). Setting a negative value will automatically do this.		
interpolation_type	cspline	Method to use for interpolation between datapoints computed by lowess. 'linear': Linear interpolation. 'cspline': Use the cubic spline for interpolation. 'akima': Use an akima spline for interpolation.		linear, cspline, akima
extrapolation_type	four-point-linear	Method to use for extrapolation outside the data range. 'two-point-linear': Uses a line through the first and last point to extrapolate. 'four-point-linear': Uses a line through the first and second point to extrapolate in front and a line through the last and second-to-last point at the end. 'global-linear': Uses a linear regression to fit a line through all data points and uses it for extrapolation.		two-point-linear, four-point-linear, global-linear

++++interpolated	Parameters for 'interpolated' model

Name	Default value	Description	Tags	Restrictions
interpolation_type	cspline	Type of interpolation to apply.		linear, cspline, akima
extrapolation_type	two-point-linear	Type of extrapolation to apply: two-point-linear: use the first and last data point to build a single linear model, four-point-linear: build two linear models on both ends using the first two / last two points, global-linear: use all points to build a single linear model. Note that global-linear may not be continuous at the border.		two-point-linear, four-point-linear, global-linear
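
The three extrapolation types described for the 'interpolated' and 'lowess' models differ only in which data points define the line used outside the data range. The sketch below implements them as described in the parameter text. The function names and the example points are hypothetical, and this is an illustration rather than the code used by the tool.

```python
def line_through(p, q):
    """Return (slope, intercept) of the line through two points."""
    (x0, y0), (x1, y1) = p, q
    slope = (y1 - y0) / (x1 - x0)
    return slope, y0 - slope * x0


def least_squares_line(points):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate(points, x, kind="two-point-linear"):
    """Extrapolate to x outside the data range (distinct x values assumed)."""
    pts = sorted(points)
    if len(pts) < 2:
        raise ValueError("need at least two points")
    if pts[0][0] <= x <= pts[-1][0]:
        raise ValueError("x lies inside the data range")
    if kind == "two-point-linear":
        slope, intercept = line_through(pts[0], pts[-1])
    elif kind == "four-point-linear":
        if x < pts[0][0]:
            slope, intercept = line_through(pts[0], pts[1])
        else:
            slope, intercept = line_through(pts[-2], pts[-1])
    elif kind == "global-linear":
        slope, intercept = least_squares_line(pts)
    else:
        raise ValueError("unknown extrapolation type: " + kind)
    return slope * x + intercept


# Hypothetical data points (x, y)
points = [(0.0, 0.0), (10.0, 20.0), (20.0, 30.0), (30.0, 60.0)]
for kind in ("two-point-linear", "four-point-linear", "global-linear"):
    print(kind, extrapolate(points, 40.0, kind))
```

With 'global-linear' the fitted line generally does not pass through the outermost data points, which is the reason for the discontinuity at the border mentioned in the parameter description.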