Group corresponding features across label-free experiments.

This tool produces results similar to those of FeatureLinkerUnlabeledQT (FLQT), since it optimizes a similar objective. However, this algorithm is more efficient than FLQT: it uses a kd-tree for fast 2D region queries in m/z - RT space and a sorted binary search tree to retrieve the best of the remaining clusters in O(1). Insertion and search take O(log n) in both the kd-tree and the binary search tree. Overall, the algorithm runs in O(n log(n)) time and O(n) space.
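To make the idea of a 2D region query concrete, here is a minimal kd-tree sketch over (RT, m/z) points. The names `Node`, `build` and `region_query`, as well as the example coordinates, are illustrative assumptions and do not mirror the OpenMS implementation; the simple sort-based construction is also not tuned for speed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (rt, mz)


@dataclass
class Node:
    point: Point
    axis: int  # 0 = split on RT, 1 = split on m/z
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Build a kd-tree by splitting at the median, alternating RT and m/z."""
    if not points:
        return None
    axis = depth % 2
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))


def region_query(node: Optional[Node], lo: Point, hi: Point,
                 out: List[Point]) -> List[Point]:
    """Collect all points p with lo <= p <= hi in both dimensions."""
    if node is None:
        return out
    p, a = node.point, node.axis
    if lo[0] <= p[0] <= hi[0] and lo[1] <= p[1] <= hi[1]:
        out.append(p)
    # Descend only into subtrees that can overlap the query box.
    if lo[a] <= p[a]:
        region_query(node.left, lo, hi, out)
    if p[a] <= hi[a]:
        region_query(node.right, lo, hi, out)
    return out


# Hypothetical features as (rt, mz) pairs.
features = [(100.0, 500.0), (105.0, 500.002), (300.0, 500.001), (102.0, 650.0)]
tree = build(features)
hits = sorted(region_query(tree, (90.0, 499.99), (110.0, 500.01), []))
```

In this example only the two features near RT 100 and m/z 500 fall inside the query box; the other two are pruned or rejected.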

In practice, the runtime of FeatureLinkerUnlabeledQT is often not significantly worse than that of FeatureLinkerUnlabeledKD if the datasets are relatively small and/or the value of the -nr_partitions parameter is chosen large enough. If, however, the datasets are very large, and especially if they are so dense that partitioning based on the specified m/z tolerance is no longer possible, then this algorithm becomes orders of magnitude faster than FLQT.

Notably, this algorithm can be used to align featureXML files containing unassembled mass traces (as produced by MassTraceExtractor), which is often impossible for reasonably large datasets using other aligners, as these datasets tend to be too dense and hence cannot be partitioned.

Prior to feature linking, this tool performs an (optional) retention time transformation on the features using LOWESS regression in order to minimize retention time differences between corresponding features across different maps. These transformed RTs are used only internally. In the results, original RTs will be reported.
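As a rough illustration of the LOWESS step, the sketch below fits a non-robust local linear regression with tricube weights to hypothetical anchor pairs (RT in one map versus a reference RT). The function name `lowess_fit`, the anchor values, and the omission of robustifying iterations and interpolation are simplifications assumed here; this is not the tool's actual implementation.

```python
from typing import List


def lowess_fit(xs: List[float], ys: List[float], span: float = 2 / 3) -> List[float]:
    """Return fitted y values at each x using local weighted linear regression.

    Requires at least two data points. 'span' is the fraction of points
    used for each local fit.
    """
    n = len(xs)
    k = min(n, max(2, int(round(span * n))))  # neighbours per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance to the k-th nearest point (x0 itself included).
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12
        # Tricube weights; points at or beyond the bandwidth get weight 0.
        w = [(1.0 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical anchors: this map elutes with a small linear drift.
rt_map = [float(i * 60) for i in range(10)]
rt_ref = [1.02 * rt + 5.0 for rt in rt_map]

rt_internal = lowess_fit(rt_map, rt_ref)  # used for linking only
rt_reported = rt_map                      # original RTs stay in the output
```

Keeping `rt_internal` separate from `rt_reported` mirrors the statement above that transformed RTs are used only internally.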

The linking behavior can be influenced by separately specifying how to use the available charge and adduct information. The options allow restricting linking to features with the same adduct/charge (or lack thereof, i.e. features with charge zero or no adduct annotation), additionally linking charged/adduct-annotated features with those that have no charge/adduct information, or linking all features irrespective of charge state/adduct information.
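The three linking modes can be pictured as a compatibility predicate. The sketch below uses the option values of the 'charge_merging' and 'adduct_merging' parameters listed further down; the function names and the use of `None` for a missing adduct annotation are assumptions made for illustration.

```python
from typing import Optional


def charges_compatible(c1: int, c2: int, mode: str = "With_charge_zero") -> bool:
    """Decide whether two features may be linked given their charges."""
    if mode == "Any":
        return True
    if mode == "Identical":
        return c1 == c2
    if mode == "With_charge_zero":
        # Charge zero means "unknown" and may pair with any charge.
        return c1 == c2 or c1 == 0 or c2 == 0
    raise ValueError(f"unknown charge_merging mode: {mode}")


def adducts_compatible(a1: Optional[str], a2: Optional[str],
                       mode: str = "Any") -> bool:
    """Decide whether two features may be linked given their adducts."""
    if mode == "Any":
        return True
    if mode == "Identical":
        return a1 == a2
    if mode == "With_unknown_adducts":
        # None stands for "no adduct annotation" in this sketch.
        return a1 == a2 or a1 is None or a2 is None
    raise ValueError(f"unknown adduct_merging mode: {mode}")
```

For example, `charges_compatible(2, 0)` is true under the default mode, while `charges_compatible(2, 3)` is true only with `mode="Any"`.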

Note that the more relaxed the allowed grouping criteria, the more memory the internally used connected components require. More stringent m/z or retention time tolerances might then be required.

The command line parameters of this tool are:

FeatureLinkerUnlabeledKD -- Groups corresponding features from multiple maps.
Full documentation: http://www.openms.de/doxygen/nightly/html/TOPP_FeatureLinkerUnlabeledKD.html
Version: 3.6.0-pre-nightly-2026-06-11 Jun 11 2026, 09:57:53, Revision: fc66052
To cite OpenMS:
 + Pfeuffer, J., Bielow, C., Wein, S. et al. OpenMS 3 enables reproducible analysis of large-scale mass
   spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.

Usage:
  FeatureLinkerUnlabeledKD <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed
description or use the --helphelp option

Options (mandatory options marked with '*'):
  -in <files>*        Input files separated by blanks (valid formats: 'featureXML', 'consensusXML',
                      'featureparquet', 'consensusparquet')
  -out <file>*        Output file (valid formats: 'consensusXML', 'consensusparquet')
  -design <file>      Input file containing the experimental design (valid formats: 'tsv')
                      
  -keep_subelements   For consensusXML/consensusparquet input only: If set, the sub-features of the inputs 
                      are transferred to the output.
                      
Common TOPP options:
  -ini <file>         Use the given TOPP INI file
  -threads <n>        Sets the number of threads allowed to be used by the TOPP tool (0 = all available
                      cores) (default: '1')
  -write_ini <file>   Writes the default configuration file
  --help              Shows options
  --helphelp          Shows all options (including advanced)

The following configuration subsections are valid:
 - algorithm   Algorithm parameters section

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
For more information, please consult the online documentation for this tool:
  - http://www.openms.de/doxygen/nightly/html/TOPP_FeatureLinkerUnlabeledKD.html
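A call to the tool can be assembled from a script as sketched below. The flags -in, -out and -threads come from the help text above; passing subsection parameters in the form `-algorithm:link:rt_tol` is assumed here and should be checked against the --helphelp output. The helper `build_command` and all file names are hypothetical.

```python
import shlex
from typing import Dict, List, Optional


def build_command(inputs: List[str], output: str, threads: int = 1,
                  algorithm: Optional[Dict[str, object]] = None) -> List[str]:
    """Assemble an argument list for FeatureLinkerUnlabeledKD."""
    cmd = ["FeatureLinkerUnlabeledKD", "-in", *inputs,
           "-out", output, "-threads", str(threads)]
    # Assumed syntax for parameters of the 'algorithm' subsection.
    for key, value in (algorithm or {}).items():
        cmd += [f"-algorithm:{key}", str(value)]
    return cmd


cmd = build_command(
    ["a.featureXML", "b.featureXML"],   # hypothetical input files
    "linked.consensusXML",              # hypothetical output file
    threads=4,
    algorithm={"link:rt_tol": 30.0},
)
print(shlex.join(cmd))
# To actually run the tool (requires an OpenMS installation on PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell quoting problems with file names that contain blanks.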

INI file documentation of this tool:

Legend:

required parameter

advanced parameter

This section lists all parameters supported by the tool. Parameters are organized into hierarchical subsections that group related settings together. Subsections may contain further subsections or individual parameters.

Each parameter entry contains the following information:

Name: The identifier used in configuration files and on the command line.
Default value: The value used if the parameter is not explicitly specified.
Description: A short explanation describing the purpose and behavior of the parameter.
Tags: Additional metadata associated with the parameter.
Restrictions: Allowed value ranges for numeric parameters or valid options for string parameters.

Parameter tags provide additional information about how a parameter is used. Some tags indicate whether a parameter is required or intended for advanced configuration, while others may be used internally by OpenMS or workflow tools.

Parameters highlighted as required must be specified for the tool to run successfully. Parameters marked as advanced allow fine-tuning of algorithm behavior and are typically not needed for standard workflows.

+ FeatureLinkerUnlabeledKD: Groups corresponding features from multiple maps.

version (default: 3.6.0-pre-nightly-2026-06-11): Version of the tool that generated this parameters file.

++ 1: Instance '1' section for 'FeatureLinkerUnlabeledKD'

in (default: []): Input files separated by blanks. Tags: input file, required. Restrictions: *.featureXML, *.consensusXML, *.featureparquet, *.consensusparquet

out (default: empty): Output file. Tags: output file, required. Restrictions: *.consensusXML, *.consensusparquet

design (default: empty): Input file containing the experimental design. Tags: input file. Restrictions: *.tsv

keep_subelements (default: false): For consensusXML/consensusparquet input only: If set, the sub-features of the inputs are transferred to the output. Restrictions: true, false

log (default: empty): Name of log file (created only when specified)

debug (default: 0): Sets the debug level

threads (default: 1): Sets the number of threads allowed to be used by the TOPP tool (0 = all available cores)

no_progress (default: false): Disables progress logging to command line. Restrictions: true, false

force (default: false): Overrides tool-specific checks. Restrictions: true, false

test (default: false): Enables the test mode (needed for internal use only). Restrictions: true, false

+++ algorithm: Algorithm parameters section

mz_unit (default: ppm): Unit of m/z tolerance. Restrictions: ppm, Da

nr_partitions (default: 100): Number of partitions in m/z space. Restrictions: 1:∞

++++ warp

enabled (default: true): Whether or not to internally warp feature RTs using LOWESS transformation before linking (reported RTs in results will always be the original RTs). Restrictions: true, false

rt_tol (default: 100.0): Width of RT tolerance window (sec). Restrictions: 0.0:∞

mz_tol (default: 5.0): m/z tolerance (in ppm or Da). Restrictions: 0.0:∞

max_pairwise_log_fc (default: 0.5): Maximum absolute log10 fold change between two compatible signals during compatibility graph construction. Two signals from different maps will not be connected by an edge in the compatibility graph if the absolute log fold change exceeds this limit (they might still end up in the same connected component, however). Note: this does not limit fold changes in the linking stage, only during RT alignment, where we try to find high-quality alignment anchor points. Setting this to a value < 0 disables the FC check.
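The fold-change check described for 'max_pairwise_log_fc' can be sketched as follows. The function name `fc_compatible` and the handling of non-positive intensities are assumptions made for this example.

```python
import math


def fc_compatible(int_a: float, int_b: float, max_log_fc: float = 0.5) -> bool:
    """Return True if two signals may be connected by a compatibility edge."""
    if max_log_fc < 0:
        return True  # a negative limit disables the check
    if int_a <= 0 or int_b <= 0:
        return False  # assumption: a log ratio is undefined here, so reject
    return abs(math.log10(int_a / int_b)) <= max_log_fc
```

With the default limit of 0.5, intensities may differ by a factor of up to about 3.16 (10^0.5), so 1000 versus 3000 passes and 1000 versus 4000 does not.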

min_rel_cc_size (default: 0.5): Only connected components containing compatible features from at least max(2, (min_rel_cc_size * number_of_input_maps)) input maps are considered for computing the warping function. Restrictions: 0.0:1.0

max_nr_conflicts (default: 0): Allow up to this many conflicts (features from the same map) per connected component to be used for alignment (-1 means allow any number of conflicts). Restrictions: -1:∞
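The two filters on connected components can be combined into one predicate, sketched below. Rounding the relative size up with `ceil` and counting a conflict as every additional feature from an already represented map are assumptions of this sketch; the function names are hypothetical.

```python
import math
from collections import Counter
from typing import List


def min_maps_required(n_maps: int, min_rel_cc_size: float = 0.5) -> int:
    """Smallest number of distinct maps a component must cover."""
    return max(2, math.ceil(min_rel_cc_size * n_maps))


def component_usable(map_indices: List[int], n_maps: int,
                     min_rel_cc_size: float = 0.5,
                     max_nr_conflicts: int = 0) -> bool:
    """map_indices holds, for each feature in the component, its map index."""
    counts = Counter(map_indices)
    # Assumption: each extra feature from the same map is one conflict.
    conflicts = sum(c - 1 for c in counts.values())
    if max_nr_conflicts >= 0 and conflicts > max_nr_conflicts:
        return False
    return len(counts) >= min_maps_required(n_maps, min_rel_cc_size)
```

With four input maps and the defaults, a component needs features from at least two maps and may not contain two features from the same map.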

++++ link

rt_tol (default: 30.0): Width of RT tolerance window (sec). Restrictions: 0.0:∞

mz_tol (default: 10.0): m/z tolerance (in ppm or Da). Restrictions: 0.0:∞

charge_merging (default: With_charge_zero): Whether to disallow charge mismatches (Identical), allow linking charge zero (i.e., unknown charge state) with every charge state (With_charge_zero), or disregard charges (Any). Restrictions: Identical, With_charge_zero, Any

adduct_merging (default: Any): Whether to only allow the same adduct for linking (Identical), also allow linking adduct-annotated features with adduct-free ones (With_unknown_adducts), or disregard adducts (Any). Restrictions: Identical, With_unknown_adducts, Any
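Because 'mz_tol' is interpreted according to 'mz_unit', a ppm tolerance translates into an absolute window that grows with m/z. The sketch below assumes the tolerance is the maximum allowed absolute deviation on either side of the reference m/z, which may differ from how the tool defines its window; `mz_window` is a hypothetical helper.

```python
from typing import Tuple


def mz_window(mz: float, tol: float, unit: str = "ppm") -> Tuple[float, float]:
    """Return (lower, upper) m/z bounds for a tolerance given in ppm or Da."""
    if unit == "ppm":
        delta = mz * tol * 1e-6
    elif unit == "Da":
        delta = tol
    else:
        raise ValueError(f"unknown mz_unit: {unit}")
    return (mz - delta, mz + delta)


lo, hi = mz_window(500.0, 10.0)          # 10 ppm at m/z 500 is 0.005 Da
lo_da, hi_da = mz_window(500.0, 0.01, "Da")
```

At m/z 1000 the same 10 ppm setting corresponds to 0.01 Da, twice the absolute width at m/z 500.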

++++ distance_RT: Distance component based on RT differences

exponent (default: 1.0): Normalized RT differences ([0-1], relative to 'max_difference') are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow). Restrictions: 0.0:∞

weight (default: 1.0): Final RT distances are weighted by this factor. Restrictions: 0.0:∞

++++ distance_MZ: Distance component based on m/z differences

exponent (default: 2.0): Normalized ([0-1], relative to 'max_difference') m/z differences are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow). Restrictions: 0.0:∞

weight (default: 1.0): Final m/z distances are weighted by this factor. Restrictions: 0.0:∞
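A single distance component, as described for the RT and m/z sections, can be sketched like this. Here 'max_difference' is taken to be the corresponding linking tolerance, and differences beyond it are clipped to 1; both are assumptions, as is the helper name `distance_component`. How the components are combined into one distance is not shown.

```python
def distance_component(diff: float, max_difference: float,
                       exponent: float = 1.0, weight: float = 1.0) -> float:
    """Normalize a difference to [0, 1], raise it to 'exponent', apply 'weight'."""
    normalized = min(abs(diff) / max_difference, 1.0)
    return weight * normalized ** exponent


# RT: 15 s apart with a 30 s tolerance, default exponent 1.
d_rt = distance_component(15.0, 30.0, exponent=1.0)
# m/z: half the tolerance apart, default exponent 2.
d_mz = distance_component(0.0025, 0.005, exponent=2.0)
```

With exponent 2 a deviation of half the tolerance contributes only 0.25, so small m/z deviations are penalized less than proportionally.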

++++ distance_intensity: Distance component based on differences in relative intensity (usually relative to highest peak in the whole data set)

exponent (default: 1.0): Differences in relative intensity ([0-1]) are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow). Restrictions: 0.0:∞

weight (default: 1.0): Final intensity distances are weighted by this factor. Restrictions: 0.0:∞

log_transform (default: enabled): Log-transform intensities? If disabled, d = |int_f2 - int_f1| / int_max. If enabled, d = |log(int_f2 + 1) - log(int_f1 + 1)| / log(int_max + 1). Restrictions: enabled, disabled
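The two intensity distance formulas given for 'log_transform' translate directly into code. The function name `intensity_distance` is hypothetical; the natural logarithm is used, although the base cancels out in the ratio.

```python
import math


def intensity_distance(int_f1: float, int_f2: float, int_max: float,
                       log_transform: bool = True) -> float:
    """Relative intensity difference in [0, 1] for intensities in [0, int_max]."""
    if log_transform:
        return (abs(math.log(int_f2 + 1) - math.log(int_f1 + 1))
                / math.log(int_max + 1))
    return abs(int_f2 - int_f1) / int_max


d_linear = intensity_distance(100.0, 300.0, 1000.0, log_transform=False)
d_log = intensity_distance(100.0, 300.0, 1000.0, log_transform=True)
```

For the example values the linear distance is 0.2, while the log-transformed distance is smaller because the logarithm compresses large intensity differences.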

++++ LOWESS: LOWESS parameters for internal RT transformations (only relevant if 'warp:enabled' is set to 'true')

span (default: 0.666666666666667): Fraction of datapoints (f) to use for each local regression (determines the amount of smoothing). Choosing this parameter in the range 0.2 to 0.8 usually results in a good fit. Restrictions: 0.0:1.0

auto_span (default: false): If true, or if 'span' is 0, automatically select LOWESS span by cross-validation. Restrictions: true, false

auto_span_min (default: 0.15): Lower bound for auto-selected span. Restrictions: 1.0e-03:∞

auto_span_max (default: 0.8): Upper bound for auto-selected span. Restrictions: -∞:0.99

auto_min_neighbors (default: 5): Minimum number of neighbors (span*n) enforced in auto mode. Restrictions: 3:∞

auto_k_folds (default: 5): Number of folds for cross-validation (CV) when n > 50 (otherwise leave-one-out is used). Restrictions: 2:∞

auto_metric (default: mae): Metric for CV selection: one of {'p90','p95','p99','rmse','mae'}. Restrictions: p90, p95, p99, rmse, mae

auto_span_grid (default: empty): Optional explicit grid of span candidates in (0,1]. Comma-separated list, e.g. '0.2,0.3,0.5'. If empty, a default grid is used.

num_iterations (default: 3): Number of robustifying iterations for lowess fitting. Restrictions: 0:∞

delta (default: -1.0): Nonnegative parameter which may be used to save computations (recommended value is 0.01 of the range of the input, e.g. for data ranging from 1000 seconds to 2000 seconds, it could be set to 10). Setting a negative value will automatically do this.
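The automatic choice of 'delta' described above amounts to one line of arithmetic. The helper name `resolve_delta` is an assumption; the 0.01 factor and the example range come from the parameter description.

```python
from typing import Sequence


def resolve_delta(rts: Sequence[float], delta: float = -1.0) -> float:
    """Return 'delta' unchanged if nonnegative, else 1% of the RT range."""
    if delta >= 0:
        return delta
    return 0.01 * (max(rts) - min(rts))


auto_delta = resolve_delta([1000.0, 1500.0, 2000.0])       # range is 1000 s
fixed_delta = resolve_delta([1000.0, 2000.0], delta=5.0)   # explicit value kept
```

For data ranging from 1000 to 2000 seconds this yields a delta of about 10 seconds, matching the example in the description.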

interpolation_type (default: cspline): Method to use for interpolation between datapoints computed by lowess. 'linear': Linear interpolation. 'cspline': Use the cubic spline for interpolation. 'akima': Use an akima spline for interpolation. Restrictions: linear, cspline, akima

extrapolation_type (default: four-point-linear): Method to use for extrapolation outside the data range. 'two-point-linear': Uses a line through the first and last point to extrapolate. 'four-point-linear': Uses a line through the first and second point to extrapolate in front and a line through the last and second-to-last point in the end. 'global-linear': Uses a linear regression to fit a line through all data points and uses it for extrapolation. Restrictions: two-point-linear, four-point-linear, global-linear
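The three extrapolation modes differ only in which points define the line. The sketch below follows the descriptions above; the function name `extrapolate` is hypothetical, and it assumes at least two data points sorted by ascending x and a query x outside the data range.

```python
from typing import Sequence


def extrapolate(xs: Sequence[float], ys: Sequence[float], x: float,
                kind: str = "four-point-linear") -> float:
    """Extrapolate y at x (outside [xs[0], xs[-1]]) using the chosen method."""

    def line(x1: float, y1: float, x2: float, y2: float) -> float:
        slope = (y2 - y1) / (x2 - x1)
        return y1 + slope * (x - x1)

    if kind == "two-point-linear":
        # One line through the first and last data point.
        return line(xs[0], ys[0], xs[-1], ys[-1])
    if kind == "four-point-linear":
        # Front: first two points; end: last two points.
        if x < xs[0]:
            return line(xs[0], ys[0], xs[1], ys[1])
        return line(xs[-2], ys[-2], xs[-1], ys[-1])
    if kind == "global-linear":
        # Ordinary least-squares line through all points.
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        sxx = sum((a - mx) ** 2 for a in xs)
        sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        return my + (sxy / sxx) * (x - mx)
    raise ValueError(f"unknown extrapolation_type: {kind}")


xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 4.0, 9.0]  # curved toy data to make the modes differ
```

On curved data such as this toy example, 'four-point-linear' follows the local slope at each end, whereas the other two modes use a single slope for both ends.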