OpenMS
FeatureFinderMultiplex

Detects peptide pairs in LC-MS data and determines their relative abundance.

potential predecessor tools → FeatureFinderMultiplex → potential successor tools
FileConverter, FileFilter → FeatureFinderMultiplex → IDMapper

FeatureFinderMultiplex is a tool for the fully automated analysis of quantitative proteomics data. It detects pairs of isotopic envelopes with fixed m/z separation and requires no prior sequence identification of the peptides. The algorithm is outlined below.

Algorithm

The algorithm is divided into three parts: filtering, clustering and linear fitting, see Fig. (d), (e) and (f). In the following discussion, consider a particular mass spectrum at retention time 1350 s, see Fig. (a). It contains a peptide of mass 1492 Da and its 6 Da heavier labelled counterpart. Both are doubly charged in this instance. Their isotopic envelopes therefore appear at m/z 746 and 749 in the spectrum, and the isotopic peaks within each envelope are separated by 0.5. The spectrum was recorded at finite intervals. In order to read accurate intensities at arbitrary m/z we spline-fit over the data, see Fig. (b).

We would like to search for such peptide pairs in our LC-MS data set. As a warm-up, consider a standard intensity cut-off filter, see Fig. (c). Scanning through the entire m/z range (red dot), only data points with intensities above a certain threshold pass the filter. Unlike such a local filter, the filter used in our algorithm takes intensities at a range of m/z positions into account, see Fig. (d). A data point (red dot) passes if

  • all six intensities at m/z, m/z+0.5, m/z+1, m/z+3, m/z+3.5 and m/z+4 lie above a certain threshold,
  • the intensity profiles in neighbourhoods around all six m/z positions show a good correlation and
  • the relative intensity ratios within a peptide agree up to a factor with the ratios of a theoretical averagine model.

Let us now filter not only a single spectrum but all spectra in our data set. Data points that pass the filter form clusters in the t-m/z plane, see Fig. (e). Each cluster corresponds to the mono-isotopic mass trace of the lightest peptide of a SILAC pattern. We now use hierarchical clustering methods to assign each data point to a specific cluster. The optimum number of clusters is determined by maximizing the silhouette width of the partitioning.

Each data point in a cluster corresponds to three pairs of intensities (at [m/z, m/z+3], [m/z+0.5, m/z+3.5] and [m/z+1, m/z+4]). A plot of all intensity pairs in a cluster shows a clear linear correlation, see Fig. (f). Using linear regression we can determine the relative amounts of labelled and unlabelled peptides in the sample.
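
The filter positions and the final ratio estimate can be illustrated with a minimal sketch. It is not the OpenMS implementation: the function names are hypothetical, the isotope spacing is approximated as 1/charge (matching the 0.5 spacing of the example above), the intensities are made up, and the regression is a plain least-squares fit through the origin.

```python
def pattern_positions(mz, charge, mass_shift, isotopes=3):
    """m/z positions probed by the filter for a light/heavy peptide pair.

    Assumption: isotopic peaks are spaced 1/charge apart and the heavy
    partner is shifted by mass_shift/charge.
    """
    spacing = 1.0 / charge
    shift = mass_shift / charge
    light = [mz + i * spacing for i in range(isotopes)]
    heavy = [p + shift for p in light]
    return light, heavy


def ratio_through_origin(light, heavy):
    """Least-squares slope of heavy vs. light intensities (line through the origin)."""
    num = sum(l * h for l, h in zip(light, heavy))
    den = sum(l * l for l in light)
    return num / den


# The doubly charged pair from the example: 6 Da label, lightest peak at m/z 746.
light_mz, heavy_mz = pattern_positions(746.0, charge=2, mass_shift=6.0)
print(light_mz)  # [746.0, 746.5, 747.0]
print(heavy_mz)  # [749.0, 749.5, 750.0]

# Toy intensity pairs (made up) in which the heavy peptide is twice as abundant.
light_int = [1000.0, 800.0, 400.0]
heavy_int = [2000.0, 1600.0, 800.0]
print(ratio_through_origin(light_int, heavy_int))  # 2.0
```

The six positions returned for the example are exactly m/z, m/z+0.5, m/z+1, m/z+3, m/z+3.5 and m/z+4, and the slope of the fitted line plays the role of the heavy-to-light abundance ratio.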
    The command line parameters of this tool are:
    FeatureFinderMultiplex -- Determination of peak ratios in LC-MS data
    Full documentation: http://www.openms.de/doxygen/release/3.2.0/html/TOPP_FeatureFinderMultiplex.html
    Version: 3.2.0 Nov 18 2024, 16:14:00, Revision: 03223c3
    To cite OpenMS:
     + Pfeuffer, J., Bielow, C., Wein, S. et al. OpenMS 3 enables reproducible analysis of large-scale mass
       spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.
    
    Usage:
      FeatureFinderMultiplex <options>
    
    Options (mandatory options marked with '*'):
      -in <file>*                              LC-MS dataset in either centroid or profile mode (valid formats: 
                                               'mzML')
      -out <file>                              Output file containing the individual peptide features. (valid 
                                               formats: 'featureXML')
    
    algorithmic parameters:
      -algorithm:labels <text>                 Labels used for labelling the samples. If the sample is
                                               unlabelled (i.e. you want to detect only single peptide
                                               features) please leave this parameter empty. [...] specifies
                                               the labels for a single sample. For example

                                               [][Lys8,Arg10]        ... SILAC
                                               [][Lys4,Arg6][Lys8,Arg10]        ... triple-SILAC
                                               [Dimethyl0][Dimethyl6]        ... Dimethyl
                                               [Dimethyl0][Dimethyl4][Dimethyl8]        ... triple Dimethyl
                                               [ICPL0][ICPL4][ICPL6][ICPL10]        ... ICPL
                                               (default: '[][Lys8,Arg10]')
      -algorithm:charge <text>                 Range of charge states in the sample, i.e. min charge : max
                                               charge. (default: '1:4')
      -algorithm:rt_typical <value>            Typical retention time [s] over which a characteristic peptide
                                               elutes. (This is not an upper bound. Peptides that elute for
                                               longer will be reported.) (default: '40.0') (min: '0.0')
      -algorithm:rt_band <value>               The algorithm searches for characteristic isotopic peak patterns,
                                               spectrum by spectrum. For some low-intensity peptides, an
                                               important peak might be missing in one spectrum but be present
                                               in one of the neighbouring ones. The algorithm takes a bundle of
                                               neighbouring spectra with width rt_band into account. For
                                               example with rt_band = 0, all characteristic isotopic peaks have
                                               to be present in one and the same spectrum. As rt_band
                                               increases, the sensitivity of the algorithm but also the
                                               likelihood of false detections increases. (default: '0.0')
                                               (min: '0.0')
      -algorithm:rt_min <value>                Lower bound for the retention time [s]. (Any peptides seen for a
                                               shorter time period are not reported.) (default: '2.0')
                                               (min: '0.0')
      -algorithm:mz_tolerance <value>          M/z tolerance for search of peak patterns. (default: '6.0') (min: 
                                               '0.0')
      -algorithm:mz_unit <choice>              Unit of the 'mz_tolerance' parameter. (default: 'ppm') (valid: 
                                               'Da', 'ppm')
      -algorithm:intensity_cutoff <value>      Lower bound for the intensity of isotopic peaks. (default:
                                               '1000.0') (min: '0.0')
      -algorithm:peptide_similarity <value>    Two peptides in a multiplet are expected to have the same
                                               isotopic pattern. This parameter is a lower bound on their
                                               similarity. (default: '0.5') (min: '-1.0' max: '1.0')
      -algorithm:averagine_similarity <value>  The isotopic pattern of a peptide should resemble the averagine 
                                               model at this m/z position. This parameter is a lower bound on 
                                               similarity between measured isotopic pattern and the averagine 
                                               model. (default: '0.4') (min: '-1.0' max: '1.0')
      -algorithm:missed_cleavages <number>     Maximum number of missed cleavages due to incomplete digestion. 
                                               (Only relevant if enzymatic cutting site coincides with labelling 
                                               site. For example, Arg/Lys in the case of trypsin digestion and 
                                               SILAC labelling.) (default: '0') (min: '0')
    
                                               
    Common TOPP options:
      -ini <file>                              Use the given TOPP INI file
      -threads <n>                             Sets the number of threads allowed to be used by the TOPP tool 
                                               (default: '1')
      -write_ini <file>                        Writes the default configuration file
      --help                                   Shows options
      --helphelp                               Shows all options (including advanced)
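
A typical invocation can be assembled from the options listed above. The following is a minimal sketch in Python: the file names sample.mzML and sample.featureXML are hypothetical, and the helper build_command is not part of OpenMS. It only builds the argument list; actually running it assumes that FeatureFinderMultiplex is installed and on the PATH.

```python
import subprocess


def build_command(in_file, out_file, labels="[][Lys8,Arg10]", charge="1:4",
                  mz_tolerance=6.0, mz_unit="ppm", threads=1):
    """Assemble an argument list using only options shown in the help text above."""
    return [
        "FeatureFinderMultiplex",
        "-in", in_file,
        "-out", out_file,
        "-algorithm:labels", labels,
        "-algorithm:charge", charge,
        "-algorithm:mz_tolerance", str(mz_tolerance),
        "-algorithm:mz_unit", mz_unit,
        "-threads", str(threads),
    ]


# Hypothetical file names; the defaults mirror those printed in the help text.
cmd = build_command("sample.mzML", "sample.featureXML", threads=4)
print(" ".join(cmd))

# Uncomment to run; assumes the OpenMS TOPP tools are installed and on PATH.
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string keeps the square brackets of the labels value from being interpreted by a shell.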
    
    
    INI file documentation of this tool:
    +FeatureFinderMultiplex: Determination of peak ratios in LC-MS data
    version: Version of the tool that generated this parameters file. (default: '3.2.0')
    ++1: Instance '1' section for 'FeatureFinderMultiplex'
    in: LC-MS dataset in either centroid or profile mode. (input file) (valid formats: '*.mzML')
    out: Output file containing the individual peptide features. (output file) (valid formats: '*.featureXML')
    out_multiplets: Optional output file containing all detected peptide groups (i.e. peptide pairs or triplets or singlets or ..). The m/z-RT positions correspond to the lightest peptide in each group. (output file) (valid formats: '*.consensusXML')
    out_blacklist: Optional output file containing all peaks which have been associated with a peptide feature (and subsequently blacklisted). (output file) (valid formats: '*.mzML')
    log: Name of log file (created only when specified)
    debug: Sets the debug level. (default: '0')
    threads: Sets the number of threads allowed to be used by the TOPP tool. (default: '1')
    no_progress: Disables progress logging to command line. (default: 'false') (valid: 'true', 'false')
    force: Overrides tool-specific checks. (default: 'false') (valid: 'true', 'false')
    test: Enables the test mode (needed for internal use only). (default: 'false') (valid: 'true', 'false')
    +++algorithm: algorithmic parameters
    labels: Labels used for labelling the samples. If the sample is unlabelled (i.e. you want to detect only single peptide features) please leave this parameter empty. [...] specifies the labels for a single sample. For example

    [][Lys8,Arg10] ... SILAC
    [][Lys4,Arg6][Lys8,Arg10] ... triple-SILAC
    [Dimethyl0][Dimethyl6] ... Dimethyl
    [Dimethyl0][Dimethyl4][Dimethyl8] ... triple Dimethyl
    [ICPL0][ICPL4][ICPL6][ICPL10] ... ICPL
    (default: '[][Lys8,Arg10]')
    charge: Range of charge states in the sample, i.e. min charge : max charge. (default: '1:4')
    isotopes_per_peptide: Range of isotopes per peptide in the sample. For example 3:6, if isotopic peptide patterns in the sample consist of either three, four, five or six isotopic peaks. (default: '3:6')
    rt_typical: Typical retention time [s] over which a characteristic peptide elutes. (This is not an upper bound. Peptides that elute for longer will be reported.) (default: '40.0') (min: '0.0')
    rt_band: The algorithm searches for characteristic isotopic peak patterns, spectrum by spectrum. For some low-intensity peptides, an important peak might be missing in one spectrum but be present in one of the neighbouring ones. The algorithm takes a bundle of neighbouring spectra with width rt_band into account. For example with rt_band = 0, all characteristic isotopic peaks have to be present in one and the same spectrum. As rt_band increases, the sensitivity of the algorithm but also the likelihood of false detections increases. (default: '0.0') (min: '0.0')
    rt_min: Lower bound for the retention time [s]. (Any peptides seen for a shorter time period are not reported.) (default: '2.0') (min: '0.0')
    mz_tolerance: m/z tolerance for search of peak patterns. (default: '6.0') (min: '0.0')
    mz_unit: Unit of the 'mz_tolerance' parameter. (default: 'ppm') (valid: 'Da', 'ppm')
    intensity_cutoff: Lower bound for the intensity of isotopic peaks. (default: '1000.0') (min: '0.0')
    peptide_similarity: Two peptides in a multiplet are expected to have the same isotopic pattern. This parameter is a lower bound on their similarity. (default: '0.5') (min: '-1.0' max: '1.0')
    averagine_similarity: The isotopic pattern of a peptide should resemble the averagine model at this m/z position. This parameter is a lower bound on similarity between measured isotopic pattern and the averagine model. (default: '0.4') (min: '-1.0' max: '1.0')
    averagine_similarity_scaling: Let x denote this scaling factor, and p the averagine similarity parameter. For the detection of single peptides, the averagine parameter p is replaced by p' = p + x(1-p), i.e. x = 0 -> p' = p and x = 1 -> p' = 1. (For knock_out = true, peptide doublets and singlets are detected simultaneously. For singlets, the peptide similarity filter is irrelevant. In order to compensate for this 'missing filter', the averagine parameter p is replaced by the more restrictive p' when searching for singlets.) (default: '0.95') (min: '0.0' max: '1.0')
    missed_cleavages: Maximum number of missed cleavages due to incomplete digestion. (Only relevant if enzymatic cutting site coincides with labelling site. For example, Arg/Lys in the case of trypsin digestion and SILAC labelling.) (default: '0') (min: '0')
    spectrum_type: Type of MS1 spectra in input mzML file. 'automatic' determines the spectrum type directly from the input mzML file. (default: 'automatic') (valid: 'profile', 'centroid', 'automatic')
    averagine_type: The type of averagine to use, currently RNA, DNA or peptide. (default: 'peptide') (valid: 'peptide', 'RNA', 'DNA')
    knock_out: Is it likely that knock-outs are present? (Supported for doublex, triplex and quadruplex experiments only.) (default: 'false') (valid: 'true', 'false')
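
Two of the algorithm parameters listed above involve simple arithmetic that is easy to check by hand. The sketch below evaluates p' = p + x(1-p) for the default values and converts a ppm tolerance into an absolute m/z window. The ppm conversion (tolerance × m/z × 10^-6) is the usual definition of ppm and is an assumption here, not taken from the tool's source; both function names are hypothetical.

```python
def singlet_averagine_threshold(p, x):
    """p' = p + x * (1 - p), the stricter averagine bound used for singlets."""
    return p + x * (1.0 - p)


def tolerance_in_mz(mz, tolerance, unit="ppm"):
    """Absolute m/z tolerance; assumes the usual ppm definition (parts per million of m/z)."""
    if unit == "ppm":
        return mz * tolerance * 1e-6
    if unit == "Da":
        return tolerance
    raise ValueError(f"unknown unit: {unit}")


# Defaults: averagine_similarity = 0.4, averagine_similarity_scaling = 0.95.
print(round(singlet_averagine_threshold(0.4, 0.95), 4))  # 0.97

# Default mz_tolerance of 6 ppm, evaluated at m/z 746.
print(round(tolerance_in_mz(746.0, 6.0), 6))  # 0.004476
```

With the defaults, singlets therefore have to reach an averagine similarity of about 0.97 instead of 0.4, which shows how strongly the scaling factor tightens the filter.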
    +++labels: mass shifts for all possible labels
    Arg6: Label:13C(6) | C(-6) 13C(6) | unimod #188 (default: '6.0201290268') (min: '0.0')
    Arg10: Label:13C(6)15N(4) | C(-6) 13C(6) N(-4) 15N(4) | unimod #267 (default: '10.008268599999999') (min: '0.0')
    Lys4: Label:2H(4) | H(-4) 2H(4) | unimod #481 (default: '4.0251069836') (min: '0.0')
    Lys6: Label:13C(6) | C(-6) 13C(6) | unimod #188 (default: '6.0201290268') (min: '0.0')
    Lys8: Label:13C(6)15N(2) | C(-6) 13C(6) N(-2) 15N(2) | unimod #259 (default: '8.0141988132') (min: '0.0')
    Leu3: Label:2H(3) | H(-3) 2H(3) | unimod #262 (default: '3.01883') (min: '0.0')
    Dimethyl0: Dimethyl | H(4) C(2) | unimod #36 (default: '28.031300000000002') (min: '0.0')
    Dimethyl4: Dimethyl:2H(4) | 2H(4) C(2) | unimod #199 (default: '32.056407') (min: '0.0')
    Dimethyl6: Dimethyl:2H(4)13C(2) | 2H(4) 13C(2) | unimod #510 (default: '34.063116999999998') (min: '0.0')
    Dimethyl8: Dimethyl:2H(6)13C(2) | H(-2) 2H(6) 13C(2) | unimod #330 (default: '36.075670000000002') (min: '0.0')
    ICPL0: ICPL | H(3) C(6) N O | unimod #365 (default: '105.021463999999995') (min: '0.0')
    ICPL4: ICPL:2H(4) | H(-1) 2H(4) C(6) N O | unimod #687 (default: '109.046571') (min: '0.0')
    ICPL6: ICPL:13C(6) | H(3) 13C(6) N O | unimod #364 (default: '111.041593000000006') (min: '0.0')
    ICPL10: ICPL:13C(6)2H(4) | H(-1) 2H(4) 13C(6) N O | unimod #866 (default: '115.066699999999997') (min: '0.0')
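
To see how a labels string and the mass shifts listed above combine, the following sketch parses a label specification into one list of labels per sample and computes the m/z shift for a peptide carrying given labels. The parser and helper names are hypothetical and only illustrate the notation; the mass values are copied from the list above. How FeatureFinderMultiplex itself enumerates the possible shifts is not shown here.

```python
import re

# Mass shifts [Da] copied from the labels section above (subset).
MASS_SHIFTS = {
    "Arg6": 6.0201290268,
    "Arg10": 10.008268599999999,
    "Lys4": 4.0251069836,
    "Lys8": 8.0141988132,
}


def parse_labels(spec):
    """Split e.g. '[][Lys8,Arg10]' into one list of label names per sample."""
    groups = re.findall(r"\[([^\]]*)\]", spec)
    return [[name for name in g.split(",") if name] for g in groups]


def mz_shift(labels, charge):
    """m/z shift for a peptide carrying each of the given labels once."""
    return sum(MASS_SHIFTS[name] for name in labels) / charge


samples = parse_labels("[][Lys8,Arg10]")
print(samples)  # [[], ['Lys8', 'Arg10']]

# A doubly charged peptide with a single labelled lysine.
print(round(mz_shift(["Lys8"], charge=2), 4))  # 4.0071
```

The empty first bracket stands for the unlabelled sample, so a SILAC doublet consists of the light peptide and a partner shifted by the summed label masses divided by the charge.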