OpenMS
Performs an internal mass recalibration on an MS experiment.
pot. predecessor tools | → InternalCalibration → | pot. successor tools |
---|---|---|
PeakPickerHiRes | | any tool operating on MS peak data (in mzML format) |
FeatureFinderCentroided | | |
Given reference masses (either as peptide identifications or as a list of fixed masses), an MS experiment can be recalibrated using a linear or quadratic regression fitted to the observed vs. the theoretical masses.
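As a rough illustration of the regression idea, the following Python sketch computes the ppm error of each calibrant, fits a straight line through ppm error versus observed m/z by ordinary least squares, and uses that line to correct an m/z value. Fitting the ppm error (rather than the raw mass difference), the function names, and the sample values are all assumptions made for illustration; the tool's actual model, including its weighted variants, may differ.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def fit_linear(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(xs)
    if n < 2:
        raise ValueError("a linear model needs at least two calibrants")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("calibrants must have distinct m/z values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def correct_mz(observed_mz, intercept, slope):
    """Remove the predicted ppm error from an observed m/z."""
    predicted_ppm = intercept + slope * observed_mz
    return observed_mz / (1.0 + predicted_ppm * 1e-6)


# Hypothetical calibrants: (observed m/z, theoretical m/z), each about +5 ppm off.
calibrants = [(400.0020, 400.0), (800.0040, 800.0), (1200.0060, 1200.0)]
observed = [obs for obs, _ in calibrants]
errors = [ppm_error(obs, theo) for obs, theo in calibrants]
intercept, slope = fit_linear(observed, errors)
corrected = correct_mz(800.0040, intercept, slope)
```

With a constant offset like this one, the fitted slope is essentially zero and the intercept carries the whole correction; an m/z-dependent drift would show up in the slope instead.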
Choose one of two optional input files: 1) peptide identifications (from featureXML or idXML) using 'id_in', or 2) lock masses using 'lock_in'.
The user can choose whether the calibration function is calculated for each spectrum separately or once for the whole map. If this is done scan-wise, a user-defined range of neighboring spectra is searched for lock masses/peptide IDs. These are used to build a model, which is applied to the spectrum at hand. The RT range ('RT_chunking') should be small enough to resolve time-dependent changes in decalibration, but wide enough to contain enough calibrant masses for a stable model. A linear model requires at least two calibrants, a quadratic model at least three. Usually, the RT range should provide about three times as many calibrants as required, i.e. 6 (=3x2) for linear and 9 (=3x3) for quadratic models. If the calibrant data is too sparse for a certain scan, the closest neighboring model will be used automatically. If no model can be calculated anywhere, the tool will fail.
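The window logic described above can be sketched as follows. This is a simplified illustration, not the tool's implementation: the function names and the data layout (plain lists of retention times) are hypothetical. Only the minimum calibrant counts, the one-sided window, and the use of -1 for a global model come from the text and the option descriptions.

```python
MIN_CALIBRANTS = {"linear": 2, "quadratic": 3}


def calibrants_in_window(calibrant_rts, scan_rt, rt_chunking):
    """Return calibrant RTs within +/- rt_chunking seconds of a scan.

    A negative rt_chunking selects all calibrants (global model).
    """
    if rt_chunking < 0:
        return list(calibrant_rts)
    return [rt for rt in calibrant_rts if abs(rt - scan_rt) <= rt_chunking]


def can_build_model(calibrant_rts, scan_rt, rt_chunking, model="linear"):
    """True if the window holds enough calibrants for the chosen model."""
    found = calibrants_in_window(calibrant_rts, scan_rt, rt_chunking)
    return len(found) >= MIN_CALIBRANTS[model]


def nearest_scan_with_model(scan_rts, calibrant_rts, scan_rt, rt_chunking,
                            model="linear"):
    """RT of the closest scan for which a model can be built, or None."""
    usable = [rt for rt in scan_rts
              if can_build_model(calibrant_rts, rt, rt_chunking, model)]
    if not usable:
        return None  # corresponds to the case where the tool would fail
    return min(usable, key=lambda rt: abs(rt - scan_rt))


# Hypothetical data: calibrants cluster early in the run.
calibrant_rts = [100.0, 120.0, 150.0, 180.0]
scan_rts = [110.0, 400.0, 900.0]
fallback = nearest_scan_with_model(scan_rts, calibrant_rts, 900.0, 50.0)
```

In this toy run, the scan at 900 s has no calibrants within 50 s, so it would borrow the model of the scan at 110 s, the only scan with enough calibrants nearby.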
Optional quality control output files allow you to judge the success of the calibration. It is strongly advised to inspect them. If PNG images are requested, 'R' (the statistical programming language) needs to be installed and available on the system path!
Outlier detection is supported using the RANSAC algorithm. However, usually it's better to provide high-confidence calibrants instead of relying on automatic removal of outliers.
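To make the idea of RANSAC-based outlier removal concrete, here is a minimal Python sketch for a linear model. It is an illustration under stated assumptions, not the tool's algorithm: points are taken to be (m/z, ppm error) pairs, candidate lines are drawn through two randomly sampled points, and a point counts as an inlier if its squared residual is below the threshold. The numeric arguments mirror the documented defaults of 'RANSAC:threshold' (10.0), 'RANSAC:pc_inliers' (30) and 'RANSAC:iter' (70); the function names and data are hypothetical.

```python
import random


def fit_line(p, q):
    """Line through two points, returned as (intercept, slope)."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return y1 - slope * x1, slope


def ransac_inliers(points, threshold, min_inlier_fraction, iterations, seed=0):
    """Return the largest inlier set found, or None if no model is accepted.

    points: list of (m/z, ppm error) pairs.
    threshold: maximum squared residual for a point to count as an inlier.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(iterations):
        p, q = rng.sample(points, 2)
        if p[0] == q[0]:
            continue  # vertical line, cannot be fitted
        intercept, slope = fit_line(p, q)
        inliers = [(x, y) for x, y in points
                   if (y - (intercept + slope * x)) ** 2 < threshold]
        if best is None or len(inliers) > len(best):
            best = inliers
    if best is None or len(best) < min_inlier_fraction * len(points):
        return None
    return best


# Hypothetical calibrants: five near +3 ppm, one gross outlier at +40 ppm.
points = [(400.0, 3.1), (500.0, 2.9), (600.0, 3.0),
          (700.0, 3.2), (800.0, 2.8), (900.0, 40.0)]
kept = ransac_inliers(points, threshold=10.0, min_inlier_fraction=0.30,
                      iterations=70)
```

The surviving points would then be passed to the regular model fit. With calibrants that are already high-confidence, this step has little left to remove, which is why filtering the input is usually the better strategy.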
Post-calibration statistics (median ppm error and median absolute deviation) are computed automatically. The calibration is deemed successful if the statistics are within certain bounds ('goodness:XXX').
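The two statistics can be reproduced on a residuals table with a few lines of Python. This sketch assumes that the absolute value of the median is compared against 'goodness:median' and that the MAD is the plain (unscaled) median of absolute deviations from the median; the defaults 4.0 and 2.0 are the documented ones, while the function name and the sample errors are hypothetical.

```python
from statistics import median


def calibration_ok(ppm_errors, max_median=4.0, max_mad=2.0):
    """Check post-calibration ppm errors against the two thresholds.

    Returns (ok, median_ppm, mad_ppm).
    """
    med = median(ppm_errors)
    mad = median([abs(e - med) for e in ppm_errors])
    ok = abs(med) < max_median and mad < max_mad
    return ok, med, mad


# Hypothetical post-calibration ppm errors of the calibrants.
ok, med, mad = calibration_ok([0.5, -0.3, 1.2, 0.1, -0.8])
```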
Detailed description for each calibration method:

1) [id_in] The peptide identifications should be derived from the very same mzML file using a wide precursor window (e.g. 25 ppm), which captures the possible decalibration. Subsequently, the IDs should be filtered for high confidence (e.g. low FDR, ideally FDR=0.0) and given as input to this tool. Remaining outliers can be removed by using RANSAC. The data might benefit from a precursor mass correction (e.g. using HighResPrecursorMassCorrector), before an MS/MS search is done. The list of calibrants is derived solely from the idXML/featureXML and only the resulting model is applied to the mzML.
2) [lock_in] Calibration can be performed using specific lock masses which occur in most spectra. The structure of the cal:lock_in CSV file is as follows: each line represents one lock mass in the format

    <m/z>, <ms-level>, <charge>

Lines starting with # are treated as comments and ignored. The ms-level is usually '1', but you can also use '2' if there are commonly occurring fragment ions.
Example:
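A hypothetical lock-mass file in the format described above, together with a minimal Python parser sketch. The m/z values and the parser are illustrative only and are not taken from the tool's documentation or source; the parser simply follows the stated rules (three comma-separated fields per line, '#' lines ignored).

```python
import csv
import io

# Hypothetical lock-mass file; the m/z values are illustrative only.
LOCK_CSV = """\
# m/z, ms-level, charge
445.12003, 1, 1
371.10124, 1, 1
"""


def parse_lock_masses(text):
    """Parse '<m/z>, <ms-level>, <charge>' lines, skipping '#' comments."""
    masses = []
    for row in csv.reader(io.StringIO(text)):
        if not row or row[0].lstrip().startswith("#"):
            continue
        if len(row) != 3:
            raise ValueError(f"expected 3 fields, got {len(row)}: {row}")
        mz, ms_level, charge = (field.strip() for field in row)
        masses.append((float(mz), int(ms_level), int(charge)))
    return masses


lock_masses = parse_lock_masses(LOCK_CSV)
```

Running such a check before calling the tool catches missing charge columns early, which matters when the isotope filters are enabled.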
Additional filters ('cal:lock_require_mono', 'cal:lock_require_iso') make it possible to exclude spurious false-positive calibrant peaks. These filters require knowledge of the charge state, thus the charge needs to be specified in the input CSV. Detailed information on which lock masses passed these filters is available when -debug is used (any level).
The calibration function will use all lock masses (i.e. from all ms-levels) within the defined RT range to calibrate a spectrum. Thus, care should be taken that spectra from the ms-levels specified here are recorded using the same mass analyzer (MA). This is no issue for a Q-Exactive (which only has one MA), but depends on the acquisition scheme for instruments with two or three MAs (e.g. for Orbitrap Velos, MS/MS spectra are commonly acquired in the ion trap and should not be used during calibration of MS1).
General remarks: The user can select which MS levels are subjected to calibration. Calibration must be done once for each mass analyzer. Usually, peptide IDs provide calibration points for MS1 precursors, i.e. they are suitable for MS1. They are applicable for MS2 only if the same mass analyzer was used (e.g. Q-Exactive). In other words, MS/MS spectra acquired using the ion trap analyzer of a Velos cannot be calibrated using peptide IDs. Precursor m/z values associated with higher-level MS spectra are corrected if their precursor spectra are subject to calibration, e.g. precursor information within MS2 spectra is calibrated if the target ms-level is set to 1. Lock masses ('cal:lock_in') can be specified freely for MS1 and/or MS2.
The command line parameters of this tool are:

    InternalCalibration -- Applies an internal mass recalibration.
    Full documentation: http://www.openms.de/doxygen/nightly/html/TOPP_InternalCalibration.html
    Version: 3.3.0-pre-nightly-2024-11-20 Nov 21 2024, 02:34:56, Revision: decb5c8
    To cite OpenMS:
     + Pfeuffer, J., Bielow, C., Wein, S. et al.. OpenMS 3 enables reproducible analysis of large-scale mass spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.

    Usage:
      InternalCalibration <options>

    Options (mandatory options marked with '*'):
      -in <file>*                                 Input peak file (valid formats: 'mzML')
      -out <file>*                                Output file (valid formats: 'mzML')
      -rscript_executable <file>                  Path to the Rscript executable (default: 'Rscript').
      -ppm_match_tolerance <delta m/z in [ppm]>   Finding calibrants in raw data uses this tolerance (for lock masses and ID's). (default: '25.0')

    Chose one of two optional input files ('id_in' or 'lock_in') to define the calibration masses/function:
      -cal:id_in <file>                           Identifications or features whose peptide ID's serve as calibration masses. (valid formats: 'idXML', 'featureXML')
      -cal:lock_in <file>                         Input file containing reference m/z values (text file with each line as: m/z ms-level charge) which occur in all scans. (valid formats: 'csv')
      -cal:lock_out <file>                        Optional output file containing peaks from 'in' which were matched to reference m/z values. Useful to see which peaks were used for calibration. (valid formats: 'mzML')
      -cal:lock_fail_out <file>                   Optional output file containing lock masses which were NOT found or accepted(!) in data from 'in'. Useful to see which peaks were used for calibration. (valid formats: 'mzML')
      -cal:lock_require_mono                      Require all lock masses to be monoisotopic, i.e. not the iso1, iso2 etc ('charge' column is used to determine the spacing). Peaks which are not mono-isotopic are not used.
      -cal:lock_require_iso                       Require all lock masses to have at least the +1 isotope. Peaks without isotope pattern are not used.
      -cal:model_type <model>                     Type of function to be fitted to the calibration points. (default: 'linear_weighted') (valid: 'linear', 'linear_weighted', 'quadratic', 'quadratic_weighted')

      -ms_level i j ...                           Target MS levels to apply the transformation onto. Does not affect calibrant collection. (default: '[1 2 3]')
      -RT_chunking <RT window in [sec]>           RT window (one-sided, i.e. left->center, or center->right) around an MS scan in which calibrants are collected to build a model. Set to -1 to use ALL calibrants for all scans, i.e. a global model. (default: '300.0')

    Robust outlier removal using RANSAC:
      -RANSAC:enabled                             Apply RANSAC to calibration points to remove outliers before fitting a model.
      -RANSAC:threshold <threshold>               Threshold for accepting inliers (instrument precision (not accuracy!) as ppm^2 distance) (default: '10.0')
      -RANSAC:pc_inliers <# inliers>              Minimum percentage (of available data) of inliers (<threshold away from model) to accept the model. (default: '30') (min: '1' max: '99')
      -RANSAC:iter <# iterations>                 Maximal # iterations. (default: '70')

    Thresholds for accepting calibration success:
      -goodness:median <threshold>                The median ppm error of calibrated masses must be smaller than this threshold. (default: '4.0')
      -goodness:MAD <threshold>                   The median absolute deviation of the ppm error of calibrated masses must be smaller than this threshold. (default: '2.0')

    Tables and plots to verify calibration performance:
      -quality_control:models <table>             Table of model parameters for each spectrum. (valid formats: 'csv')
      -quality_control:models_plot <image>        Plot image of model parameters for each spectrum. (valid formats: 'png')
      -quality_control:residuals <table>          Table of pre- and post calibration errors. (valid formats: 'csv')
      -quality_control:residuals_plot <image>     Plot image of pre- and post calibration errors. (valid formats: 'png')

    Common TOPP options:
      -ini <file>                                 Use the given TOPP INI file
      -threads <n>                                Sets the number of threads allowed to be used by the TOPP tool (default: '1')
      -write_ini <file>                           Writes the default configuration file
      --help                                      Shows options
      --helphelp                                  Shows all options (including advanced)

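For scripted use, a command line can be assembled from the options listed above. The Python helper below is a sketch: the function name, its defaults for 'ms_levels', and the file names are hypothetical, while every flag it emits appears in the help text. It enforces the rule that exactly one of 'cal:id_in' and 'cal:lock_in' is given.

```python
def build_command(in_mzml, out_mzml, lock_in=None, id_in=None,
                  model_type="linear_weighted", rt_chunking=300.0,
                  ms_levels=(1,), ransac=False):
    """Assemble an InternalCalibration argument list from documented options."""
    if (lock_in is None) == (id_in is None):
        raise ValueError("provide exactly one of lock_in or id_in")
    cmd = ["InternalCalibration", "-in", in_mzml, "-out", out_mzml]
    if lock_in is not None:
        cmd += ["-cal:lock_in", lock_in]
    else:
        cmd += ["-cal:id_in", id_in]
    cmd += ["-cal:model_type", model_type,
            "-RT_chunking", str(rt_chunking),
            "-ms_level", *[str(level) for level in ms_levels]]
    if ransac:
        cmd.append("-RANSAC:enabled")  # flag without a value
    return cmd


# Hypothetical file names; calibrate MS1 only, using lock masses.
cmd = build_command("run.mzML", "run_calibrated.mzML", lock_in="locks.csv")
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where the InternalCalibration executable is on the path.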
INI file documentation of this tool: