OpenMS
Computes confidence scores for OpenSwath results.
potential predecessor tools | → OpenSwathConfidenceScoring → | potential successor tools |
---|---|---|
OpenSwathAnalyzer | | OpenSwathFeatureXMLToTSV |
This is an implementation of the SRM scoring algorithm described in:
Malmstroem, L.; Malmstroem, J.; Selevsek, N.; Rosenberger, G. & Aebersold, R.:
Automated workflow for large-scale selected reaction monitoring experiments.
J. Proteome Res., 2012, 11, 1644-1653
It has been adapted for the scoring of OpenSwath results.
The algorithm compares SRM/MRM features (peak groups) to assays and computes scores for the agreements. Every feature is compared not only to the "true" assay that was used to acquire the corresponding ion chromatograms, but also to a number (parameter decoys) of unrelated - but real - assays selected at random from the assay library (parameter lib). This serves to establish a background distribution of scores, against which the significance of the "true" score can be evaluated. The final confidence value of a feature is the local false discovery rate (FDR), calculated as the fraction of decoy assays that score higher than the "true" assay against the feature. In the output feature map, every feature is annotated with its local FDR in the meta value "local_FDR" (a "userParam" element in the featureXML), and its overall quality is set to "1 - local_FDR".
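The local FDR definition above can be illustrated with a minimal Python sketch. This is not the tool's implementation: the function name `local_fdr` and the example scores are made up for illustration, and treating ties as "not higher" is an assumption.

```python
def local_fdr(true_score, decoy_scores):
    """Fraction of decoy assays that score strictly higher than the true assay.

    Illustrative helper (hypothetical name); ties are counted as "not higher",
    which is an assumption rather than documented tool behaviour.
    """
    if not decoy_scores:
        raise ValueError("at least one decoy score is required")
    higher = sum(1 for score in decoy_scores if score > true_score)
    return higher / len(decoy_scores)


# Made-up scores of one feature against its true assay and four decoy assays.
true_score = 0.90
decoy_scores = [0.10, 0.95, 0.30, 0.05]

fdr = local_fdr(true_score, decoy_scores)  # one of four decoys scores higher
quality = 1.0 - fdr                        # mirrors "1 - local_FDR"
print(fdr, quality)
```

With more decoys, the background distribution is sampled more densely and the local FDR can take finer-grained values.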
The agreement of a feature and an assay is assessed based on the difference in retention time (RT) and on the deviation of relative transition intensities. The score S is computed using a binomial generalized linear model (GLM) of the form:
\[ S = \frac{1}{1 + \exp(-(a + b \cdot \Delta_{RT}^2 + c \cdot d_{int}))} \]
The meanings of the model terms are as follows:
\( \Delta_{RT} \): Observed retention times are first mapped to the scale of the assays (parameter trafo), then all RTs are scaled to the range 0 to 100 (based on the lowest/highest RT in the assay library). \( \Delta_{RT} \) is the absolute difference of the scaled RTs; note that this is squared in the scoring model.
\( d_{int} \): To compute the intensity distance, the n (advanced parameter transitions) most intense transitions of the feature are selected. For comparing against the "true" assay, the same transitions are considered; for a decoy assay, the same number of its most intense transitions are used instead. Transition intensities are scaled to a total of 1 per feature/assay and are ordered by the product (Q3) m/z value. Then the Manhattan distance of the intensity vectors is calculated (Malmstroem et al. used the RMSD instead, which has been replaced here to be independent of the number of transitions).
\( a, b, c \): Model coefficients, stored in the advanced parameters GLM:intercept, GLM:delta_rt, and GLM:dist_int. The default values were estimated based on the training dataset used in the Malmstroem et al. study, reprocessed with the OpenSwath pipeline.
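The model terms can be put together in a short Python sketch. It is a simplified illustration under stated assumptions, not the tool's code: all function names are hypothetical, the RT range and intensities are made up, the intensity vectors are assumed to be already selected and ordered by product m/z, and the coefficients are placeholders rather than the tool's defaults.

```python
import math


def scale_rt(rt, rt_min, rt_max):
    """Scale an RT (already mapped to the assay scale) to the range 0 to 100."""
    return 100.0 * (rt - rt_min) / (rt_max - rt_min)


def intensity_distance(feature_int, assay_int):
    """Manhattan distance of two intensity vectors, each scaled to a total of 1.

    Assumes both vectors hold the same number of transitions, ordered by
    product (Q3) m/z.
    """
    f_total, a_total = sum(feature_int), sum(assay_int)
    return sum(abs(f / f_total - a / a_total)
               for f, a in zip(feature_int, assay_int))


def glm_score(delta_rt, d_int, a, b, c):
    """S = 1 / (1 + exp(-(a + b * delta_rt^2 + c * d_int)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * delta_rt ** 2 + c * d_int)))


# Made-up library RT range and RTs (arbitrary units).
rt_min, rt_max = 0.0, 200.0
delta_rt = abs(scale_rt(52.0, rt_min, rt_max) - scale_rt(50.0, rt_min, rt_max))

# Made-up transition intensities for a feature and an assay.
d_int = intensity_distance([2.0, 1.0, 1.0], [1.0, 1.0, 2.0])

# Placeholder coefficients, NOT the defaults of GLM:intercept etc.
score = glm_score(delta_rt, d_int, a=2.0, b=-0.5, c=-3.0)
print(delta_rt, d_int, score)
```

With negative values for b and c, as in these placeholders, the score drops as the RT difference or the intensity distance grows.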
In addition to the local FDRs, the scores of features against their "true" assays are recorded in the output, in the meta value "GLM_score" of the respective feature.
The command line parameters of this tool are:
OpenSwathConfidenceScoring -- Compute confidence scores for OpenSwath results
Full documentation: http://www.openms.de/doxygen/nightly/html/TOPP_OpenSwathConfidenceScoring.html
Version: 3.3.0-pre-nightly-2024-11-20 Nov 21 2024, 02:34:56, Revision: decb5c8
To cite OpenMS:
 + Pfeuffer, J., Bielow, C., Wein, S. et al.. OpenMS 3 enables reproducible analysis of large-scale mass spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.

Usage:
  OpenSwathConfidenceScoring <options>

Options (mandatory options marked with '*'):
  -in <file>*            Input file (OpenSwath results) (valid formats: 'featureXML')
  -lib <file>*           Assay library (valid formats: 'traML')
  -out <file>*           Output file (results with confidence scores) (valid formats: 'featureXML')
  -trafo <file>          Retention time transformation (valid formats: 'trafoXML')
  -decoys <number>       Number of decoy assays to select from the library for every true assay (0 for "all") (default: '1000') (min: '0')
  -transitions <number>  Number of transitions per feature to consider (highest intensities first; 0 for "all") (default: '6') (min: '0')

Common TOPP options:
  -ini <file>            Use the given TOPP INI file
  -threads <n>           Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>      Writes the default configuration file
  --help                 Shows options
  --helphelp             Shows all options (including advanced)
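As a rough sketch of how the options listed above fit together, the following Python snippet assembles a command line for the tool. The helper `build_command` and all file names are hypothetical; only the flags and defaults shown in the help text are used, and the snippet does not run the tool itself.

```python
import shlex


def build_command(in_file, lib_file, out_file,
                  trafo_file=None, decoys=1000, transitions=6):
    """Assemble an OpenSwathConfidenceScoring argument list (hypothetical helper)."""
    cmd = ["OpenSwathConfidenceScoring",
           "-in", in_file,      # OpenSwath results (featureXML)
           "-lib", lib_file,    # assay library (traML)
           "-out", out_file]    # results with confidence scores (featureXML)
    if trafo_file is not None:
        cmd += ["-trafo", trafo_file]  # optional RT transformation (trafoXML)
    cmd += ["-decoys", str(decoys), "-transitions", str(transitions)]
    return cmd


# Placeholder file names, chosen only for illustration.
cmd = build_command("results.featureXML", "assays.traML", "scored.featureXML",
                    trafo_file="rt.trafoXML")
print(shlex.join(cmd))
# To execute it (requires OpenMS on the PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.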
INI file documentation of this tool: