Quality Control
Goal: Check the quality of the data (supports label-free workflows and IsobaricAnalyzer output).
The QualityControl TOPP tool computes and collects data which allows to compute QC metrics to check the quality of LC-MS data. Depending on the given input data this tool collects data for metrics (see section 'Metrics' below). New metavalues will be added to existing data and the information will be written out in mzTab format. This mzTab file can then be processed using your custom scripts or via the R package (see PTXQC).
Workflow
An example workflow can be found in OpenMS/share/OpenMS/examples/TOPPAS/QualityControl.toppas.
For data from IsobaricAnalyzer, just provide the consensusXML as input to QualityControl. No FeatureXMLs or TrafoXMLs are required. The mzML raw file can be added as input though.
Metrics
This Table shows what each of the included metrics does, what they need to be executed and what they add to the data or what they return.
Input Data
PostFDR FeatureXML: A FeatureXML after FDR filtering.
Contaminants Fasta file: A Fasta file containing contaminant proteins.
Raw mzML file: An unchanged mzML file.
InternalCalibration MzML file: An MzML file after internal calibration.
TrafoXML file: The RT alignment function as obtained from a MapAligner.
Contaminants | |
Required input data: Contaminants Fasta file, PostFDR FeatureXML | Output |
Description:
The Contaminants metric takes the contaminants database and digests the protein sequences with the digestion enzyme that is given in the featureXML. Afterwards it checks whether each of all peptide sequences of the featureXML (including the unassigned PeptideIdentifications) is registered in the contaminants database. | Changes in files:
Metavalue:
-
'is_contaminant' set to '1' or to '0' if the peptide is found in the contaminant database or not and sets a '0' if not.
Other outputs:
Returns:
-
Contaminant ratio of all peptides
-
Contaminant ratio of all assigned peptides
-
Contaminant ratio of all unassigned peptides
-
Intensity ratio of all contaminants in the assigned peptides
-
Number of empty features, Number of all found features
|
FragmentMassError | |
Required input data: PostFDR FeatureXML, raw mzML file | Output |
Description:
The FragmentMassError metric computes a list of fragment mass errors for each annotated MS2 spectrum in ppm and Da. Afterwards it calculates the mass delta between observed and theoretical peaks.
| Changes in files:
Metavalue:
-
'fragment_mass_error_ppm' set to the fragment mass error in parts per million
-
'fragment_mass_error_da' set to the fragment mass error in Dalton
Other Output:
Returns:
-
Average and variance of fragment mass errors in ppm
|
MissedCleavages | |
Required input data: PostFDR FeatureXML | Output |
Description:
This MissedCleavages metric counts the number of MissedCleavages per PeptideIdentification given a FeatureMap and returns an agglomeration statistic (observed counts). Additionally the first PeptideHit of each PeptideIdentification in the FeatureMap is augmented with metavalues.
| Changes in files:
Metavalue:
Other Output:
Returns:
-
Frequency map of missed cleavages as key/value pairs.
|
MS2IdentificationRate | |
Required input data: PostFDR FeatureXML, raw mzML file | Output |
Description:
The MS2IdentificationRate metric calculates the Rate of the MS2 identification as follows: The number of all PeptideIdentifications are counted and that number is divided by the total number of MS2 spectra.
| Changes in files:
This metric does not change anything in the data.
Other Output:
Returns:
-
Number of PeptideIdentifications
-
Number of MS2 spectra
-
Ratio of #pepID/#MS2
|
MzCalibration | |
Required input data: PostFDR FeatureXML
Optional input data: InternalCalibration MzML file | Output |
Description:
This metric adds new metavalues to the first (best) hit of each PeptideIdentification. For this metric it is also possible to use this without an MzML File, but then only uncalibrated m/z error (ppm) will be reported. However for full functionality a PeakMap/MSExperiment with original m/z-values before m/z calibration generated by InternalCalibration has to be given.
| Changes in files:
Metavalues:
-
'mz_raw'set to m/z value of original experiment
-
'mz_ref' set to m/z value of calculated reference
-
'uncalibrated_mz_error_ppm' set to uncalibrated m/z error in parts per million
-
'calibrated_mz_error_ppm' set to calibrated m/z error in parts per million
Other Output:
No additional output. |
RTAlignment | |
Required input data: PostFDR FeatureXML, trafoXML file | Output |
Description:
The RTAlignment metric checks what the retention time was before the alignment and how it is after the alignment. These two values are added to the metavalues in the PeptideIdentification.
| Changes in files:
Metavalues:
-
'rt_align' set to retention time after alignment
-
'rt_raw' set to retention time before alignment
Other Output:
No additional output. |
TIC | |
Required input data: raw mzML file | Output |
Description:
This TIC metric calculates the total ion count of an MSExperiment if a bin size in RT seconds greater than 0 is given. All MS1 abundances within a bin are summed up.
| Changes in files:
This metric does not change anything in the data.
Other Output:
Returns:
|
TopNoverRT | |
Required input data: PostFDR FeatureXML, raw mzML file | Output |
Description:
The TopNoverRT metric calculates the ScanEventNumber (number of the MS2 scans after the MS1 scan) and adds them as the new metavalue 'ScanEventNumber' to the PeptideIdentifications. It finds all unidentified MS2-Spectra and adds corresponding 'empty' PeptideIdentifications without sequence as placeholders to the unassigned PeptideIdentification list. Furthermore it adds the metavalue 'identified' to the PeptideIdentification.
| Changes in files:
Metavalues:
-
'ScanEventNumber' set to the calculated value
-
'identified' set to '+' or '-'
If provided:
-
'FWHM' set to RT peak width for all assigned PIs
-
'ion_injection_time' set to injection time from MS2 spectrum
-
'activation_method' set to activation method from MS2 spectrum
-
'total_ion_count' set to summed intensity from MS2 spectrum
-
'base_peak_intensity' set to highest intensity from MS2 spectrum
Additionally:
-
Adds empty PeptideIdentifications
Other Output:
No additional output. |