OpenMS
Complete workflow to run OpenSWATH
This implements the OpenSWATH workflow as described in Rost and Rosenberger et al. (Nature Biotechnology, 2014) and provides a complete, integrated analysis tool without the need to run multiple tools consecutively. See also http://openswath.org/ for additional documentation.
It executes the following steps in order, which are implemented in OpenSwathWorkflow:
See below or have a look at the INI file (via "OpenSwathWorkflow -write_ini myini.ini") for available parameters and more functionality.
SWATH maps can be provided as mzML files. A single file directly from the instrument can be used; this assumes that the SWATH method has 1 MS1 scan followed by n MS2 spectra, ordered the same way in every cycle. For example, a valid method would be MS1, MS2 [400-425], MS2 [425-450], MS1, MS2 [400-425], MS2 [425-450], while an invalid method would be MS1, MS2 [400-425], MS2 [425-450], MS1, MS2 [425-450], MS2 [400-425], where MS2 [xx-yy] indicates an MS2 scan with an isolation window starting at xx and ending at yy. OpenSwathWorkflow will try to read the SWATH windows from the data; if this is not possible, please provide a tab-separated list with the correct windows using the -swath_windows_file parameter (this is recommended). Note that the software expects extraction windows (i.e. which peptides to extract from which window) that do not overlap; otherwise peptides will be extracted from two different windows.
Alternatively, a set of split files (n+1 mzML files) can be provided, each containing one SWATH map (or MS1 map).
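For the single-file case, the two requirements above (identical MS2 window order in every cycle, and extraction windows that do not overlap) can be checked before a run. The following is a minimal sketch in plain Python, not part of OpenMS: the scan list is a hand-written stand-in for what would be read from an mzML file, all function names are hypothetical, and the header text written to the windows file is an assumption (the command line help only states that the first line is a header and is skipped).

```python
from typing import List, Optional, Tuple

Window = Tuple[float, float]  # (lower, upper) bounds of an isolation window in m/z


def split_cycles(scans: List[Optional[Window]]) -> List[List[Window]]:
    """Group scans into cycles. None marks an MS1 scan, a tuple marks an MS2 scan."""
    cycles: List[List[Window]] = []
    for scan in scans:
        if scan is None:
            cycles.append([])        # every MS1 scan starts a new cycle
        elif cycles:
            cycles[-1].append(scan)  # MS2 scans before the first MS1 are ignored
    return cycles


def has_consistent_order(scans: List[Optional[Window]]) -> bool:
    """True if every cycle lists the same MS2 windows in the same order."""
    cycles = split_cycles(scans)
    return all(cycle == cycles[0] for cycle in cycles)


def find_overlaps(windows: List[Window]) -> List[Tuple[Window, Window]]:
    """Return neighbouring windows (sorted by lower bound) that overlap.

    Windows that merely touch (425 == 425) are not reported.
    """
    ordered = sorted(windows)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b[0] < a[1]]


def windows_file_text(windows: List[Window]) -> str:
    """Render a tab-separated window list with one header line (header text assumed)."""
    lines = ["lower_offset\tupper_offset"]
    lines += [f"{lo}\t{hi}" for lo, hi in sorted(windows)]
    return "\n".join(lines) + "\n"


# The valid and invalid methods from the text above.
valid = [None, (400, 425), (425, 450), None, (400, 425), (425, 450)]
invalid = [None, (400, 425), (425, 450), None, (425, 450), (400, 425)]

print(has_consistent_order(valid))
print(has_consistent_order(invalid))
print(find_overlaps([(400, 426), (425, 450)]))
print(windows_file_text([(425, 450), (400, 425)]), end="")
```

The text returned by windows_file_text could be saved to a file and passed via -swath_windows_file.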
Since the file size can become rather large, it is recommended to not load the whole file into memory but rather cache it somewhere on the disk using a fast-access data format. This can be specified using the -readOptions cache parameter (this is recommended!).
The assay library (transition list) is provided through the -tr parameter and can be in one of the following formats (as listed in the command line help below): TraML, tsv or pqp.
The current parameters are optimized for 2 hour gradients on SCIEX 5600 / 6600 TripleTOF instruments with a peak width of around 30 seconds using iRT peptides. If your chromatography differs, please consider adjusting -Scoring:TransitionGroupPicker:min_peak_width to allow for smaller or larger peaks, and adjust -rt_extraction_window to use a different extraction window for the retention time. In the m/z domain, consider adjusting -mz_extraction_window to your instrument resolution; the window can be given in Th or ppm.
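To get a feel for what a given window setting means in absolute terms, the following sketch converts window sizes into lower and upper bounds. The RT rule follows the command line help (a value of 600 extracts +/- 300 s around the expected elution, a negative value means the whole range). Treating the m/z window as a full width centred on the target m/z is an assumption made here by analogy, and the function names are hypothetical, not OpenMS API.

```python
from typing import Optional, Tuple


def rt_bounds(expected_rt: float, rt_extraction_window: float) -> Optional[Tuple[float, float]]:
    """Return (start, end) in seconds, or None if the whole RT range is extracted."""
    if rt_extraction_window < 0:
        return None  # -1 means: extract over the whole range
    half = rt_extraction_window / 2.0
    return (expected_rt - half, expected_rt + half)


def mz_bounds(mz: float, window: float, unit: str = "ppm") -> Tuple[float, float]:
    """Return (lower, upper) m/z bounds, assuming `window` is a full width around mz."""
    if unit == "ppm":
        half = mz * window * 1e-6 / 2.0  # ppm scales with the target m/z
    elif unit == "Th":
        half = window / 2.0              # Thomson is an absolute width
    else:
        raise ValueError(f"unknown unit: {unit!r}")
    return (mz - half, mz + half)


print(rt_bounds(1800.0, 600.0))
print(mz_bounds(500.0, 50.0, "ppm"))
print(mz_bounds(500.0, 0.05, "Th"))
```

Under this assumption, 50 ppm at m/z 500 and 0.05 Th both correspond to a total width of roughly 0.025 to 0.05 Th, which shows why a ppm window that suits one mass range can be too wide or too narrow in another.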
Furthermore, if you wish to use MS1 information, use the -use_ms1_traces flag and provide an MS1 map in addition to the SWATH data.
If you encounter issues with peak picking, try setting -Scoring:TransitionGroupPicker:compute_peak_quality to false, which disables the filtering of peaks by chromatographic quality. Furthermore, you can adjust the smoothing used for peak picking, either by changing -Scoring:TransitionGroupPicker:PeakPickerMRM:sgolay_frame_length or by using Gaussian smoothing based on your estimated peak width. Adjusting the signal-to-noise threshold will make the peaks wider or narrower.
The output of the OpenSwathWorkflow is a feature list, either as FeatureXML (use -out_features) or as tsv (use -out_tsv). The latter is more memory friendly and can be used directly as input to other tools such as mProphet or pyProphet. If you analyze large datasets, it is recommended to use only -out_tsv and not -out_features. The -out_tsv format is also recommended for downstream analysis (e.g. using mProphet or pyProphet).
The feature list generated by -out_tsv is a tab-separated file. It can be used directly as input to the mProphet or pyProphet (a Python re-implementation of mProphet) software tools; see Reiter et al. (2011, Nature Methods).
In addition, the extracted chromatograms can be written out using the -out_chrom parameter.
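The options discussed so far can be combined into a single call. The sketch below only assembles the argument list and prints it; it does not run the tool. It uses flags named on this page (-in, -tr, -tr_irt, -swath_windows_file, -out_tsv, -out_chrom, -readOptions cache, -threads, -rt_extraction_window). All file names are placeholders, the helper build_command is hypothetical, and the cache mode may need further options that are not covered here, so check the INI file before relying on it.

```python
import shlex
from typing import Dict, List, Optional, Sequence


def build_command(
    in_files: Sequence[str],
    library: str,
    out_tsv: str,
    irt_library: Optional[str] = None,
    swath_windows_file: Optional[str] = None,
    out_chrom: Optional[str] = None,
    threads: int = 1,
    cache: bool = True,
    extra: Optional[Dict[str, object]] = None,
) -> List[str]:
    """Assemble an OpenSwathWorkflow argument list (nothing is executed here)."""
    cmd = ["OpenSwathWorkflow", "-in", *in_files, "-tr", library, "-out_tsv", out_tsv]
    if irt_library:
        cmd += ["-tr_irt", irt_library]
    if swath_windows_file:
        cmd += ["-swath_windows_file", swath_windows_file]
    if out_chrom:
        cmd += ["-out_chrom", out_chrom]
    if cache:
        cmd += ["-readOptions", "cache"]  # cache data on disk instead of in memory
    cmd += ["-threads", str(threads)]
    for name, value in (extra or {}).items():
        cmd += [name, str(value)]         # any further "-flag value" pairs
    return cmd


# Placeholder file names; replace them with real paths.
cmd = build_command(
    ["run1.mzML"],
    "library.tsv",
    "run1.tsv",
    irt_library="irt.TraML",
    swath_windows_file="windows.tsv",
    threads=4,
    extra={"-rt_extraction_window": 600.0},
)
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) once OpenSwathWorkflow is on the PATH; passing a list rather than a shell string avoids quoting problems with file names that contain spaces.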
The tab-separated feature output contains the following information:
Header row | Format | Description |
transition_group_id | String | A unique id for the transition group (all chromatographic traces that are analyzed together) |
peptide_group_label | String | A unique id for the peptide group (will be the same for each charge state and heavy/light status) |
run_id | String | An identifier for the run (currently always 0) |
filename | String | The input filename |
RT | Float | Peak group retention time |
id | String | A unique identifier for the peak group |
Sequence | String | Peptide sequence (no modifications) |
MC | Int | Missed cleavages of the sequence (assuming Trypsin as protease) |
FullPeptideName | String | Full peptide sequence including modifications in Unimod format |
Charge | Int | Assumed charge state |
m/z | Float | Precursor m/z |
masserror_ppm | Float List | Pairs of fragment masses (m/z) and their associated error in ppm for all transitions |
Intensity | Float | Peak group intensity (sum of all transitions) |
ProteinName | String | Name of the associated protein |
decoy | String | Whether the transition is decoy or not (0 = false, 1 = true) |
assay_rt | Float | The expected RT in seconds (based on normalized iRT value) |
delta_rt | Float | The difference between the expected RT and the peak group RT in seconds |
leftWidth | Float | The start of the peak group (left side) in seconds |
main_var_xx_swath_prelim_score | Float | Initial score |
norm_RT | Float | The peak group retention time in normalized (iRT) space |
nr_peaks | Int | The number of transitions used |
peak_apices_sum | Float | The sum of all peak apices (may be used as alternative intensity) |
potentialOutlier | String | Potential outlier transitions (or "none" if none was detected) |
rightWidth | Float | The end of the peak group (right side) in seconds |
rt_score | Float | The raw RT score (unnormalized) |
sn_ratio | Float | The raw S/N ratio |
total_xic | Float | The total XIC of the chromatogram |
var_... | Float | One of multiple sub-scores used by OpenSWATH to describe the peak group |
aggr_prec_Peak_Area | String | Intensity (peak area) of MS1 traces separated by semicolon |
aggr_prec_Peak_Apex | String | Intensity (peak apex) of MS1 traces separated by semicolon |
aggr_prec_Fragment_Annotation | String | Annotation of MS1 traces separated by semicolon |
aggr_Peak_Area | String | Intensity (peak area) of fragment ion traces separated by semicolon |
aggr_Peak_Apex | String | Intensity (peak apex) of fragment ion traces separated by semicolon |
aggr_Fragment_Annotation | String | Annotation of fragment ion traces separated by semicolon |
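As an illustration of how the columns above fit together, the following sketch reads a tab-separated feature list with the Python standard library, skips decoy rows, and pairs each fragment annotation with its peak area by splitting the semicolon-separated fields. The two sample rows are invented for this example (including the annotation strings, which will look different in real output), only a subset of the columns is shown, and read_targets is a hypothetical helper.

```python
import csv
import io
from typing import Dict, List

# Invented sample data with a subset of the columns described above.
SAMPLE = (
    "transition_group_id\tdecoy\tRT\tIntensity\taggr_Peak_Area\taggr_Fragment_Annotation\n"
    "pep1_2\t0\t1234.5\t600.0\t100.0;200.0;300.0\ty4;y5;y6\n"
    "DECOY_pep2_2\t1\t2345.6\t30.0\t10.0;20.0\ty3;y4\n"
)


def read_targets(handle) -> List[Dict[str, object]]:
    """Return one record per non-decoy peak group with per-fragment peak areas."""
    groups = []
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["decoy"] == "1":
            continue  # 1 = decoy, 0 = target
        annotations = row["aggr_Fragment_Annotation"].split(";")
        areas = [float(x) for x in row["aggr_Peak_Area"].split(";")]
        groups.append(
            {
                "id": row["transition_group_id"],
                "rt": float(row["RT"]),
                "intensity": float(row["Intensity"]),
                "fragment_areas": dict(zip(annotations, areas)),
            }
        )
    return groups


for group in read_targets(io.StringIO(SAMPLE)):
    print(group["id"], group["rt"], group["fragment_areas"])
```

For a real file, open it with open(path, newline="") and pass the handle to read_targets in place of the io.StringIO object.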
The overall execution flow for this tool is implemented in OpenSwathWorkflow.
The command line parameters of this tool are:
OpenSwathWorkflow -- Complete workflow to run OpenSWATH
Full documentation: http://www.openms.de/doxygen/release/3.0.0/html/UTILS_OpenSwathWorkflow.html
Version: 3.0.0 Jul 14 2023, 11:57:33, Revision: be787e9
To cite OpenMS:
 + Rost HL, Sachsenberg T, Aiche S, Bielow C et al. OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.

Usage:
  OpenSwathWorkflow <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed description or use the --helphelp option

Options (mandatory options marked with '*'):
  -in <files>*                        Input files separated by blank (valid formats: 'mzML', 'mzXML', 'sqMass')
  -tr <file>*                         Transition file ('TraML','tsv','pqp') (valid formats: 'traML', 'tsv', 'pqp')
  -tr_type <type>                     Input file type -- default: determined from file extension or content (valid: 'traML', 'tsv', 'pqp')
  -tr_irt <file>                      Transition file ('TraML') (valid formats: 'traML', 'tsv', 'pqp')
  -tr_irt_nonlinear <file>            Additional nonlinear transition file ('TraML') (valid formats: 'traML', 'tsv', 'pqp')
  -swath_windows_file <file>          Optional, tab-separated file containing the SWATH windows for extraction: lower_offset upper_offset. Note that the first line is a header and will be skipped.
  -out_features <file>                Output file (valid formats: 'featureXML')
  -out_tsv <file>                     TSV output file (mProphet-compatible TSV file) (valid formats: 'tsv')
  -out_osw <file>                     OSW output file (PyProphet-compatible SQLite file) (valid formats: 'osw')
  -sonar                              Data is scanning SWATH data
  -pasef                              Data is PASEF data
  -rt_extraction_window <double>      Only extract RT around this value (-1 means extract over the whole range, a value of 600 means to extract around +/- 300 s of the expected elution). (default: '600.0')
  -ion_mobility_window <double>       Extraction window in ion mobility dimension (in 1/k0 or milliseconds depending on library). This is the full window size, e.g. a value of 10 milliseconds would extract 5 milliseconds on either side. -1 means extract over the whole range or ion mobility is not present. (Default for diaPASEF data: 0.06 1/k0) (default: '-1.0')
  -mz_extraction_window <double>      Extraction window in Thomson or ppm (see mz_extraction_window_unit) (default: '50.0') (min: '0.0')
  -mz_extraction_window_ms1 <double>  Extraction window used in MS1 in Thomson or ppm (see mz_extraction_window_ms1_unit) (default: '50.0') (min: '0.0')
  -im_extraction_window_ms1 <double>  Extraction window in ion mobility dimension for MS1 (in 1/k0 or milliseconds depending on library). -1 means this is not ion mobility data. (default: '-1.0')

Debugging:
  -Debugging:irt_mzml <file>          Chromatogram mzML containing the iRT peptides (valid formats: 'mzML')
  -Debugging:irt_trafo <file>         Transformation file for RT transform (valid formats: 'trafoXML')

Common UTIL options:
  -ini <file>                         Use the given TOPP INI file
  -threads <n>                        Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>                   Writes the default configuration file
  --help                              Shows options
  --helphelp                          Shows all options (including advanced)

The following configuration subsections are valid:
 - Calibration       Parameters for the m/z and ion mobility calibration.
 - Library           Library parameters section
 - RTNormalization   Parameters for the RTNormalization for iRT petides. This specifies how the RT alignment is performed and how outlier detection is applied. Outlier detection can be done iteratively (by default) which removes one outlier per iteration or using the RANSAC algorithm.
 - Scoring           Scoring parameters section

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
For more information, please consult the online documentation for this tool:
 - http://www.openms.de/doxygen/release/3.0.0/html/UTILS_OpenSwathWorkflow.html
INI file documentation of this tool: