Export MS/MS data in .MGF format for GNPS (http://gnps.ucsd.edu).

GNPS (Global Natural Products Social Molecular Networking) is an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass spectrometry (MS/MS) data. The GNPS web platform makes it possible to search spectra against public MS/MS spectral libraries and to perform various data analyses such as MS/MS molecular networking, network annotation propagation, and Dereplicator-based annotation. The GNPS manuscript is available here: https://www.nature.com/articles/nbt.3597

This tool was developed for the Feature-Based Molecular Networking (FBMN) workflow (https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/).
### Please cite:

Nothias, L.-F., Petras, D., Schmid, R. et al. [Feature-based molecular networking in the GNPS analysis environment](https://www.nature.com/articles/s41592-020-0933-6). Nat. Methods 17, 905–908 (2020).
In brief, after running an OpenMS metabolomics pipeline, **GNPSExport**, together with the **TextExporter** TOPP tool, can be used on the consensusXML file and the mzML files to generate the files needed for FBMN. Those files are:
- An **MS/MS spectral data file** (.MGF format), which is generated with the GNPSExport tool.
- A **feature quantification table** (.TXT format), which is generated with the TextExporter tool.
A representative OpenMS-GNPS workflow would use the following OpenMS TOPP tools sequentially:
- Input mzML files
- Run the FeatureFinderMetabo tool on the mzML files.
- Run the MapAlignerPoseClustering tool on the featureXML files. `MapAlignerPoseClustering -in FFM_inputFile0.featureXML FFM_inputFile1.featureXML -out MapAlignerPoseClustering_inputFile0.featureXML MapAlignerPoseClustering_inputFile1.featureXML`
- Run the IDMapper tool on the featureXML and mzML files. `IDMapper -id emptyfile.idXML -in MapAlignerPoseClustering_inputFile0.featureXML -spectra:in MapAlignerPoseClustering_inputFile0.mzML -out IDMapper_inputFile0.featureXML` `IDMapper -id emptyfile.idXML -in MapAlignerPoseClustering_inputFile1.featureXML -spectra:in MapAlignerPoseClustering_inputFile1.mzML -out IDMapper_inputFile1.featureXML`
- Run the *MetaboliteAdductDecharger* tool on the featureXML files.
- Run the FeatureLinkerUnlabeledKD tool (or FeatureLinkerUnlabeledQT) on the featureXML files and output a consensusXML file. `FeatureLinkerUnlabeledKD -in IDMapper_inputFile0.featureXML IDMapper_inputFile1.featureXML -out FeatureLinkerUnlabeledKD.consensusXML`
- Run the FileFilter on the consensusXML file to keep only consensusElements with at least one MS/MS scan (peptide identification). `FileFilter -id:remove_unannotated_features -in FeatureLinkerUnlabeledKD.consensusXML -out FileFilter.consensusXML`
- Run the GNPSExport on the "filtered consensusXML file" to export an .MGF file. For each consensusElement in the consensusXML file, the GNPSExport command produces one representative consensus MS/MS spectrum (named peptide annotation in OpenMS jargon), which is appended to the MS/MS spectral file (.MGF file). Note that the parameters for the spectral file generation are defined in the GNPSExport INI parameters file, available here: https://ccms-ucsd.github.io/GNPSDocumentation/openms_gnpsexport/GNPSExport.ini `GNPSExport -ini iniFile-GNPSExport.ini -in_cm filtered.consensusXML -in_mzml inputFile0.mzML inputFile1.mzML -out GNPSExport_output.mgf`
- Run the TextExporter on the "filtered consensusXML file" to export a .TXT file. `TextExporter -in FileFilter.consensusXML -out FeatureQuantificationTable.txt`
- Upload your files to GNPS and run the Feature-Based Molecular Networking workflow. Instructions can be found here: https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/
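
The command-line steps in the list above can be chained from a script. The following is a minimal Python sketch that only assembles the command lines shown above and prints them; it does not run anything. The function name `build_fbmn_commands` and the `stems` argument are hypothetical, the file names mirror the examples in the list, and the sketch assumes that the FileFilter output is the file passed to GNPSExport and TextExporter. FeatureFinderMetabo and MetaboliteAdductDecharger are left out because no command line is given for them here.

```python
import shlex


def build_fbmn_commands(stems, ini_file="iniFile-GNPSExport.ini"):
    """Build the example command lines as argument lists, without running them.

    `stems` are hypothetical sample names (e.g. "inputFile0"). Their order is
    reused for -in_mzml so that it matches the order used when linking.
    """
    ffm = [f"FFM_{s}.featureXML" for s in stems]
    aligned = [f"MapAlignerPoseClustering_{s}.featureXML" for s in stems]
    mapped = [f"IDMapper_{s}.featureXML" for s in stems]
    linked = "FeatureLinkerUnlabeledKD.consensusXML"
    filtered = "FileFilter.consensusXML"  # assumption: this is the GNPSExport input

    commands = [["MapAlignerPoseClustering", "-in", *ffm, "-out", *aligned]]
    for stem, feature_in, feature_out in zip(stems, aligned, mapped):
        commands.append([
            "IDMapper", "-id", "emptyfile.idXML",
            "-in", feature_in,
            "-spectra:in", f"MapAlignerPoseClustering_{stem}.mzML",
            "-out", feature_out,
        ])
    commands.append(["FeatureLinkerUnlabeledKD", "-in", *mapped, "-out", linked])
    commands.append([
        "FileFilter", "-id:remove_unannotated_features",
        "-in", linked, "-out", filtered,
    ])
    commands.append([
        "GNPSExport", "-ini", ini_file, "-in_cm", filtered,
        "-in_mzml", *[f"{s}.mzML" for s in stems],
        "-out", "GNPSExport_output.mgf",
    ])
    commands.append([
        "TextExporter", "-in", filtered,
        "-out", "FeatureQuantificationTable.txt",
    ])
    return commands


# Print the commands for two hypothetical samples.
for command in build_fbmn_commands(["inputFile0", "inputFile1"]):
    print(shlex.join(command))
```

Keeping each command as an argument list makes it straightforward to hand the same lists to `subprocess.run` on a machine where the OpenMS TOPP tools are on the `PATH`.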
The GitHub page for the ProteoSAFe workflow and the OpenMS Python wrappers is available here: https://github.com/Bioinformatic-squad-DorresteinLab/openms-gnps-workflow

An online version of the OpenMS-GNPS pipeline for FBMN running on the CCMS server (http://proteomics.ucsd.edu/) is available here: https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking-with-openms/

The command line parameters of this tool are:
GNPSExport -- Tool to export representative consensus MS/MS scan per consensusElement into a .MGF file format.
See the documentation on https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking_with_openms
Full documentation: http://www.openms.de/doxygen/release/2.8.0/html/TOPP_GNPSExport.html
Version: 2.8.0 Feb 22 2022, 11:52:07, Revision: d203985
To cite OpenMS:
Rost HL, Sachsenberg T, Aiche S, Bielow C et al.. OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.
To cite GNPSExport:
Nothias L.F. et al.. Feature-based Molecular Networking in the GNPS Analysis Environment. bioRxiv 812404 (2019). doi:10.1101/812404.
Usage:
GNPSExport <options>
Options (mandatory options marked with '*'):
-in_cm <file>* Input consensusXML file containing only consensusElements with "peptide" annotations. (valid formats: 'consensusXML')
-in_mzml <files>* Original mzml files containing the ms2 spectra (aka peptide annotation). Must be in order that the consensusXML file maps the original mzML files. (valid formats: 'mzML')
-out <file>* Output MGF file (valid formats: 'mgf')
-output_type <choice> Specificity of mgf output information (default: 'most_intense' valid: 'merged_spectra', 'most_intense')
-peptide_cutoff <number> Number of most intense peptides to consider per consensus element; '-1' to consider all identifications. (default: '5' min: '-1')
-ms2_bin_size <value> Bin size (Da) for fragment ions when merging ms2 scans. (default: '0.019999999552965' min: '0.0')
Options for exporting mgf file with merged spectra per consensusElement:
-merged_spectra:cos_similarity <value> Cosine similarity threshold for merged_spectra output. (default: '0.9' min: '0.0')
Common TOPP options:
-ini <file> Use the given TOPP INI file
-threads <n> Sets the number of threads allowed to be used by the TOPP tool (default: '1')
-write_ini <file> Writes the default configuration file
--help Shows options
--helphelp Shows all options (including advanced)
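
The options above can be checked before a run is launched. The sketch below assembles a GNPSExport argument list and rejects values outside the choices and minimums listed in the help text. The helper name `gnps_export_args` is hypothetical, `0.02` is used as a rounded stand-in for the printed default bin size, and passing `-merged_spectra:cos_similarity` only when `merged_spectra` is selected is a choice made for this example rather than a documented requirement.

```python
def gnps_export_args(in_cm, in_mzml, out, output_type="most_intense",
                     peptide_cutoff=5, ms2_bin_size=0.02, cos_similarity=0.9):
    """Return a GNPSExport argument list after checking the documented limits."""
    if output_type not in ("merged_spectra", "most_intense"):
        raise ValueError("output_type must be 'merged_spectra' or 'most_intense'")
    if peptide_cutoff < -1:
        raise ValueError("peptide_cutoff must be >= -1 (-1 means all)")
    if ms2_bin_size < 0.0:
        raise ValueError("ms2_bin_size must be >= 0.0")
    if cos_similarity < 0.0:
        raise ValueError("cos_similarity must be >= 0.0")
    if not in_mzml:
        raise ValueError("at least one mzML file is required")

    args = [
        "GNPSExport",
        "-in_cm", in_cm,
        # The mzML order must follow the order mapped in the consensusXML file.
        "-in_mzml", *in_mzml,
        "-out", out,
        "-output_type", output_type,
        "-peptide_cutoff", str(peptide_cutoff),
        "-ms2_bin_size", str(ms2_bin_size),
    ]
    if output_type == "merged_spectra":
        # Example-specific choice: only pass the threshold in merged mode.
        args += ["-merged_spectra:cos_similarity", str(cos_similarity)]
    return args


example = gnps_export_args(
    "FileFilter.consensusXML",
    ["inputFile0.mzML", "inputFile1.mzML"],
    "GNPSExport_output.mgf",
    output_type="merged_spectra",
)
print(" ".join(example))
```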
INI file documentation of this tool:

- **GNPSExport**: Tool to export representative consensus MS/MS scan per consensusElement into a .MGF file format. See the documentation on https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking_with_openms
  - **version** (value: `2.8.0`): Version of the tool that generated this parameters file.
  - **1**: Instance '1' section for 'GNPSExport'
    - **in_cm** (required; input file; `*.consensusXML`): Input consensusXML file containing only consensusElements with "peptide" annotations.
    - **in_mzml** (required; list; input file; `*.mzML`): Original mzml files containing the ms2 spectra (aka peptide annotation). Must be in order that the consensusXML file maps the original mzML files.
    - **out** (required; output file; `*.mgf`): Output MGF file
    - **output_type** (default: `most_intense`; valid: `merged_spectra`, `most_intense`): specificity of mgf output information
    - **peptide_cutoff** (default: `5`; range: `-1:∞`): Number of most intense peptides to consider per consensus element; '-1' to consider all identifications.
    - **ms2_bin_size** (default: `0.019999999552965`; range: `0.0:∞`): Bin size (Da) for fragment ions when merging ms2 scans.
    - **log**: Name of log file (created only when specified)
    - **debug** (default: `0`): Sets the debug level
    - **threads** (default: `1`): Sets the number of threads allowed to be used by the TOPP tool
    - **no_progress** (default: `false`; valid: `true`, `false`): Disables progress logging to command line
    - **force** (default: `false`; valid: `true`, `false`): Overrides tool-specific checks
    - **test** (default: `false`; valid: `true`, `false`): Enables the test mode (needed for internal use only)
    - **merged_spectra**: Options for exporting mgf file with merged spectra per consensusElement
      - **cos_similarity** (default: `0.9`; range: `0.0:∞`): Cosine similarity threshold for merged_spectra output.
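
To give an intuition for how `ms2_bin_size` and `cos_similarity` interact, the sketch below bins the fragment peaks of two MS/MS spectra by m/z and computes the cosine similarity of the binned intensity vectors. It is an illustration of the general idea only and is not taken from the GNPSExport source: the function names, the peak lists and the use of `0.02` Da as the bin size are all assumptions made for this example.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_size=0.02):
    """Sum the intensities of (m/z, intensity) peaks that share an m/z bin."""
    if bin_size <= 0:
        raise ValueError("this sketch needs a strictly positive bin size")
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_size)] += intensity
    return dict(binned)


def cosine_similarity(binned_a, binned_b):
    """Cosine of the angle between two binned spectra (0.0 if either is empty)."""
    dot = sum(value * binned_b.get(key, 0.0) for key, value in binned_a.items())
    norm_a = math.sqrt(sum(value * value for value in binned_a.values()))
    norm_b = math.sqrt(sum(value * value for value in binned_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def passes_threshold(peaks_a, peaks_b, bin_size=0.02, threshold=0.9):
    """True if the two spectra are at least `threshold` similar after binning."""
    similarity = cosine_similarity(
        bin_spectrum(peaks_a, bin_size), bin_spectrum(peaks_b, bin_size)
    )
    return similarity >= threshold


# Hypothetical spectra: the second has the same fragments at twice the intensity.
scan_a = [(100.005, 10.0), (150.031, 5.0)]
scan_b = [(100.012, 20.0), (150.033, 10.0)]
scan_c = [(300.005, 1.0)]  # shares no fragment bin with scan_a

print(passes_threshold(scan_a, scan_b))
print(passes_threshold(scan_a, scan_c))
```

A larger bin size makes nearby fragment ions fall into the same bin and therefore tends to raise the similarity between scans, while a higher threshold makes the comparison stricter.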