OpenMS
EPIFANY (Efficient protein inference for any peptide-protein network) is a Bayesian protein inference engine. It uses PSM (posterior) probabilities from Percolator, OpenMS' IDPosteriorErrorProbability or similar tools to calculate posterior probabilities for proteins and protein groups.
| potential predecessor tools | → Epifany → | potential successor tools |
|---|---|---|
| PercolatorAdapter | | IDFilter |
| IDPosteriorErrorProbability | | |
The inference engine is based on a Bayesian network. It currently uses the same model as Fido, with the main parameters alpha (pep_emission), beta (pep_spurious_emission) and gamma (prot_prior). If these parameters are not specified, they are trained via a grid search: the tool runs with several possible combinations and evaluates each by its classification performance and calibration. Grid search is slower but recommended, unless you see very extreme output probabilities (e.g. many close to 1.0) or already know good parameters (e.g. from an earlier run).

When given more than one idXML file, the tool merges them (union of proteins, concatenation of PSMs). It assumes one search engine run per input file but might work with more. Proteins need to be indexed by OpenMS's PeptideIndexer; this is usually done before Percolator/IDPEP already, since target/decoy associations are needed there. Make sure that the input PSM probabilities are not too extreme already (garbage in, garbage out).

After merging, the input probabilities are preprocessed with a low posterior probability cutoff to discard very unreliable matches. The probabilities are then aggregated by taking the maximum per peptide, and the graph is built and split into connected components.

When compiled with the OpenMP flag (enabled by default in the release binaries), the tool is multi-threaded; this can be activated at runtime with the threads parameter. Note that peak memory requirements may rise significantly when multiple components of the graph are processed at the same time.
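The preprocessing steps described above can be illustrated with a minimal Python sketch. It is not the OpenMS implementation: the function names, the tuple layout of a PSM, the cutoff value of 0.001 and the order of clamping and filtering are all assumptions made for illustration. The emission function uses the noisy-OR form of the Fido model, in which a peptide with n present parent proteins is observed with probability 1 - (1 - beta)(1 - alpha)^n.

```python
from collections import defaultdict


def peptide_emission_probability(n_present, alpha, beta):
    """Noisy-OR of the Fido model: probability that a peptide is observed
    when n_present of its parent proteins are truly present."""
    return 1.0 - (1.0 - beta) * (1.0 - alpha) ** n_present


def preprocess(psms, cutoff=0.001, max_probability=1.0):
    """Filter and aggregate PSMs (hypothetical helper, not an OpenMS API).

    psms: iterable of (peptide, probability, protein_accessions).
    Probabilities are clamped to max_probability, PSMs below the cutoff
    are dropped, and the maximum probability per peptide is kept.
    """
    best = {}
    parents = {}
    for peptide, prob, proteins in psms:
        prob = min(prob, max_probability)
        if prob < cutoff:
            continue  # neglect very unreliable matches
        if prob > best.get(peptide, -1.0):
            best[peptide] = prob
        parents.setdefault(peptide, set()).update(proteins)
    return best, parents


def connected_components(parents):
    """Split the bipartite peptide-protein graph into connected components."""
    adjacency = defaultdict(set)
    for peptide, proteins in parents.items():
        for protein in proteins:
            adjacency[("pep", peptide)].add(("prot", protein))
            adjacency[("prot", protein)].add(("pep", peptide))
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


# Made-up example data: two PSMs for PEPA, one shared peptide, one weak match.
psms = [
    ("PEPA", 0.90, ["P1"]),
    ("PEPA", 0.95, ["P1"]),
    ("PEPB", 0.80, ["P1", "P2"]),
    ("PEPC", 0.70, ["P3"]),
    ("PEPD", 0.0001, ["P4"]),
]
best, parents = preprocess(psms)
components = connected_components(parents)
print(best)
print(len(components), "components")
```

Each component can be processed independently, which is what makes parallel execution possible and also explains why memory use grows when several components are handled at once.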
The command line parameters of this tool are:
    Epifany -- Runs a Bayesian protein inference.
    Full documentation: http://www.openms.de/doxygen/nightly/html/TOPP_Epifany.html
    Version: 3.3.0-pre-nightly-2024-11-20 Nov 21 2024, 02:34:56, Revision: decb5c8
    To cite OpenMS:
     + Pfeuffer, J., Bielow, C., Wein, S. et al.. OpenMS 3 enables reproducible analysis of large-scale mass spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.

    Usage:
      Epifany <options>

    This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed description or use the --helphelp option

    Options (mandatory options marked with '*'):
      -in <file>*                            Input: identification results (valid formats: 'idXML', 'consensusXML')
      -exp_design <file>                     (Currently unused) Input: experimental design (valid formats: 'tsv')
      -out <file>*                           Output: identification results with scored/grouped proteins (valid formats: 'idXML', 'consensusXML')
      -out_type <file>                       Output type: auto detected by file extension but can be overwritten here. (valid: 'idXML', 'consensusXML')
      -protein_fdr <option>                  Additionally calculate the target-decoy FDR on protein-level based on the posteriors (default: 'false') (valid: 'true', 'false')
      -greedy_group_resolution <option>      Post-process inference output with greedy resolution of shared peptides based on the parent protein probabilities. Also adds the resolved ambiguity groups to output. (default: 'none') (valid: 'none', 'remove_associations_only', 'remove_proteins_wo_evidence')
      -max_psms_extreme_probability <float>  Set PSMs with probability higher than this to this maximum probability. (default: '1.0')

    Common TOPP options:
      -ini <file>                            Use the given TOPP INI file
      -threads <n>                           Sets the number of threads allowed to be used by the TOPP tool (default: '1')
      -write_ini <file>                      Writes the default configuration file
      --help                                 Shows options
      --helphelp                             Shows all options (including advanced)

    The following configuration subsections are valid:
     - algorithm   Parameters for the Algorithm section

    You can write an example INI file using the '-write_ini' option.
    Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
    For more information, please consult the online documentation for this tool:
      - http://www.openms.de/doxygen/nightly/html/TOPP_Epifany.html
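As an illustration of how the options listed above fit together, the following Python sketch assembles an Epifany command line and optionally runs it. Only flags from the help text are used. The helper names and file names are hypothetical, and passing several input files separated by spaces after `-in` is an assumption based on the usual TOPP convention for file lists.

```python
import shutil
import subprocess

# Values taken from the help text of -greedy_group_resolution.
VALID_RESOLUTIONS = {
    "none",
    "remove_associations_only",
    "remove_proteins_wo_evidence",
}


def build_epifany_command(inputs, output, protein_fdr=False,
                          greedy_group_resolution="none", threads=1):
    """Assemble the argument list for an Epifany call (hypothetical helper)."""
    if not inputs:
        raise ValueError("at least one input file is required")
    if greedy_group_resolution not in VALID_RESOLUTIONS:
        raise ValueError(f"invalid resolution mode: {greedy_group_resolution}")
    return [
        "Epifany",
        "-in", *inputs,  # assumption: multiple files are space-separated
        "-out", output,
        "-protein_fdr", "true" if protein_fdr else "false",
        "-greedy_group_resolution", greedy_group_resolution,
        "-threads", str(threads),
    ]


def run_epifany(command):
    """Run the command if the Epifany executable is available on PATH."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError("Epifany executable not found on PATH")
    subprocess.run(command, check=True)


# Hypothetical file names; the command is only printed here, not executed.
command = build_epifany_command(
    ["run1.idXML", "run2.idXML"], "proteins.idXML",
    protein_fdr=True, threads=4,
)
print(" ".join(command))
```

Calling `run_epifany(command)` would execute the tool on a system where OpenMS is installed. Parameters of the algorithm subsection are not covered here, since they are only documented in the INI file.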
INI file documentation of this tool: