OpenMS
BayesianProteinInferenceAlgorithm Class Reference

Performs Bayesian protein inference on protein/peptide identifications or on a ConsensusMap (experimental).

#include <OpenMS/ANALYSIS/ID/BayesianProteinInferenceAlgorithm.h>

Public Member Functions

 BayesianProteinInferenceAlgorithm (unsigned int debug_lvl=0)
 Constructor.

 ~BayesianProteinInferenceAlgorithm () override=default
 Destructor.

void updateMembers_ () override
 This method is used to update extra member variables at the end of the setParameters() method.

void inferPosteriorProbabilities (std::vector< ProteinIdentification > &proteinIDs, std::vector< PeptideIdentification > &peptideIDs, bool greedy_group_resolution, std::optional< const ExperimentalDesign > exp_des=std::optional< const ExperimentalDesign >())
 Performs inference: filters the input, builds the graph, and runs the private inferPosteriorProbabilities_ function. Writes the results into the protein hits and, optionally, the peptide hits as new scores. Optionally also adds indistinguishable protein groups with separate scores. Output scores are always posterior probabilities; input scores may be posterior or error probabilities. See the Param object defaults_ of BayesianProteinInferenceAlgorithm for more settings. Currently only the first proteinID run and all peptides are used (irrespective of getIdentifier()).

void inferPosteriorProbabilities (ConsensusMap &cmap, bool greedy_group_resolution, std::optional< const ExperimentalDesign > exp_des=std::optional< const ExperimentalDesign >())
 Performs inference: filters the input, builds the graph, and runs the private inferPosteriorProbabilities_ function. Writes the results into the protein hits and, optionally, the peptide hits as new scores. Optionally also adds indistinguishable protein groups with separate scores. Output scores are always posterior probabilities; input scores may be posterior or error probabilities. See the Param object defaults_ of BayesianProteinInferenceAlgorithm for more settings. Currently only the first proteinID run and all peptides are used (irrespective of getIdentifier()).
 
- Public Member Functions inherited from DefaultParamHandler
 DefaultParamHandler (const String &name)
 Constructor with name that is displayed in error messages.

 DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor.

virtual ~DefaultParamHandler ()
 Destructor.

DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator.

virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator.

void setParameters (const Param &param)
 Sets the parameters.

const Param & getParameters () const
 Non-mutable access to the parameters.

const Param & getDefaults () const
 Non-mutable access to the default parameters.

const String & getName () const
 Non-mutable access to the name.

void setName (const String &name)
 Mutable access to the name.

const std::vector< String > & getSubsections () const
 Non-mutable access to the registered subsections.
 
- Public Member Functions inherited from ProgressLogger
 ProgressLogger ()
 Constructor.

virtual ~ProgressLogger ()
 Destructor.

 ProgressLogger (const ProgressLogger &other)
 Copy constructor.

ProgressLogger & operator= (const ProgressLogger &other)
 Assignment operator.

void setLogType (LogType type) const
 Sets the progress log that should be used. The default type is NONE!

LogType getLogType () const
 Returns the type of progress log being used.

void startProgress (SignedSize begin, SignedSize end, const String &label) const
 Initializes the progress display.

void setProgress (SignedSize value) const
 Sets the current progress.

void endProgress () const
 Ends the progress display.

void nextProgress () const
 Increments the progress by 1 (according to the range begin-end).
 

Private Member Functions

void inferPosteriorProbabilities_ (Internal::IDBoostGraph &ibg)
 
GridSearch< double, double, double > initGridSearchFromParams_ (std::vector< double > &alpha_search, std::vector< double > &beta_search, std::vector< double > &gamma_search)
 Reads the Param object and sets up the grid.

void setScoreTypeAndSettings_ (ProteinIdentification &proteinIDs)
 Sets the score type and settings for every ProteinID run processed.

void resetProteinScores_ (ProteinIdentification &protein_id, bool keep_old_as_prior)
 Resets all protein scores to 0.0 and, if requested, saves the old ones as Prior MetaValue.
 

Private Attributes

std::function< void(PeptideIdentification &)> checkConvertAndFilterPepHits_
 
unsigned int debug_lvl_
 

Additional Inherited Members

- Public Types inherited from ProgressLogger
enum  LogType { CMD , GUI , NONE }
 Possible log types.

- Static Public Member Functions inherited from DefaultParamHandler
static void writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="")
 Writes all parameters to meta values.

- Protected Member Functions inherited from DefaultParamHandler
void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor.

- Static Protected Member Functions inherited from ProgressLogger
static String logTypeToFactoryName_ (LogType type)
 Return the name of the factory product used for this log type.

- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters.

Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes!

std::vector< String > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes!

String error_name_
 Name that is displayed in error messages during the parameter checking.

bool check_defaults_
 If this member is set to false, no parameter checking is done.

bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when the defaults are empty.

- Protected Attributes inherited from ProgressLogger
LogType type_

time_t last_invoke_

ProgressLoggerImpl * current_logger_
 
- Static Protected Attributes inherited from ProgressLogger
static int recursion_depth_
 

Detailed Description

Performs Bayesian protein inference on protein/peptide identifications or on a ConsensusMap (experimental).

  • Filters for best n PSMs per spectrum.
  • Calculates and filters for best peptide per spectrum.
  • Builds a k-partite graph from the structures.
  • Finds connected components by DFS and splits the graph into them.
  • Extends the graph by adding layers from indistinguishable protein groups, peptides with the same parents and, optionally, some additional layers (peptide sequence, charge, replicate; this extended model is experimental).
  • Builds a factor graph representation of a Bayesian network using the Evergreen library (see the model param section). It is based on the Fido noisy-OR model with an option for regularizing the number of proteins per peptide.
  • Performs loopy belief propagation on the graph and queries protein, protein group and/or peptide posteriors (see the loopy_belief_propagation param section).
  • Learns best parameters via grid search if the parameters were not given in the param section.
  • Writes posteriors to peptides and/or proteins and adds indistinguishable protein groups to the underlying data structures.
  • Can make use of OpenMP to parallelize over connected components.
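
As an illustration of the noisy-OR idea named in the list above, the following minimal sketch computes the probability that a peptide is observed given how many of its parent proteins are present. It follows the usual Fido formulation, in which alpha is the emission probability per present protein and beta is the spurious emission probability. The function name noisyOrPeptideProbability is hypothetical; this is not the OpenMS or Evergreen implementation, and the optional regularization is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper, not part of OpenMS.
// Noisy-OR: the peptide is absent only if every present parent protein fails
// to emit it (probability (1 - alpha) each) AND no spurious emission occurs
// (probability 1 - beta).
double noisyOrPeptideProbability(std::size_t n_present, double alpha, double beta)
{
  assert(alpha >= 0.0 && alpha <= 1.0);
  assert(beta >= 0.0 && beta <= 1.0);
  const double p_absent = (1.0 - beta) * std::pow(1.0 - alpha, static_cast<double>(n_present));
  return 1.0 - p_absent;
}
```

With no parent protein present, the result reduces to beta; each additional present protein increases the probability.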

Constructor & Destructor Documentation

◆ BayesianProteinInferenceAlgorithm()

BayesianProteinInferenceAlgorithm ( unsigned int  debug_lvl = 0)
explicit

Constructor.

Todo:
is there a better way to pass the debug level from TOPPBase?

◆ ~BayesianProteinInferenceAlgorithm()

~BayesianProteinInferenceAlgorithm ( )
override default

Destructor.

Member Function Documentation

◆ inferPosteriorProbabilities() [1/2]

void inferPosteriorProbabilities ( ConsensusMap &  cmap,
bool  greedy_group_resolution,
std::optional< const ExperimentalDesign >  exp_des = std::optional< const ExperimentalDesign >() 
)

Performs inference: filters the input, builds the graph, and runs the private inferPosteriorProbabilities_ function. Writes the results into the protein hits and, optionally, the peptide hits as new scores. Optionally also adds indistinguishable protein groups with separate scores. Output scores are always posterior probabilities; input scores may be posterior or error probabilities. See the Param object defaults_ of BayesianProteinInferenceAlgorithm for more settings. Currently only the first proteinID run and all peptides are used (irrespective of getIdentifier()).

Parameters
cmap  Features with input/output peptides and proteins (from getProteinIdentifications)
greedy_group_resolution  Do greedy group resolution? Removes all but the best association for "razor" peptides.
exp_des  Experimental design; can be used to create an extended graph with replicate information (experimental).
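
To make the greedy_group_resolution parameter concrete, here is a minimal sketch of the idea of keeping only the best association for a shared ("razor") peptide. The types and names (PeptideLinks, resolveGreedily) are hypothetical stand-ins, and the sketch works on single proteins with a plain posterior vector; it is not the OpenMS implementation and ignores protein groups.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a peptide and its candidate parent proteins.
struct PeptideLinks
{
  std::vector<std::size_t> protein_indices; // indices into the posterior vector
};

// For every peptide with more than one parent, keep only the parent with the
// highest posterior. Ties go to the protein listed first.
void resolveGreedily(std::vector<PeptideLinks>& peptides,
                     const std::vector<double>& protein_posteriors)
{
  for (PeptideLinks& pep : peptides)
  {
    if (pep.protein_indices.size() < 2) continue; // unique or unassigned peptide
    std::size_t best = pep.protein_indices.front();
    for (std::size_t idx : pep.protein_indices)
    {
      assert(idx < protein_posteriors.size());
      if (protein_posteriors[idx] > protein_posteriors[best]) best = idx;
    }
    pep.protein_indices.assign(1, best); // drop all other associations
  }
}
```

Peptides that already map to a single protein are left unchanged, so only shared peptides are affected.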

◆ inferPosteriorProbabilities() [2/2]

void inferPosteriorProbabilities ( std::vector< ProteinIdentification > &  proteinIDs,
std::vector< PeptideIdentification > &  peptideIDs,
bool  greedy_group_resolution,
std::optional< const ExperimentalDesign >  exp_des = std::optional< const ExperimentalDesign >() 
)

Performs inference: filters the input, builds the graph, and runs the private inferPosteriorProbabilities_ function. Writes the results into the protein hits and, optionally, the peptide hits as new scores. Optionally also adds indistinguishable protein groups with separate scores. Output scores are always posterior probabilities; input scores may be posterior or error probabilities. See the Param object defaults_ of BayesianProteinInferenceAlgorithm for more settings. Currently only the first proteinID run and all peptides are used (irrespective of getIdentifier()).

Parameters
proteinIDs  Input/output proteins
peptideIDs  Input/output peptides
greedy_group_resolution  Do greedy group resolution? Removes all but the best association for "razor" peptides.
exp_des  Experimental design; can be used to create an extended graph with replicate information (experimental).
Todo:
loop over all runs

◆ inferPosteriorProbabilities_()

void inferPosteriorProbabilities_ ( Internal::IDBoostGraph &  ibg)
private

After a graph has been built, use this method to perform inference and write the results to the structures from which the graph was built.
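
The Detailed Description states that the graph is split into connected components by DFS so that inference can run on each component independently. The sketch below shows that splitting step on a plain adjacency list. The function name labelConnectedComponents and the adjacency-list representation are assumptions made for illustration; Internal::IDBoostGraph uses its own graph types, which are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of OpenMS.
// Labels every node of an undirected graph with the id of its connected
// component using an iterative depth-first search. adjacency[v] lists the
// neighbours of node v (proteins and peptides share one index space here).
std::vector<int> labelConnectedComponents(const std::vector<std::vector<std::size_t>>& adjacency)
{
  std::vector<int> component(adjacency.size(), -1); // -1 means "not visited yet"
  std::vector<std::size_t> stack;
  int next_id = 0;
  for (std::size_t start = 0; start < adjacency.size(); ++start)
  {
    if (component[start] != -1) continue;
    component[start] = next_id;
    stack.push_back(start);
    while (!stack.empty())
    {
      const std::size_t v = stack.back();
      stack.pop_back();
      for (std::size_t w : adjacency[v])
      {
        assert(w < adjacency.size());
        if (component[w] == -1)
        {
          component[w] = next_id;
          stack.push_back(w);
        }
      }
    }
    ++next_id;
  }
  return component;
}
```

Nodes that share a component id can then be processed together, and different components can be handled in parallel because they share no edges.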

◆ initGridSearchFromParams_()

GridSearch<double,double,double> initGridSearchFromParams_ ( std::vector< double > &  alpha_search,
std::vector< double > &  beta_search,
std::vector< double > &  gamma_search 
)
private

Reads the Param object and sets up the grid.
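
For orientation, the sketch below shows what an exhaustive grid search over three parameter lists looks like in general. In the Fido model, alpha, beta and gamma usually denote the peptide emission probability, the spurious emission probability and the protein prior. The function gridSearchMax and its objective callback are hypothetical; the sketch is independent of the OpenMS GridSearch class template and says nothing about the objective that OpenMS actually optimizes.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <tuple>
#include <vector>

// Hypothetical helper, not part of OpenMS.
// Evaluates the objective on every (alpha, beta, gamma) combination and
// returns the combination with the largest value. The first maximum wins.
std::tuple<double, double, double> gridSearchMax(
  const std::vector<double>& alphas,
  const std::vector<double>& betas,
  const std::vector<double>& gammas,
  const std::function<double(double, double, double)>& objective)
{
  assert(!alphas.empty() && !betas.empty() && !gammas.empty());
  double best_value = -std::numeric_limits<double>::infinity();
  std::tuple<double, double, double> best(alphas.front(), betas.front(), gammas.front());
  for (double a : alphas)
  {
    for (double b : betas)
    {
      for (double g : gammas)
      {
        const double value = objective(a, b, g);
        if (value > best_value)
        {
          best_value = value;
          best = std::make_tuple(a, b, g);
        }
      }
    }
  }
  return best;
}
```

A list with a single entry fixes that parameter, which is one way to search only over the parameters that were not supplied.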

◆ resetProteinScores_()

void resetProteinScores_ ( ProteinIdentification &  protein_id,
bool  keep_old_as_prior 
)
private

Resets all protein scores to 0.0 and, if requested, saves the old ones as Prior MetaValue.

◆ setScoreTypeAndSettings_()

void setScoreTypeAndSettings_ ( ProteinIdentification &  proteinIDs)
private

Sets the score type and settings for every ProteinID run processed.

◆ updateMembers_()

void updateMembers_ ( )
override virtual

This method is used to update extra member variables at the end of the setParameters() method.

Also call it at the end of the derived classes' copy constructor and assignment operator.

The default implementation is empty.

Reimplemented from DefaultParamHandler.

Member Data Documentation

◆ checkConvertAndFilterPepHits_

std::function<void(PeptideIdentification&)> checkConvertAndFilterPepHits_
private

Function, initialized based on the algorithm parameters, that is used to filter PeptideHits.

Todo:
extend to allow filtering only for the current run
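
The pattern described here, a std::function configured once from parameters and then applied to each identification, can be sketched as follows. The types Hit and Spectrum and the factory makeConvertAndFilter are hypothetical stand-ins for PeptideHit and PeptideIdentification. The two options shown (converting error probabilities to posteriors and keeping the best n hits) are taken from the class description, but the actual parameters and their handling in OpenMS are not reproduced.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

struct Hit { double score; };               // hypothetical stand-in for a PeptideHit
struct Spectrum { std::vector<Hit> hits; }; // hypothetical stand-in for a PeptideIdentification

// Builds the filter once from the settings; the returned callable is then
// applied to every spectrum. top_n == 0 means "keep all hits".
std::function<void(Spectrum&)> makeConvertAndFilter(bool scores_are_error_probabilities,
                                                    std::size_t top_n)
{
  return [=](Spectrum& s)
  {
    for (Hit& h : s.hits)
    {
      assert(h.score >= 0.0 && h.score <= 1.0);
      // posterior probability = 1 - posterior error probability
      if (scores_are_error_probabilities) h.score = 1.0 - h.score;
    }
    // Sort by descending posterior so that the best hits come first.
    std::sort(s.hits.begin(), s.hits.end(),
              [](const Hit& a, const Hit& b) { return a.score > b.score; });
    if (top_n > 0 && s.hits.size() > top_n) s.hits.resize(top_n);
  };
}
```

Deciding the behaviour once when the callable is built keeps the per-spectrum code free of repeated parameter lookups.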

◆ debug_lvl_

unsigned int debug_lvl_
private