OpenMS 2.4.0
Resolves shared peptides based on protein scores.
#include <OpenMS/ANALYSIS/ID/PeptideProteinResolution.h>
Public Member Functions | |
PeptideProteinResolution (bool statistics=false) | |
void | buildGraph (ProteinIdentification &protein, const std::vector< PeptideIdentification > &peptides, bool skip_sort=false) |
void | resolveGraph (ProteinIdentification &protein, std::vector< PeptideIdentification > &peptides) |
ConnectedComponent | findConnectedComponent (Size &root_prot_grp) |
void | resolveConnectedComponent (ConnectedComponent &conn_comp, ProteinIdentification &protein, std::vector< PeptideIdentification > &peptides) |
Static Public Member Functions | |
static void | resolve (ProteinIdentification &protein, std::vector< PeptideIdentification > &peptides, bool resolve_ties, bool targets_first) |
Private Types | |
typedef std::map< Size, std::set< Size > > | IndexMap_ |
Private Attributes | |
IndexMap_ | indist_prot_grp_to_pep_ |
mapping indist. protein group indices -> peptide identification indices | |
IndexMap_ | pep_to_indist_prot_grp_ |
mapping indist. protein group indices <- peptide identification indices | |
std::map< String, Size > | prot_acc_to_indist_prot_grp_ |
bool | statistics_ |
log debug information? | |
Resolves shared peptides based on protein scores.
Resolves connected components of the bipartite protein-peptide graph based on protein probabilities/scores and adds them as additional protein_groups to the processed protein identification run. Within each component, shared peptides are greedily assigned uniquely to the proteins of the current best indistinguishable protein group, until every peptide is uniquely assigned. This effectively allows more peptides to be used in ProteinQuantifier, at the cost of potentially additional noise in the peptide quantities. In accordance with most state-of-the-art protein inference tools, only the best hit (PSM) for a peptide ID is considered. Probability ties are currently resolved by taking the first occurring protein of the component.
Todo: Implement probability tie resolution.
PeptideProteinResolution | ( | bool | statistics = false | ) |
Constructor
statistics | Specifies whether the class stores/outputs statistics |
void buildGraph | ( | ProteinIdentification & | protein, |
const std::vector< PeptideIdentification > & | peptides, | ||
bool | skip_sort = false |
||
) |
Initializes and stores the graph (= maps). Correct functionality requires sorted groups, so the indist. protein groups are sorted unless skip_sort is set.
protein | ProteinIdentification object storing IDs and groups |
peptides | vector of PeptideIdentifications with links to the proteins |
skip_sort | Skips sorting of groups; nothing is modified in that case. |
Referenced by TOPPBayesianProteinInference::main_(), and UTILProteomicsLFQ::main_().
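The graph built here can be pictured as two index maps plus an accession lookup. The following is only a minimal, OpenMS-free sketch of that idea: `BipartiteGraph`, `buildGraphSketch`, and the plain string-vector inputs are hypothetical stand-ins, not the actual implementation, and the groups are assumed to be already sorted best first.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using Size = std::size_t;
using IndexMap = std::map<Size, std::set<Size>>;

// Hypothetical stand-in for the three private maps of the class.
struct BipartiteGraph
{
  IndexMap group_to_peps;                    // indist. group index -> peptide indices
  IndexMap pep_to_groups;                    // peptide index -> indist. group indices
  std::map<std::string, Size> acc_to_group;  // protein accession -> indist. group index
};

// groups[i]: accessions of indist. group i (assumed sorted best first).
// pep_accessions[j]: accessions referenced by the best hit of peptide j.
BipartiteGraph buildGraphSketch(const std::vector<std::vector<std::string>>& groups,
                                const std::vector<std::vector<std::string>>& pep_accessions)
{
  BipartiteGraph g;
  // Middle layer: remember which group each accession belongs to.
  for (Size i = 0; i < groups.size(); ++i)
  {
    for (const std::string& acc : groups[i])
    {
      g.acc_to_group[acc] = i;
    }
  }
  // Connect every peptide to the groups of the accessions it references.
  for (Size j = 0; j < pep_accessions.size(); ++j)
  {
    for (const std::string& acc : pep_accessions[j])
    {
      auto it = g.acc_to_group.find(acc);
      if (it == g.acc_to_group.end()) continue;  // accession not in any group
      g.group_to_peps[it->second].insert(j);
      g.pep_to_groups[j].insert(it->second);
    }
  }
  return g;
}
```

Storing both directions makes it cheap to hop from a group to its peptides and back, which is what a traversal of the bipartite graph needs.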
ConnectedComponent findConnectedComponent | ( | Size & | root_prot_grp | ) |
Does a BFS on the two maps (= two parts of the graph; indist. prot. groups and peptides), switching from one to the other in each step.
root_prot_grp | Starts the BFS at this protein group index |
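The alternating BFS can be sketched without OpenMS as follows. `Component` and `findComponentSketch` are hypothetical names used for illustration, and the two maps are passed in explicitly rather than read from class members; this is a sketch of the technique, not the library's code.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <set>
#include <utility>

using Size = std::size_t;
using IndexMap = std::map<Size, std::set<Size>>;

// Hypothetical stand-in for ConnectedComponent.
struct Component
{
  std::set<Size> prot_grp_indices;
  std::set<Size> pep_indices;
};

Component findComponentSketch(Size root_prot_grp,
                              const IndexMap& group_to_peps,
                              const IndexMap& pep_to_groups)
{
  Component comp;
  // Queue entries: (true, group index) or (false, peptide index).
  std::queue<std::pair<bool, Size>> todo;
  todo.push(std::make_pair(true, root_prot_grp));
  comp.prot_grp_indices.insert(root_prot_grp);

  while (!todo.empty())
  {
    const std::pair<bool, Size> cur = todo.front();
    todo.pop();
    const bool is_group = cur.first;
    // A group node follows edges to peptides, a peptide node to groups.
    const IndexMap& edges = is_group ? group_to_peps : pep_to_groups;
    std::set<Size>& seen = is_group ? comp.pep_indices : comp.prot_grp_indices;
    auto it = edges.find(cur.second);
    if (it == edges.end()) continue;
    for (Size neighbor : it->second)
    {
      // insert().second is true only for nodes not visited before.
      if (seen.insert(neighbor).second)
      {
        todo.push(std::make_pair(!is_group, neighbor));
      }
    }
  }
  return comp;
}
```

Because each node is queued only on its first insertion into a visited set, the traversal terminates after touching every group and peptide of the component once.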
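void resolve | ( | ProteinIdentification & | protein, |
std::vector< PeptideIdentification > & | peptides, | ||
bool | resolve_ties, | ||
bool | targets_first | ||
) |
static |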
void resolveConnectedComponent | ( | ConnectedComponent & | conn_comp, |
ProteinIdentification & | protein, | ||
std::vector< PeptideIdentification > & | peptides | ||
) |
Resolves the given connected component based on Fido probabilities and adds the result as additional protein_groups to the output idXML. Shared peptides in this component are greedily assigned uniquely to the proteins of the current best indistinguishable protein group, so that they can then be used in ProteinQuantifier. This is achieved by removing all other evidence from the input PeptideIDs and iterating until each peptide is uniquely assigned. In accordance with Fido, only the best hit (PSM) for an ID is considered. Probability ties are currently resolved by taking the first occurrence.
conn_comp | The component to be resolved |
protein | ProteinIdentification object storing IDs and groups |
peptides | vector of PeptideIdentifications with links to the proteins |
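The greedy assignment can be illustrated with a small, OpenMS-free sketch. It assumes that a lower group index means a better (or, on ties, earlier) group, as after sorting, and it only computes which group keeps each peptide; removing protein hits from the PeptideIDs and appending protein groups is not modeled. `resolveComponentSketch` is a hypothetical name.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <utility>

using Size = std::size_t;
using IndexMap = std::map<Size, std::set<Size>>;

// Returns peptide index -> index of the single group that keeps the peptide.
std::map<Size, Size> resolveComponentSketch(const std::set<Size>& comp_groups,
                                            const IndexMap& group_to_peps)
{
  std::map<Size, Size> owner;
  // std::set iterates in ascending order, i.e. best group first under the
  // assumption above; ties fall to the first occurring group.
  for (Size grp : comp_groups)
  {
    auto it = group_to_peps.find(grp);
    if (it == group_to_peps.end()) continue;
    for (Size pep : it->second)
    {
      // map::insert does nothing if a better group already claimed pep.
      owner.insert(std::make_pair(pep, grp));
    }
  }
  return owner;
}
```

For example, with groups 0 -> {0, 1}, 1 -> {1, 2}, and 2 -> {2}, group 0 keeps peptides 0 and 1, group 1 keeps only peptide 2, and group 2 is left without peptides.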
void resolveGraph | ( | ProteinIdentification & | protein, |
std::vector< PeptideIdentification > & | peptides | ||
) |
Applies resolveConnectedComponent to every component of the graph and writes statistics if requested. Both parameters are mutated by this method.
protein | ProteinIdentification object storing IDs and groups |
peptides | vector of PeptideIdentifications with links to the proteins |
Referenced by TOPPBayesianProteinInference::main_().
IndexMap_ indist_prot_grp_to_pep_ |
private |
mapping indist. protein group indices -> peptide identification indices
IndexMap_ pep_to_indist_prot_grp_ |
private |
mapping indist. protein group indices <- peptide identification indices
std::map< String, Size > prot_acc_to_indist_prot_grp_ |
private |
represents the middle layer of an implicit tripartite graph: consists of single protein accessions and their mapping to the (indist.) group's indices
bool statistics_ |
private |
log debug information?