OpenMS
DBSuitability::SuitabilityData Struct Reference

struct to store results

#include <OpenMS/QC/DBSuitability.h>

Public Member Functions

void clear ()
 
void setCorrectionFactor (double factor)
 
double getCorrectionFactor () const
 
double getCorrectedNovoHits () const
 
double getCorrectedSuitability () const
 
SuitabilityData simulateNoReRanking () const
 Returns a SuitabilityData object containing the data if re-ranking didn't happen.
 

Public Attributes

Size num_top_novo = 0
 number of times the top hit is considered to be a deNovo hit
 
Size num_top_db = 0
 number of times the top hit is considered to be a database hit
 
Size num_interest = 0
 number of times a deNovo hit scored on top of a database hit
 
Size num_re_ranked = 0
 
double cut_off = DBL_MAX
 
double suitability = 0
 
double suitability_no_rerank = 0
 
double suitability_corr_no_rerank = 0
 the suitability after correcting the top deNovo hits, if re-ranking had been disabled
 

Private Attributes

double corr_factor = -1
 
double num_top_novo_corr = 0
 number of top deNovo hits multiplied by the correction factor
 
double suitability_corr = 0
 

Detailed Description

Struct to store the results of a database suitability calculation.

Member Function Documentation

◆ clear()

void clear ( )

◆ getCorrectedNovoHits()

double getCorrectedNovoHits ( ) const

◆ getCorrectedSuitability()

double getCorrectedSuitability ( ) const

◆ getCorrectionFactor()

double getCorrectionFactor ( ) const

◆ setCorrectionFactor()

void setCorrectionFactor ( double  factor)

Applies a correction factor to the already calculated suitability. This only works if num_top_db and num_top_novo both contain a non-zero value.
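
The effect of the correction factor can be illustrated with a minimal, self-contained sketch. It does not use OpenMS itself: the struct CorrectionSketch and the function applyCorrectionFactor are hypothetical names, and the formula for the corrected suitability (database hits divided by the sum of database hits and corrected deNovo hits) is an assumption derived from the member descriptions on this page. Checking the precondition with assert is also a choice of the sketch; this page does not say how the library reacts to zero counts.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the counters documented on this page.
struct CorrectionSketch
{
  std::size_t num_top_novo = 0;  // top hits counted as deNovo hits
  std::size_t num_top_db = 0;    // top hits counted as database hits
  double corr_factor = -1;       // default value before a factor is set
  double num_top_novo_corr = 0;  // num_top_novo multiplied by corr_factor
  double suitability_corr = 0;   // assumed: db / (db + corrected deNovo)
};

// Applies the factor to already counted hits.
// Precondition taken from the text: both counters are non-zero.
inline void applyCorrectionFactor(CorrectionSketch& d, double factor)
{
  assert(d.num_top_db != 0 && d.num_top_novo != 0);
  d.corr_factor = factor;
  d.num_top_novo_corr = static_cast<double>(d.num_top_novo) * factor;
  const double db = static_cast<double>(d.num_top_db);
  d.suitability_corr = db / (db + d.num_top_novo_corr);
}
```

With 90 database hits, 10 deNovo hits and a factor of 2, the sketch yields 20 corrected deNovo hits and a corrected suitability of 90 / 110, roughly 0.82.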

◆ simulateNoReRanking()

SuitabilityData simulateNoReRanking ( ) const

Returns a SuitabilityData object containing the data if re-ranking didn't happen.

Cases that are re-ranked are already counted. To get the 'no re-ranking' data these cases need to be subtracted from the number of top database hits and added to the number of top deNovo hits.

Returns
simulated suitability data where re-ranking didn't happen
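
The bookkeeping described above can be sketched as follows. This is a reduced illustration, not the OpenMS implementation: ReRankSketch and simulateNoReRankingSketch are hypothetical names, only the plain suitability is recomputed, and resetting num_re_ranked to zero in the returned copy is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, reduced version of the struct: counters and plain suitability only.
struct ReRankSketch
{
  std::size_t num_top_novo = 0;
  std::size_t num_top_db = 0;
  std::size_t num_re_ranked = 0;
  double suitability = 0;
};

// Returns a copy in which the re-ranked cases are moved from the
// database counter back to the deNovo counter.
inline ReRankSketch simulateNoReRankingSketch(const ReRankSketch& in)
{
  // Re-ranked cases were counted as database hits, so there cannot be more of them.
  assert(in.num_re_ranked <= in.num_top_db);
  ReRankSketch out = in;
  out.num_top_db -= in.num_re_ranked;
  out.num_top_novo += in.num_re_ranked;
  out.num_re_ranked = 0;  // assumption: the simulated data contains no re-ranked cases
  const double total = static_cast<double>(out.num_top_db + out.num_top_novo);
  // Guarding against an empty data set is a choice of this sketch.
  out.suitability = total > 0 ? static_cast<double>(out.num_top_db) / total : 0.0;
  return out;
}
```

For example, 90 database hits of which 5 were re-ranked, plus 10 deNovo hits, become 85 database hits and 15 deNovo hits, which lowers the suitability from 0.9 to 0.85. The input object is left unchanged, matching the const signature above.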

Member Data Documentation

◆ corr_factor

double corr_factor = -1
private

#IDs with only deNovo search / #IDs with only database search. Used for correcting the number of deNovo hits. Worse databases will have fewer IDs than good databases, so this punishes worse databases more than good ones and results in a worse suitability.
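
As a small illustration of this ratio, the following sketch computes the factor from two ID counts. The function and parameter names are hypothetical, and how the two counts are obtained is outside the scope of this page.

```cpp
#include <cassert>
#include <cstddef>

// Ratio described above: IDs from the deNovo-only search divided by
// IDs from the database-only search. All names are hypothetical.
inline double correctionFactorSketch(std::size_t ids_novo_only, std::size_t ids_db_only)
{
  assert(ids_db_only != 0);  // the ratio is undefined without database IDs
  return static_cast<double>(ids_novo_only) / static_cast<double>(ids_db_only);
}
```

With 400 deNovo-only IDs, a database that yields 800 IDs gives a factor of 0.5, while a worse database that yields only 500 IDs gives 0.8. The larger factor scales up the deNovo hit count more, which is how the worse database ends up with the lower suitability.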

◆ cut_off

double cut_off = DBL_MAX

The cut-off that was used to determine when a score difference was "small enough". This is normalized by molecular weight (mw).

◆ num_interest

Size num_interest = 0

number of times a deNovo hit scored on top of a database hit

◆ num_re_ranked

Size num_re_ranked = 0

number of times a deNovo hit scored on top of a database hit, but the score difference was small enough that it was still counted as a database hit
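
How the counters on this page relate to each other for such a case can be sketched as below. This is an interpretation of the member descriptions, not the library's code: CountSketch and countNovoAboveDb are hypothetical names, the score difference is assumed to be normalized already, and whether the comparison with the cut-off is inclusive is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counters mirroring the public attributes on this page.
struct CountSketch
{
  std::size_t num_top_novo = 0;
  std::size_t num_top_db = 0;
  std::size_t num_interest = 0;
  std::size_t num_re_ranked = 0;
};

// Registers one case in which a deNovo hit scored on top of a database hit.
// score_diff is the (already normalized) score difference between the two
// hits and must not be negative.
inline void countNovoAboveDb(CountSketch& c, double score_diff, double cut_off)
{
  assert(score_diff >= 0);
  ++c.num_interest;
  if (score_diff <= cut_off)  // "small enough"; <= versus < is an assumption
  {
    ++c.num_re_ranked;
    ++c.num_top_db;  // still counted as a database hit
  }
  else
  {
    ++c.num_top_novo;
  }
}
```

Under these assumptions every re-ranked case is also contained in num_top_db and in num_interest.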

◆ num_top_db

Size num_top_db = 0

number of times the top hit is considered to be a database hit

◆ num_top_novo

Size num_top_novo = 0

number of times the top hit is considered to be a deNovo hit

◆ num_top_novo_corr

double num_top_novo_corr = 0
private

number of top deNovo hits multiplied by the correction factor

◆ suitability

double suitability = 0

The suitability of the database used for the identification search, calculated as #db_hits / (#db_hits + #deNovo_hits). It ranges from 0 (the database was not at all suited) to 1 (the perfect database was used).

Preliminary tests have shown that databases of the right organism or closely related organisms score around 0.9 to 0.95, organisms from the same class can still score around 0.8, organisms from the same phylum score around 0.5 to 0.6, and after that it quickly falls to suitabilities of 0.15 or even 0.05. Note that these tests were only performed for one mzML file and your results might differ.
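
The ratio itself is simple enough to state as a one-line function. The sketch below uses a hypothetical function name and treats the case of no hits at all as a precondition violation, which is a choice of the sketch rather than documented behaviour.

```cpp
#include <cassert>
#include <cstddef>

// suitability = #db_hits / (#db_hits + #deNovo_hits)
inline double suitabilitySketch(std::size_t db_hits, std::size_t novo_hits)
{
  assert(db_hits + novo_hits != 0);  // no hits at all: the ratio is undefined
  return static_cast<double>(db_hits) / static_cast<double>(db_hits + novo_hits);
}
```

For instance, 90 database hits and 10 deNovo hits give a suitability of 0.9, which falls into the range quoted above for a database of the right organism.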

◆ suitability_corr

double suitability_corr = 0
private

the suitability after correcting the top deNovo hits to impact worse databases more

The corrected suitability has a more linear behaviour. It basically translates to the fraction of the theoretical perfect database that the used database corresponds to (e.g. a corrected suitability of 0.5 means the used database contains half the proteins of the 'perfect' database).

◆ suitability_corr_no_rerank

double suitability_corr_no_rerank = 0

the suitability after correcting the top deNovo hits, if re-ranking had been disabled

◆ suitability_no_rerank

double suitability_no_rerank = 0

The suitability if re-ranking had been turned off. If re-ranking is actually turned off, this will be the same as the normal suitability.