OpenMS
Method for the assembly of mass traces belonging to the same isotope pattern, i.e., that are compatible in retention times, mass-to-charge ratios, and isotope abundances.
#include <OpenMS/FILTERING/DATAREDUCTION/FeatureFindingMetabo.h>
Public Member Functions
FeatureFindingMetabo ()
Default constructor.
~FeatureFindingMetabo () override
Default destructor.
void | run (std::vector< MassTrace > &input_mtraces, FeatureMap &output_featmap, std::vector< std::vector< OpenMS::MSChromatogram > > &output_chromatograms)
Main method of FeatureFindingMetabo.
Public Member Functions inherited from DefaultParamHandler
DefaultParamHandler (const String &name)
Constructor with name that is displayed in error messages.
DefaultParamHandler (const DefaultParamHandler &rhs)
Copy constructor.
virtual | ~DefaultParamHandler ()
Destructor.
DefaultParamHandler & | operator= (const DefaultParamHandler &rhs)
Assignment operator.
virtual bool | operator== (const DefaultParamHandler &rhs) const
Equality operator.
void | setParameters (const Param &param)
Sets the parameters.
const Param & | getParameters () const
Non-mutable access to the parameters.
const Param & | getDefaults () const
Non-mutable access to the default parameters.
const String & | getName () const
Non-mutable access to the name.
void | setName (const String &name)
Mutable access to the name.
const std::vector< String > & | getSubsections () const
Non-mutable access to the registered subsections.
Public Member Functions inherited from ProgressLogger
ProgressLogger ()
Constructor.
virtual | ~ProgressLogger ()
Destructor.
ProgressLogger (const ProgressLogger &other)
Copy constructor.
ProgressLogger & | operator= (const ProgressLogger &other)
Assignment operator.
void | setLogType (LogType type) const
Sets the progress log that should be used. The default type is NONE!
LogType | getLogType () const
Returns the type of progress log being used.
void | startProgress (SignedSize begin, SignedSize end, const String &label) const
Initializes the progress display.
void | setProgress (SignedSize value) const
Sets the current progress.
void | endProgress () const
Ends the progress display.
void | nextProgress () const
Increments progress by 1 (according to the range begin-end).
Protected Member Functions
void | updateMembers_ () override
This method is used to update extra member variables at the end of the setParameters() method.
Protected Member Functions inherited from DefaultParamHandler
void | defaultsToParam_ ()
Updates the parameters after the defaults have been set in the constructor.
Private Member Functions
std::vector< const Element * > | elementsFromString_ (const std::string &elements_string) const
Parses a string of element symbols into a vector of Elements.
Range | getTheoreticIsotopicMassWindow_ (const std::vector< Element const * > &alphabet, int peakOffset) const
double | computeCosineSim_ (const std::vector< double > &, const std::vector< double > &) const
Computes the cosine similarity between two vectors.
int | isLegalIsotopePattern_ (const FeatureHypothesis &feat_hypo) const
Compare intensities of feature hypothesis with model.
void | loadIsotopeModel_ (const String &)
double | scoreMZ_ (const MassTrace &, const MassTrace &, Size isotopic_position, Size charge, Range isotope_window) const
Perform mass-to-charge scoring of two mass traces.
double | scoreMZByExpectedMean_ (Size iso_pos, Size charge, const double diff_mz, double mt_variances) const
Score isotope m/z distance based on the expected m/z distances using the C13-C12 or Kenar method.
double | scoreMZByExpectedRange_ (Size charge, const double diff_mz, double mt_variances, Range isotope_window) const
Score isotope m/z distance based on an expected isotope window which was calculated from a set of expected elements.
double | scoreRT_ (const MassTrace &, const MassTrace &) const
Perform retention time scoring of two mass traces.
double | computeAveragineSimScore_ (const std::vector< double > &intensities, const double &molecular_weight) const
Perform intensity scoring using the averagine model (for peptides only).
void | findLocalFeatures_ (const std::vector< const MassTrace * > &candidates, double total_intensity, std::vector< FeatureHypothesis > &output_hypotheses) const
Identify groupings of mass traces based on a set of reasonable candidates.
Private Attributes
svm_model * | isotope_filt_svm_ = nullptr
SVM parameters.
std::vector< double > | svm_feat_centers_
std::vector< double > | svm_feat_scales_
double | local_rt_range_
parameter stuff
double | local_mz_range_
Size | charge_lower_bound_
Size | charge_upper_bound_
double | chrom_fwhm_
bool | report_summed_ints_
bool | enable_RT_filtering_
String | isotope_filtering_model_
bool | use_smoothed_intensities_
bool | use_mz_scoring_C13_
bool | use_mz_scoring_by_element_range_
bool | report_convex_hulls_
bool | report_chromatograms_
bool | remove_single_traces_
std::vector< const Element * > | elements_
Additional Inherited Members
Public Types inherited from ProgressLogger
enum | LogType { CMD , GUI , NONE }
Possible log types.
Static Public Member Functions inherited from DefaultParamHandler
static void | writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="")
Writes all parameters to meta values.
Static Protected Member Functions inherited from ProgressLogger
static String | logTypeToFactoryName_ (LogType type)
Return the name of the factory product used for this log type.
Protected Attributes inherited from DefaultParamHandler
Param | param_
Container for current parameters.
Param | defaults_
Container for default parameters. This member should be filled in the constructor of derived classes!
std::vector< String > | subsections_
Container for registered subsections. This member should be filled in the constructor of derived classes!
String | error_name_
Name that is displayed in error messages during the parameter checking.
bool | check_defaults_
If this member is set to false, no checking of parameters is done.
bool | warn_empty_defaults_
If this member is set to false, no warning is emitted when defaults are empty.
Protected Attributes inherited from ProgressLogger
LogType | type_
time_t | last_invoke_
ProgressLoggerImpl * | current_logger_
Static Protected Attributes inherited from ProgressLogger
static int | recursion_depth_
In FeatureFindingMetabo, mass traces detected by the MassTraceDetection method and afterwards split into individual chromatographic peaks by the ElutionPeakDetection method are assembled into composite features if they are compatible with respect to RTs, m/z ratios, and isotopic intensities. To this end, feature hypotheses are formulated exhaustively based on the set of mass traces detected within a local RT and m/z region. These feature hypotheses are scored by their similarity to real metabolite isotope patterns. The score is derived from independent models for retention time shifts and m/z differences between isotopic mass traces. Hypotheses with correct and incorrect isotopic abundances are distinguished by an SVM model. Mass traces that could not be assembled, and low-intensity metabolites for which only a monoisotopic mass trace is observed, are left in the resulting FeatureMap as singletons with the undefined charge state of 0.
Reference: Kenar et al., doi: 10.1074/mcp.M113.031278
Parameters of this class are:

Name | Type | Default | Restrictions | Description |
---|---|---|---|---|
local_rt_range | float | 10.0 | | RT range in which to look for coeluting mass traces |
local_mz_range | float | 6.5 | | MZ range in which to look for isotopic mass traces |
charge_lower_bound | int | 1 | | Lowest charge state to consider |
charge_upper_bound | int | 3 | | Highest charge state to consider |
chrom_fwhm | float | 5.0 | | Expected chromatographic peak width (in seconds). |
report_summed_ints | string | false | false, true | Set to true for a feature intensity summed up over all traces rather than using monoisotopic trace intensity alone. |
enable_RT_filtering | string | true | false, true | Require sufficient overlap in RT while assembling mass traces. Disable for direct injection data. |
isotope_filtering_model | string | metabolites (5% RMS) | metabolites (2% RMS), metabolites (5% RMS), peptides, none | Remove/score candidate assemblies based on isotope intensities. SVM isotope models for metabolites were trained with either 2% or 5% RMS error. For peptides, an averagine cosine scoring is used. Select the appropriate noise model according to the quality of measurement or MS device. |
mz_scoring_13C | string | false | false, true | Use the 13C isotope peak position (~1.003355 Da) as the expected shift in m/z for isotope mass traces (highly recommended for lipidomics!). Disable for general metabolites (as described in Kenar et al. 2014, MCP). |
use_smoothed_intensities | string | true | false, true | Use LOWESS intensities instead of raw intensities. |
report_convex_hulls | string | false | false, true | Augment each reported feature with the convex hull of the underlying mass traces (increases featureXML file size considerably). |
report_chromatograms | string | false | false, true | Adds a chromatogram for each reported feature (output in mzML). |
remove_single_traces | string | false | false, true | Remove unassembled traces (single traces). |
mz_scoring_by_elements | string | false | false, true | Use the m/z range of the assumed elements to detect isotope peaks. An expected m/z range is computed from the isotopes of the assumed elements. If enabled, this ignores 'mz_scoring_13C'. |
elements | string | CHNOPS | | Elements assumed to be present in the sample (this influences isotope detection). |
FeatureFindingMetabo ()
Default constructor.
~FeatureFindingMetabo ()
override
Default destructor.
double computeAveragineSimScore_ (const std::vector< double > &intensities, const double &molecular_weight) const
private
Perform intensity scoring using the averagine model (for peptides only).
Compare the isotopic intensity distribution with the theoretical one expected for peptides, using the averagine model. Compute the cosine similarity between the two distributions.
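double computeCosineSim_ (const std::vector< double > &, const std::vector< double > &) const
private
Computes the cosine similarity between two vectors.
The cosine similarity is the cosine of the angle between two vectors, i.e., the dot product of the two vectors divided by the product of their norms.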
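The definition above can be sketched in a few lines of standard C++. This is an illustration only, not the OpenMS implementation: the free function name `cosineSimilarity`, the exception on mismatched lengths, and the choice to return 0.0 for a zero-norm vector are all assumptions of the sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Cosine similarity: dot(x, y) / (|x| * |y|).
// Hypothetical helper; returns 0.0 if either vector has zero norm
// (a convention chosen for this sketch only).
double cosineSimilarity(const std::vector<double>& x, const std::vector<double>& y)
{
  if (x.size() != y.size())
  {
    throw std::invalid_argument("vectors must have the same length");
  }
  double dot = 0.0;
  double norm_x = 0.0;
  double norm_y = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i)
  {
    dot += x[i] * y[i];
    norm_x += x[i] * x[i];
    norm_y += y[i] * y[i];
  }
  const double denom = std::sqrt(norm_x) * std::sqrt(norm_y);
  return denom > 0.0 ? dot / denom : 0.0;
}
```

Two intensity profiles that differ only by a constant factor score 1, which is why the measure suits comparing peak shapes or isotope patterns independently of absolute abundance.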
std::vector< const Element * > elementsFromString_ (const std::string &elements_string) const
private
Parses a string of element symbols into a vector of Elements.
elements_string | string of element symbols without whitespace or commas, e.g. CHNOPSCl |
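As a rough illustration of how a string such as CHNOPSCl can be split into symbols, the sketch below treats each uppercase letter as the start of a new symbol and appends any following lowercase letters to it. It returns plain strings rather than Element pointers; the function name `splitElementSymbols` and the decision to silently ignore other characters are assumptions, and the actual OpenMS parsing may differ.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split a concatenated string of element symbols (e.g. "CHNOPSCl") into symbols.
// A symbol is one uppercase letter followed by zero or more lowercase letters.
// Hypothetical helper for illustration only.
std::vector<std::string> splitElementSymbols(const std::string& elements_string)
{
  std::vector<std::string> symbols;
  for (char c : elements_string)
  {
    const unsigned char uc = static_cast<unsigned char>(c);
    if (std::isupper(uc))
    {
      symbols.push_back(std::string(1, c)); // start a new symbol
    }
    else if (std::islower(uc) && !symbols.empty())
    {
      symbols.back().push_back(c); // extend the current symbol
    }
    // Any other character (digit, whitespace, leading lowercase letter)
    // is ignored in this sketch.
  }
  return symbols;
}
```

For the example string from the parameter description, this yields the seven symbols C, H, N, O, P, S, and Cl.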
void findLocalFeatures_ (const std::vector< const MassTrace * > &candidates, double total_intensity, std::vector< FeatureHypothesis > &output_hypotheses) const
private
Identify groupings of mass traces based on a set of reasonable candidates.
Takes a set of reasonable candidates for mass trace grouping and checks all combinations of charge and isotopic positions on the candidates. It is assumed that candidates[0] is the monoisotopic trace.
The resulting possible groupings are appended to output_hypotheses.
Range getTheoreticIsotopicMassWindow_ (const std::vector< Element const * > &alphabet, int peakOffset) const
private
Calculate the maximal and minimal mass defects of isotopes for a given set of elements.
alphabet | chemical alphabet (elements which are expected to be present) |
peakOffset | integer distance between isotope peak and monoisotopic peak (minimum: 1) |
int isLegalIsotopePattern_ (const FeatureHypothesis &feat_hypo) const
private
Compare intensities of feature hypothesis with model.
Use a pre-trained SVM model to evaluate the intensity distribution of a given feature hypothesis. The model is trained on the monoisotopic and the first three isotopic traces of each feature and uses the scaled ratios between the traces as input.
Reference: Kenar et al., doi: 10.1074/mcp.M113.031278
feat_hypo | A feature hypothesis containing mass traces |
void loadIsotopeModel_ (const String &)
private
void run (std::vector< MassTrace > &input_mtraces, FeatureMap &output_featmap, std::vector< std::vector< OpenMS::MSChromatogram > > &output_chromatograms)
Main method of FeatureFindingMetabo.
double scoreMZ_ (const MassTrace &, const MassTrace &, Size isotopic_position, Size charge, Range isotope_window) const
private
Perform mass-to-charge scoring of two mass traces.
Scores two mass traces based on their m/z values and the hypothesis that one trace is an isotopic trace of the other. The isotopic position (which trace it is) and the charge for the hypothesis are given as additional parameters. The scoring is described in Kenar et al. and is based on a random sample of 115 000 compounds drawn from a comprehensive set of 24 million putative sum formulas, for which the isotopic distributions were accurately calculated. From this sample, a theoretical mu and sigma are calculated as:
mu = 1.000857 * j + 0.001091 u
sigma = 0.0016633 * j - 0.0004751
where j is the isotopic peak considered. A similarity score based on agreement with the model is then computed.
Reference: Kenar et al., doi: 10.1074/mcp.M113.031278
An alternative scoring was added which tests whether isotope m/z distances lie within an expected m/z window. This window is computed from a given set of elements.
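To make the mu/sigma model more concrete, the following sketch computes the expected spacing for isotopic peak j and turns an observed m/z difference into a score. Several points are assumptions rather than documented behavior: the sigma expression is taken as 0.0016633 * j - 0.0004751, both values are divided by the charge to convert mass spacing into m/z spacing, and the score is given a Gaussian shape. The names `IsotopeSpacing`, `expectedSpacing`, and `gaussianMzScore` are hypothetical.

```cpp
#include <cmath>

// Expected m/z spacing (mu) and its spread (sigma) for isotopic peak j.
struct IsotopeSpacing
{
  double mu;
  double sigma;
};

// Uses the constants quoted in the text. Dividing by the charge is an
// assumption of this sketch; charge must be >= 1.
IsotopeSpacing expectedSpacing(unsigned j, unsigned charge)
{
  const double z = static_cast<double>(charge);
  return {(1.000857 * j + 0.001091) / z, (0.0016633 * j - 0.0004751) / z};
}

// Hypothetical Gaussian-shaped score in (0, 1]; 1.0 means diff_mz equals mu.
// trace_variance is the combined m/z variance of the two compared traces.
double gaussianMzScore(double diff_mz, unsigned j, unsigned charge, double trace_variance)
{
  const IsotopeSpacing s = expectedSpacing(j, charge);
  const double total_sigma = std::sqrt(s.sigma * s.sigma + trace_variance);
  const double d = (diff_mz - s.mu) / total_sigma;
  return std::exp(-0.5 * d * d);
}
```

Under these assumptions, the first isotopic peak of a singly charged compound is expected about 1.001948 m/z units above the monoisotopic trace, and about 0.500974 units above it for charge 2.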
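double scoreMZByExpectedMean_ (Size iso_pos, Size charge, const double diff_mz, double mt_variances) const
private
Score isotope m/z distance based on the expected m/z distances using the C13-C12 or Kenar method.
iso_pos | |
charge | |
diff_mz | |
mt_variances |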
double scoreMZByExpectedRange_ (Size charge, const double diff_mz, double mt_variances, Range isotope_window) const
private
Score isotope m/z distance based on an expected isotope window which was calculated from a set of expected elements.
charge | |
diff_mz | |
mt_variances | m/z variance between the two mass traces which are compared |
isotope_window |
double scoreRT_ (const MassTrace &, const MassTrace &) const
private
Perform retention time scoring of two mass traces.
Computes the similarity of the two peak shapes using cosine similarity (see computeCosineSim_) if some conditions are fulfilled. Mainly, the overlap between the two peaks at FWHM needs to exceed a certain threshold. The threshold is set at 0.7 (i.e. 70 % overlap), as also described in Kenar et al.
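The overlap condition can be illustrated with a small interval computation. The sketch below is not taken from OpenMS: the `RtInterval` type, the function names, and in particular the choice to normalize the overlap by the shorter of the two FWHM intervals are assumptions, since the text does not say how the overlap fraction is defined.

```cpp
#include <algorithm>

// RT interval covered by a peak at half maximum (FWHM), in seconds.
struct RtInterval
{
  double start;
  double end;
};

// Fraction of the shorter interval that is covered by the intersection.
// Normalizing by the shorter interval is an assumption of this sketch.
double fwhmOverlapFraction(const RtInterval& a, const RtInterval& b)
{
  const double overlap = std::min(a.end, b.end) - std::max(a.start, b.start);
  const double shorter = std::min(a.end - a.start, b.end - b.start);
  if (overlap <= 0.0 || shorter <= 0.0)
  {
    return 0.0; // disjoint or degenerate intervals
  }
  return overlap / shorter;
}

// True if the overlap exceeds the threshold (0.7 as quoted in the text).
bool passesRtOverlap(const RtInterval& a, const RtInterval& b, double threshold = 0.7)
{
  return fwhmOverlapFraction(a, b) > threshold;
}
```

With this definition, two 10-second-wide peaks shifted by 2 seconds overlap by 80 % and would go on to the cosine comparison, while a shift of 6 seconds (40 % overlap) would not.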
void updateMembers_ ()
override protected virtual
This method is used to update extra member variables at the end of the setParameters() method.
Also call it at the end of the derived classes' copy constructor and assignment operator.
The default implementation is empty.
Reimplemented from DefaultParamHandler.