OpenMS
FeatureGroupingAlgorithmKD Class Reference

A feature grouping algorithm for unlabeled data.

#include <OpenMS/ANALYSIS/MAPMATCHING/FeatureGroupingAlgorithmKD.h>


Public Member Functions

 FeatureGroupingAlgorithmKD ()
 Default constructor.

 ~FeatureGroupingAlgorithmKD () override
 Destructor.

void group (const std::vector< FeatureMap > &maps, ConsensusMap &out) override
 Applies the algorithm to feature maps.

void group (const std::vector< ConsensusMap > &maps, ConsensusMap &out) override
 Applies the algorithm to consensus maps.
 
- Public Member Functions inherited from FeatureGroupingAlgorithm
 FeatureGroupingAlgorithm ()
 Default constructor.

 ~FeatureGroupingAlgorithm () override
 Destructor.

void transferSubelements (const std::vector< ConsensusMap > &maps, ConsensusMap &out) const
 Transfers subelements (grouped features) from input consensus maps to the result consensus map.
 
- Public Member Functions inherited from DefaultParamHandler
 DefaultParamHandler (const String &name)
 Constructor with name that is displayed in error messages.

 DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor.

virtual ~DefaultParamHandler ()
 Destructor.

DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator.

virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator.

void setParameters (const Param &param)
 Sets the parameters.

const Param & getParameters () const
 Non-mutable access to the parameters.

const Param & getDefaults () const
 Non-mutable access to the default parameters.

const String & getName () const
 Non-mutable access to the name.

void setName (const String &name)
 Mutable access to the name.

const std::vector< String > & getSubsections () const
 Non-mutable access to the registered subsections.
 
- Public Member Functions inherited from ProgressLogger
 ProgressLogger ()
 Constructor.

virtual ~ProgressLogger ()
 Destructor.

 ProgressLogger (const ProgressLogger &other)
 Copy constructor.

ProgressLogger & operator= (const ProgressLogger &other)
 Assignment operator.

void setLogType (LogType type) const
 Sets the progress log that should be used. The default type is NONE!

LogType getLogType () const
 Returns the type of progress log being used.

void setLogger (ProgressLoggerImpl *logger)
 Sets the logger to be used for progress logging.

void startProgress (SignedSize begin, SignedSize end, const String &label) const
 Initializes the progress display.

void setProgress (SignedSize value) const
 Sets the current progress.

void endProgress (UInt64 bytes_processed=0) const

void nextProgress () const
 Increments progress by 1 (according to range begin-end).
 

Private Member Functions

 FeatureGroupingAlgorithmKD (const FeatureGroupingAlgorithmKD &)
 Copy constructor intentionally not implemented -> private.

FeatureGroupingAlgorithmKD & operator= (const FeatureGroupingAlgorithmKD &)
 Assignment operator intentionally not implemented -> private.

template<typename MapType >
void group_ (const std::vector< MapType > &input_maps, ConsensusMap &out)
 Applies the algorithm to feature or consensus maps.

void runClustering_ (const KDTreeFeatureMaps &kd_data, ConsensusMap &out)
 Run the actual clustering algorithm.

void updateClusterProxies_ (std::set< ClusterProxyKD > &potential_clusters, std::vector< ClusterProxyKD > &cluster_for_idx, const std::set< Size > &update_these, const std::vector< Int > &assigned, const KDTreeFeatureMaps &kd_data)
 Update maximum possible sizes of potential consensus features for indices specified in update_these.

ClusterProxyKD computeBestClusterForCenter_ (Size i, std::vector< Size > &cf_indices, const std::vector< Int > &assigned, const KDTreeFeatureMaps &kd_data) const
 Compute the current best cluster with center index i (mutates proxy and cf_indices).

void addConsensusFeature_ (const std::vector< Size > &indices, const KDTreeFeatureMaps &kd_data, ConsensusMap &out) const
 Construct consensus feature and add to out map.
 

Private Attributes

SignedSize progress_
 Current progress for logging.

double rt_tol_secs_
 RT tolerance.

double mz_tol_
 m/z tolerance

bool mz_ppm_
 m/z unit ppm?

FeatureDistance feature_distance_
 Feature distance functor.
 

Additional Inherited Members

- Public Types inherited from ProgressLogger
enum  LogType { CMD , GUI , NONE }
 Possible log types.

- Static Public Member Functions inherited from DefaultParamHandler
static void writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="")
 Writes all parameters to meta values.

- Protected Member Functions inherited from FeatureGroupingAlgorithm
template<class MapType >
void postprocess_ (const std::vector< MapType > &maps, ConsensusMap &out)

- Protected Member Functions inherited from DefaultParamHandler
virtual void updateMembers_ ()
 This method is used to update extra member variables at the end of the setParameters() method.

void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor.

- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters.

Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes!

std::vector< String > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes!

String error_name_
 Name that is displayed in error messages during the parameter checking.

bool check_defaults_
 If this member is set to false, no parameter checking is done.

bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when defaults are empty.

- Protected Attributes inherited from ProgressLogger
LogType type_

time_t last_invoke_

ProgressLoggerImpl * current_logger_
 
- Static Protected Attributes inherited from ProgressLogger
static int recursion_depth_
 

Detailed Description

A feature grouping algorithm for unlabeled data.

The algorithm takes a number of feature or consensus maps and searches for corresponding (consensus) features across different maps.

Parameters of this class are:

Name | Type | Default | Restrictions | Description
mz_unit | string | ppm | ppm, Da | Unit of m/z tolerance
nr_partitions | int | 100 | min: 1 | Number of partitions in m/z space
warp:enabled | string | true | true, false | Whether or not to internally warp feature RTs using LOWESS transformation before linking (reported RTs in results will always be the original RTs)
warp:rt_tol | float | 100.0 | min: 0.0 | Width of RT tolerance window (sec)
warp:mz_tol | float | 5.0 | min: 0.0 | m/z tolerance (in ppm or Da)
warp:max_pairwise_log_fc | float | 0.5 |  | Maximum absolute log10 fold change between two compatible signals during compatibility graph construction. Two signals from different maps will not be connected by an edge in the compatibility graph if absolute log fold change exceeds this limit (they might still end up in the same connected component, however). Note: this does not limit fold changes in the linking stage, only during RT alignment, where we try to find high-quality alignment anchor points. Setting this to a value < 0 disables the FC check.
warp:min_rel_cc_size | float | 0.5 | min: 0.0 max: 1.0 | Only connected components containing compatible features from at least max(2, (warp_min_occur * number_of_input_maps)) input maps are considered for computing the warping function
warp:max_nr_conflicts | int | 0 | min: -1 | Allow up to this many conflicts (features from the same map) per connected component to be used for alignment (-1 means allow any number of conflicts)
link:rt_tol | float | 30.0 | min: 0.0 | Width of RT tolerance window (sec)
link:mz_tol | float | 10.0 | min: 0.0 | m/z tolerance (in ppm or Da)
link:charge_merging | string | With_charge_zero | Identical, With_charge_zero, Any | Whether to disallow charge mismatches (Identical), allow to link charge zero (i.e., unknown charge state) with every charge state (With_charge_zero), or disregard charges (Any).
link:adduct_merging | string | Any | Identical, With_unknown_adducts, Any | Whether to only allow the same adduct for linking (Identical), also allow linking features with adduct-free ones (With_unknown_adducts), or disregard adducts (Any).
distance_RT:exponent | float | 1.0 | min: 0.0 | Normalized RT differences ([0-1], relative to 'max_difference') are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow)
distance_RT:weight | float | 1.0 | min: 0.0 | Final RT distances are weighted by this factor
distance_MZ:exponent | float | 2.0 | min: 0.0 | Normalized ([0-1], relative to 'max_difference') m/z differences are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow)
distance_MZ:weight | float | 1.0 | min: 0.0 | Final m/z distances are weighted by this factor
distance_intensity:exponent | float | 1.0 | min: 0.0 | Differences in relative intensity ([0-1]) are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow)
distance_intensity:weight | float | 1.0 | min: 0.0 | Final intensity distances are weighted by this factor
distance_intensity:log_transform | string | enabled | enabled, disabled | Log-transform intensities? If disabled, d = abs(int_f2 - int_f1) / int_max. If enabled, d = abs(log(int_f2 + 1) - log(int_f1 + 1)) / log(int_max + 1)
LOWESS:span | float | 0.666666666666667 | min: 0.0 max: 1.0 | Fraction of datapoints (f) to use for each local regression (determines the amount of smoothing). Choosing this parameter in the range 0.2 to 0.8 usually results in a good fit.
LOWESS:num_iterations | int | 3 | min: 0 | Number of robustifying iterations for lowess fitting.
LOWESS:delta | float | -1.0 |  | Nonnegative parameter which may be used to save computations (recommended value is 0.01 of the range of the input, e.g. for data ranging from 1000 seconds to 2000 seconds, it could be set to 10). Setting a negative value will automatically do this.
LOWESS:interpolation_type | string | cspline | linear, cspline, akima | Method to use for interpolation between datapoints computed by lowess. 'linear': Linear interpolation. 'cspline': Use the cubic spline for interpolation. 'akima': Use an akima spline for interpolation
LOWESS:extrapolation_type | string | four-point-linear | two-point-linear, four-point-linear, global-linear | Method to use for extrapolation outside the data range. 'two-point-linear': Uses a line through the first and last point to extrapolate. 'four-point-linear': Uses a line through the first and second point to extrapolate in front and a line through the last and second-to-last point in the end. 'global-linear': Uses a linear regression to fit a line through all data points and use it for extrapolation.

Note:
  • If a section name is documented, the documentation is displayed as tooltip.
  • Advanced parameter names are italic.
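
The two intensity-distance formulas given for distance_intensity:log_transform can be written out as a small standalone function. This is a minimal sketch in plain C++, not the OpenMS implementation: the function name intensityDistance and its signature are hypothetical, and it assumes non-negative intensities no larger than int_max, int_max > 0, and the natural logarithm (the description does not state the base, but the ratio does not depend on it).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of OpenMS): relative intensity distance in [0, 1],
// following the two formulas from the distance_intensity:log_transform description.
// Assumes 0 <= int_f1, int_f2 <= int_max and int_max > 0.
double intensityDistance(double int_f1, double int_f2, double int_max, bool log_transform)
{
  assert(int_max > 0.0);
  if (log_transform)
  {
    // d = |log(int_f2 + 1) - log(int_f1 + 1)| / log(int_max + 1)
    return std::fabs(std::log(int_f2 + 1.0) - std::log(int_f1 + 1.0)) / std::log(int_max + 1.0);
  }
  // d = |int_f2 - int_f1| / int_max
  return std::fabs(int_f2 - int_f1) / int_max;
}
```

For example, intensityDistance(50.0, 100.0, 100.0, false) evaluates to 0.5, whereas with log_transform enabled the same pair gives a smaller distance (about 0.15), because the log scale compresses differences between large intensities.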

Constructor & Destructor Documentation

◆ FeatureGroupingAlgorithmKD() [1/2]

Default constructor.

◆ ~FeatureGroupingAlgorithmKD()

Destructor.

◆ FeatureGroupingAlgorithmKD() [2/2]

Copy constructor intentionally not implemented -> private.

Member Function Documentation

◆ addConsensusFeature_()

void addConsensusFeature_ ( const std::vector< Size > &  indices,
const KDTreeFeatureMaps &  kd_data,
ConsensusMap &  out 
) const
private

Construct consensus feature and add to out map.

◆ computeBestClusterForCenter_()

ClusterProxyKD computeBestClusterForCenter_ ( Size  i,
std::vector< Size > &  cf_indices,
const std::vector< Int > &  assigned,
const KDTreeFeatureMaps &  kd_data 
) const
private

Compute the current best cluster with center index i (mutates proxy and cf_indices).

◆ group() [1/2]

void group ( const std::vector< ConsensusMap > &  maps,
ConsensusMap &  out 
)
override virtual

Applies the algorithm to consensus maps.

Exceptions
IllegalArgument is thrown if fewer than two input maps are given.

Reimplemented from FeatureGroupingAlgorithm.

◆ group() [2/2]

void group ( const std::vector< FeatureMap > &  maps,
ConsensusMap &  out 
)
override virtual

Applies the algorithm to feature maps.

Exceptions
IllegalArgument is thrown if fewer than two input maps are given.

Implements FeatureGroupingAlgorithm.

◆ group_()

void group_ ( const std::vector< MapType > &  input_maps,
ConsensusMap &  out 
)
private

Applies the algorithm to feature or consensus maps.

Exceptions
IllegalArgument is thrown if fewer than two input maps are given.

◆ operator=()

FeatureGroupingAlgorithmKD& operator= ( const FeatureGroupingAlgorithmKD &  )
private

Assignment operator intentionally not implemented -> private.

◆ runClustering_()

void runClustering_ ( const KDTreeFeatureMaps &  kd_data,
ConsensusMap &  out 
)
private

Run the actual clustering algorithm.

◆ updateClusterProxies_()

void updateClusterProxies_ ( std::set< ClusterProxyKD > &  potential_clusters,
std::vector< ClusterProxyKD > &  cluster_for_idx,
const std::set< Size > &  update_these,
const std::vector< Int > &  assigned,
const KDTreeFeatureMaps &  kd_data 
)
private

Update maximum possible sizes of potential consensus features for indices specified in update_these.

Member Data Documentation

◆ feature_distance_

FeatureDistance feature_distance_
private

Feature distance functor.

◆ mz_ppm_

bool mz_ppm_
private

m/z unit ppm?

◆ mz_tol_

double mz_tol_
private

m/z tolerance
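
How mz_tol_ is interpreted depends on mz_ppm_, which reflects the mz_unit parameter (ppm or Da). As a rough illustration, the sketch below converts a tolerance into an absolute m/z window. It assumes the tolerance is applied symmetrically around a feature's m/z, which this page does not specify; the names MzWindow and mzWindow are hypothetical and not part of OpenMS.

```cpp
#include <cassert>

// Hypothetical type for illustration only (not OpenMS API).
struct MzWindow
{
  double low;
  double high;
};

// Returns the absolute m/z window around `mz` for tolerance `tol`.
// If `ppm` is true, `tol` is in parts per million of `mz`; otherwise it is in Da.
// Assumption: the tolerance is applied symmetrically (+/- tol).
MzWindow mzWindow(double mz, double tol, bool ppm)
{
  assert(mz > 0.0 && tol >= 0.0);
  const double delta = ppm ? mz * tol * 1e-6 : tol;
  return {mz - delta, mz + delta};
}
```

Under this assumption, the default link:mz_tol of 10 ppm gives a feature at m/z 500 a window of roughly 499.995 to 500.005, while the same numeric tolerance in Da would span 490 to 510.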

◆ progress_

SignedSize progress_
private

Current progress for logging.

◆ rt_tol_secs_

double rt_tol_secs_
private

RT tolerance.