OpenMS
SvmTheoreticalSpectrumGeneratorTrainer Class Reference

Train SVM models that are used by SvmTheoreticalSpectrumGenerator.

#include <OpenMS/CHEMISTRY/SvmTheoreticalSpectrumGeneratorTrainer.h>


Private Types

typedef SvmTheoreticalSpectrumGenerator::IonType IonType
 
typedef SvmTheoreticalSpectrumGenerator::DescriptorSet DescriptorSet
 
typedef std::map< std::pair< IonType, Size >, std::vector< double > > ObservedIntensMap
 

Private Member Functions

void countIntensities_ (const PeakSpectrum &spectrum, const AASequence &annotation, const IonType &type, std::map< std::pair< IonType, Size >, std::vector< double > > &observed_intensities, double tolerance, Size number_of_regions)
 Stores the observed intensities for each sector-type combination in a vector.

void trainSecondaryTypes_ (TextFile &info_outfile, Size number_of_regions, Size number_of_intensity_levels, ObservedIntensMap &observed_intensities, const std::vector< IonType > &ion_types, const std::vector< bool > &is_primary)
 Trains the Bayesian models for the secondary peak types.
 

Constructors and Destructors

 SvmTheoreticalSpectrumGeneratorTrainer ()
 Default constructor.

 SvmTheoreticalSpectrumGeneratorTrainer (const SvmTheoreticalSpectrumGeneratorTrainer &source)
 Copy constructor.

 ~SvmTheoreticalSpectrumGeneratorTrainer () override
 Destructor.

SvmTheoreticalSpectrumGeneratorTrainer & operator= (const SvmTheoreticalSpectrumGeneratorTrainer &tsg)
 Assignment operator.

void trainModel (const PeakMap &spectra, const std::vector< AASequence > &annotations, const String &filename, Int precursor_charge)
 Trains an SVM for each ion_type and stores the models in files named <filename>_residue_loss_charge.svm

void normalizeIntensity (PeakSpectrum &S) const
 Normalizes the intensity of the peaks in the input data.

void writeTrainingFile_ (std::vector< DescriptorSet > &training_input, std::vector< double > &training_output, const String &filename)
 Writes a training file that can be passed to libsvm command line tools.
 

Additional Inherited Members

- Public Member Functions inherited from DefaultParamHandler
 DefaultParamHandler (const String &name)
 Constructor with name that is displayed in error messages.

 DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor.

virtual ~DefaultParamHandler ()
 Destructor.

DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator.

virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator.

void setParameters (const Param &param)
 Sets the parameters.

const Param & getParameters () const
 Non-mutable access to the parameters.

const Param & getDefaults () const
 Non-mutable access to the default parameters.

const String & getName () const
 Non-mutable access to the name.

void setName (const String &name)
 Mutable access to the name.

const std::vector< String > & getSubsections () const
 Non-mutable access to the registered subsections.

- Static Public Member Functions inherited from DefaultParamHandler
static void writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="")
 Writes all parameters to meta values.

- Protected Member Functions inherited from DefaultParamHandler
virtual void updateMembers_ ()
 This method is used to update extra member variables at the end of the setParameters() method.

void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor.

- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters.

Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes!

std::vector< String > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes!

String error_name_
 Name that is displayed in error messages during the parameter checking.

bool check_defaults_
 If this member is set to false, no parameter checking is done.

bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when defaults are empty.
 

Detailed Description

Train SVM models that are used by SvmTheoreticalSpectrumGenerator.

Parameters of this class are:

| Name | Type | Default | Restrictions | Description |
|---|---|---|---|---|
| write_training_files | string | false | true, false | If set to true, no models are trained but files (__training.dat) are produced for the selected primary ion types. They can be used as input for LibSVM command line tools |
| number_intensity_levels | int | 7 | | The number of intensity bins (for secondary type models) |
| number_regions | int | 3 | | The number of regions each spectrum is split into (for secondary type models) |
| parent_tolerance | float | 2.5 | | The maximum difference between theoretical and experimental parent mass to accept a training spectrum |
| peak_tolerance | float | 0.5 | | The maximum mass error for a peak to the expected mass of some ion type |
| add_b_ions | string | true | true, false | Train simulator for b-ions |
| add_y_ions | string | true | true, false | Train simulator for y-ions |
| add_a_ions | string | false | true, false | Train simulator for a-ions |
| add_c_ions | string | false | true, false | Train simulator for c-ions |
| add_x_ions | string | false | true, false | Train simulator for x-ions |
| add_z_ions | string | false | true, false | Train simulator for z-ions |
| add_losses | string | false | true, false | Train simulator for neutral losses of H2O and NH3 for b-ions and y-ions |
| add_b2_ions | string | false | true, false | Train simulator for doubly charged b-ions |
| add_y2_ions | string | false | true, false | Train simulator for doubly charged y-ions |
| svm:svc_type | int | 0 | min: 0, max: 1 | Type of the SVC: 0=C_SVC 1=NU_SVC |
| svm:svr_type | int | 1 | min: 0, max: 1 | Type of the SVR: 0=EPSILON_SVR 1=NU_SVR |
| svm:scaling | string | true | true, false | Apply scaling of feature values |
| svm:scaling_lower | float | 0.0 | | Lower bound for scaling |
| svm:scaling_upper | float | 1.0 | | Upper bound for scaling |
| svm:n_fold | int | 5 | min: 1 | n_fold cross validation is performed |
| svm:grid | string | false | true, false | Perform grid search |
| svm:additive_cv | string | false | true, false | Additive step size (if false, multiplicative) |
| svm:svc:kernel_type | int | 2 | min: 0, max: 3 | Type of the kernel: 0=LINEAR 1=POLY 2=RBF 3=SIGMOID |
| svm:svc:degree | int | 3 | min: 1 | For POLY |
| svm:svc:gamma | float | 0.0 | min: 0.0 | For POLY/RBF/SIGMOID |
| svm:svc:C | float | 1.0 | | Cost of constraint violation |
| svm:svc:nu | float | 0.5 | | For NU_SVC, ONE_CLASS and NU_SVR |
| svm:svc:balancing | string | true | true, false | Use class balanced SVC training |
| svm:svc:degree_start | int | 1 | min: 1 | Starting point of degree |
| svm:svc:degree_step_size | int | 2 | | Step size of degree |
| svm:svc:degree_stop | int | 4 | | Stopping point of degree |
| svm:svc:gamma_start | float | 1.0e-05 | min: 0.0, max: 1.0 | Starting point of gamma |
| svm:svc:gamma_step_size | int | 100 | | Step size of gamma |
| svm:svc:gamma_stop | float | 0.1 | | Stopping point of gamma |
| svm:svc:c_start | float | 0.1 | | Starting point of c |
| svm:svc:c_step_size | int | 100 | | Step size of c |
| svm:svc:c_stop | int | 1000 | | Stopping point of c |
| svm:svc:nu_start | float | 0.3 | min: 0.0, max: 1.0 | Starting point of nu |
| svm:svc:nu_step_size | int | 2 | | Step size of nu |
| svm:svc:nu_stop | float | 0.6 | min: 0.0, max: 1.0 | Stopping point of nu |
| svm:svr:kernel_type | int | 2 | min: 0, max: 3 | Type of the kernel: 0=LINEAR 1=POLY 2=RBF 3=SIGMOID |
| svm:svr:degree | int | 3 | min: 1 | For POLY |
| svm:svr:gamma | float | 0.0 | min: 0.0 | For POLY/RBF/SIGMOID |
| svm:svr:C | float | 1.0 | | Cost of constraint violation |
| svm:svr:p | float | 0.1 | | The epsilon for the loss function in epsilon-SVR |
| svm:svr:nu | float | 0.5 | | For NU_SVC, ONE_CLASS and NU_SVR |
| svm:svr:degree_start | int | 1 | min: 1 | Starting point of degree |
| svm:svr:degree_step_size | int | 2 | | Step size of degree |
| svm:svr:degree_stop | int | 4 | | Stopping point of degree |
| svm:svr:gamma_start | float | 1.0e-05 | min: 0.0, max: 1.0 | Starting point of gamma |
| svm:svr:gamma_step_size | int | 100 | | Step size of gamma |
| svm:svr:gamma_stop | float | 0.1 | | Stopping point of gamma |
| svm:svr:p_start | float | 1.0e-05 | | Starting point of p |
| svm:svr:p_step_size | int | 100 | | Step size of p |
| svm:svr:p_stop | float | 0.1 | | Stopping point of p |
| svm:svr:c_start | float | 0.1 | | Starting point of c |
| svm:svr:c_step_size | int | 100 | | Step size of c |
| svm:svr:c_stop | int | 1000 | | Stopping point of c |
| svm:svr:nu_start | float | 0.3 | min: 0.0, max: 1.0 | Starting point of nu |
| svm:svr:nu_step_size | int | 2 | | Step size of nu |
| svm:svr:nu_stop | float | 0.6 | min: 0.0, max: 1.0 | Stopping point of nu |


This class implements the algorithm used by the tool of the same name, which trains models for MS/MS spectrum simulation.
For the primary ion types (y, b), an SVM is trained using the libSVM library.
All important libSVM parameters are exposed as parameters of this class.
Please refer to the libSVM manuals for a detailed description of the parameters. Default values are chosen as in the svm-training tool delivered with libSVM.
For the secondary types (a, c, x, z, losses, b2, y2), a simple Bayesian model is used.
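
The grid-search parameters in the table come as start, step size and stop triples, and svm:additive_cv selects between additive and multiplicative stepping. The following is a minimal sketch of how such a triple could expand into candidate values. `gridValues` is a hypothetical helper written for illustration, not an OpenMS or libSVM function, and the small tolerance at the upper end is an assumption added to absorb floating-point rounding.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of OpenMS): enumerates the candidate values
// of one grid-searched parameter from its start / step size / stop triple.
// additive == true  -> next = current + step
// additive == false -> next = current * step
std::vector<double> gridValues(double start, double step, double stop, bool additive)
{
  assert(start > 0.0 && stop >= start);
  assert(additive ? step > 0.0 : step > 1.0); // otherwise the loop would not advance

  // Assumption: allow a tiny relative tolerance so that rounding error
  // does not drop the final grid point.
  const double limit = stop * (1.0 + 1e-9);

  std::vector<double> values;
  for (double v = start; v <= limit; v = additive ? v + step : v * step)
  {
    values.push_back(v);
  }
  return values;
}
```

Under this reading, the default gamma triple (start 1.0e-05, step size 100, stop 0.1) with multiplicative stepping yields three candidates, roughly 1.0e-05, 1.0e-03 and 0.1, while the default degree triple (1, 2, 4) with additive stepping yields 1 and 3.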

Member Typedef Documentation

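◆ DescriptorSet

typedef SvmTheoreticalSpectrumGenerator::DescriptorSet DescriptorSet
private

◆ IonType

typedef SvmTheoreticalSpectrumGenerator::IonType IonType
private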

◆ ObservedIntensMap

typedef std::map<std::pair<IonType, Size>, std::vector<double> > ObservedIntensMap
private

Constructor & Destructor Documentation

◆ SvmTheoreticalSpectrumGeneratorTrainer() [1/2]

Default constructor.

◆ SvmTheoreticalSpectrumGeneratorTrainer() [2/2]

Copy constructor.

◆ ~SvmTheoreticalSpectrumGeneratorTrainer()

Destructor.

Member Function Documentation

◆ countIntensities_()

void countIntensities_ ( const PeakSpectrum &  spectrum,
const AASequence &  annotation,
const IonType &  type,
std::map< std::pair< IonType, Size >, std::vector< double > > &  observed_intensities,
double  tolerance,
Size  number_of_regions 
)
private

Stores the observed intensities for each sector-type combination in a vector.
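
To illustrate the bookkeeping described here, the sketch below collects intensities under a (type, region) key, mirroring the shape of ObservedIntensMap. How the class actually assigns a peak to a region is not documented on this page; the equal-width split over the range up to `max_mz` is an assumption, `regionOf` and `recordIntensity` are hypothetical names, and a plain string stands in for IonType.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for ObservedIntensMap; a string replaces IonType for illustration.
using ObservedMap = std::map<std::pair<std::string, std::size_t>, std::vector<double>>;

// Assumption: regions are equal-width slices of the range [0, max_mz].
std::size_t regionOf(double mz, double max_mz, std::size_t number_of_regions)
{
  assert(max_mz > 0.0 && number_of_regions > 0);
  const double fraction = std::min(std::max(mz / max_mz, 0.0), 1.0);
  const std::size_t region =
      static_cast<std::size_t>(fraction * static_cast<double>(number_of_regions));
  // A peak exactly at max_mz belongs to the last region.
  return std::min(region, number_of_regions - 1);
}

// Appends one observed intensity to the vector stored for (ion type, region).
void recordIntensity(ObservedMap& observed, const std::string& ion_type,
                     double mz, double intensity,
                     double max_mz, std::size_t number_of_regions)
{
  observed[{ion_type, regionOf(mz, max_mz, number_of_regions)}].push_back(intensity);
}
```

With number_of_regions set to 3 and max_mz set to 900, a peak at m/z 450 falls into region 1 and a peak at m/z 800 into region 2.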

◆ normalizeIntensity()

void normalizeIntensity ( PeakSpectrum &  S) const

Normalizes the intensity of the peaks in the input data.

◆ operator=()

Assignment operator.

◆ trainModel()

void trainModel ( const PeakMap &  spectra,
const std::vector< AASequence > &  annotations,
const String &  filename,
Int  precursor_charge 
)

Trains an SVM for each ion_type and stores the models in files named <filename>_residue_loss_charge.svm

◆ trainSecondaryTypes_()

void trainSecondaryTypes_ ( TextFile &  info_outfile,
Size  number_of_regions,
Size  number_of_intensity_levels,
ObservedIntensMap &  observed_intensities,
const std::vector< IonType > &  ion_types,
const std::vector< bool > &  is_primary 
)
)
private

Trains the Bayesian models for the secondary peak types.
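
This page does not describe the structure of the Bayesian model, so the following is only a rough sketch of one way a simple count-based model over discretized intensities could look. Conditioning the secondary ion's intensity level on the primary ion's level, the add-one smoothing, and the names `intensityLevel` and `conditionalTable` are all assumptions made for illustration and should not be read as the class's actual method.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: maps a normalized intensity in [0, 1] to one of
// `levels` equal-width bins (compare the number_intensity_levels parameter).
std::size_t intensityLevel(double intensity, std::size_t levels)
{
  assert(levels > 0);
  if (intensity <= 0.0) return 0;
  if (intensity >= 1.0) return levels - 1;
  return static_cast<std::size_t>(intensity * static_cast<double>(levels));
}

// Estimates P(secondary level | primary level) from paired observations.
// Each pair holds (primary level, secondary level). Counts start at one
// (add-one smoothing) so unseen combinations keep a nonzero probability.
std::vector<std::vector<double>> conditionalTable(
    const std::vector<std::pair<std::size_t, std::size_t>>& pairs,
    std::size_t levels)
{
  std::vector<std::vector<double>> table(levels, std::vector<double>(levels, 1.0));
  for (const auto& p : pairs)
  {
    assert(p.first < levels && p.second < levels);
    table[p.first][p.second] += 1.0;
  }
  // Normalize each row so that it sums to one.
  for (auto& row : table)
  {
    double total = 0.0;
    for (double count : row) total += count;
    for (double& count : row) count /= total;
  }
  return table;
}
```

In this sketch, one such table would be estimated per secondary ion type and region, which matches the (type, region) keys of ObservedIntensMap but remains an assumption.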

◆ writeTrainingFile_()

void writeTrainingFile_ ( std::vector< DescriptorSet > &  training_input,
std::vector< double > &  training_output,
const String &  filename 
)
)
protected

Write a training file that can be passed to libsvm command line tools.
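
The libsvm command line tools read one sample per line in the form `<label> <index>:<value> ...`, with 1-based feature indices in ascending order; zero-valued features may be left out. The sketch below writes data in that layout. It is not the OpenMS implementation: `toLibsvmLine` and `writeTrainingData` are hypothetical names, and a dense `std::vector<double>` stands in for DescriptorSet, whose layout is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Formats one sample in libsvm's sparse text format.
// A dense feature vector stands in for DescriptorSet (assumption).
std::string toLibsvmLine(double label, const std::vector<double>& features)
{
  std::ostringstream line;
  line << label;
  for (std::size_t i = 0; i < features.size(); ++i)
  {
    if (features[i] != 0.0) // zero-valued features are omitted
    {
      line << ' ' << (i + 1) << ':' << features[i]; // indices are 1-based
    }
  }
  return line.str();
}

// Writes one line per sample; inputs[i] is paired with outputs[i].
void writeTrainingData(std::ostream& out,
                       const std::vector<std::vector<double>>& inputs,
                       const std::vector<double>& outputs)
{
  assert(inputs.size() == outputs.size());
  for (std::size_t i = 0; i < inputs.size(); ++i)
  {
    out << toLibsvmLine(outputs[i], inputs[i]) << '\n';
  }
}
```

Passing a `std::ofstream` opened on the target filename as `out` would produce a file in this layout.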