BALL  1.4.79
BALL::QSAR::Model Class Reference (abstract)

#include <BALL/QSAR/Model.h>

Inheritance diagram for BALL::QSAR::Model:
BALL::QSAR::ClassificationModel BALL::QSAR::RegressionModel BALL::QSAR::BayesModel BALL::QSAR::LDAModel BALL::QSAR::LogitModel BALL::QSAR::SVMModel BALL::QSAR::LinearModel BALL::QSAR::NonLinearModel BALL::QSAR::NBModel BALL::QSAR::SNBModel BALL::QSAR::MLRModel BALL::QSAR::PCRModel BALL::QSAR::PLSModel BALL::QSAR::ALLModel BALL::QSAR::KernelModel BALL::QSAR::RRModel BALL::QSAR::OPLSModel BALL::QSAR::KNNModel BALL::QSAR::GPModel BALL::QSAR::KPCRModel BALL::QSAR::KPLSModel BALL::QSAR::SVRModel BALL::QSAR::LibsvmModel

Public Member Functions

Constructors and Destructors
 Model (const QSARData &q)
 
virtual ~Model ()
 
virtual void operator= (const Model &m)
 

Protected Member Functions

Input and Output. The following methods can be used to implement the functions saveToFile() and readFromFile() in final classes derived from this base-class
void readMatrix (Eigen::MatrixXd &mat, std::ifstream &in, unsigned int lines, unsigned int col)
 
void readVector (Eigen::RowVectorXd &vec, std::ifstream &in, unsigned int no_cells, bool column_vector)
 
void readModelParametersFromFile (std::ifstream &in)
 
void saveModelParametersToFile (std::ofstream &out)
 
virtual void saveDescriptorInformationToFile (std::ofstream &out)
 
virtual void readDescriptorInformationFromFile (std::ifstream &in, int no_descriptors, bool transformation)
 
void readResponseTransformationFromFile (std::ifstream &in, int no_y)
 
void saveResponseTransformationToFile (std::ofstream &out)
 

Protected Attributes

int default_no_opt_steps_
 

Friends

class Validation
 
class RegressionValidation
 
class ClassificationValidation
 
class PCRModel
 
class KPCRModel
 
class FeatureSelection
 

Accessors

void copyData (const Model &m)
 
void copyDescriptorIDs (const Model &m)
 
void readTrainingData ()
 
virtual Eigen::VectorXd predict (const vector< double > &substance, bool transform)=0
 
void deleteDescriptorIDs ()
 
virtual void train ()=0
 
virtual bool optimizeParameters (int, int)
 
bool optimizeParameters (int k)
 
virtual double calculateStdErr ()
 
virtual void setParameters (vector< double > &)
 
virtual vector< double > getParameters () const
 
std::multiset< unsigned int > * getDescriptorIDs ()
 
void setDataSource (const QSARData *q)
 
virtual void saveToFile (string filename)=0
 
virtual void readFromFile (string filename)=0
 
const Eigen::MatrixXd * getDescriptorMatrix ()
 
const vector< string > * getSubstanceNames ()
 
const vector< string > * getDescriptorNames ()
 
const Eigen::MatrixXd getDescriptorTransformations ()
 
const Eigen::MatrixXd getYTransformations ()
 
const Eigen::MatrixXd * getY ()
 
void setDescriptorIDs (const std::multiset< unsigned int > &sl)
 
const string * getType ()
 
void getUnnormalizedFeatureValue (int compound, int feature, double &return_value)
 
void getUnnormalizedResponseValue (int compound, int response, double &return_value)
 
Eigen::VectorXd getSubstanceVector (const vector< double > &substance, bool transform)
 
Eigen::VectorXd getSubstanceVector (const Eigen::VectorXd &substance, bool transform)
 
void backTransformPrediction (Eigen::VectorXd &pred)
 
void addLambda (Eigen::MatrixXd &matrix, double &lambda)
 
void readDescriptorInformation ()
 

Attributes

const QSARData * data
 
Validation * model_val
 
Eigen::MatrixXd descriptor_matrix_
 
vector< string > substance_names_
 
vector< string > descriptor_names_
 
Eigen::MatrixXd descriptor_transformations_
 
Eigen::MatrixXd y_transformations_
 
Eigen::MatrixXd Y_
 
String type_
 
std::multiset< unsigned int > descriptor_IDs_
 

Detailed Description

Definition at line 34 of file Model.h.

Constructor & Destructor Documentation

BALL::QSAR::Model::Model ( const QSARData &  q)

constructor

Parameters
q  QSARData object from which the data for this model should be taken
virtual BALL::QSAR::Model::~Model ( )
virtual

Member Function Documentation

void BALL::QSAR::Model::addLambda ( Eigen::MatrixXd &  matrix,
double &  lambda 
)
protected

adds offset lambda to the diagonal of the given matrix
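
As a rough illustration of what adding an offset to the diagonal means, the sketch below uses a plain std::vector-based matrix instead of Eigen::MatrixXd so that it has no dependencies. The names Matrix and addLambdaSketch are hypothetical and not part of BALL; the square-matrix check is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Eigen::MatrixXd: a row-major matrix of doubles.
using Matrix = std::vector<std::vector<double>>;

// Adds lambda to every diagonal element of a square matrix, in place.
void addLambdaSketch(Matrix& matrix, double lambda)
{
    for (std::size_t i = 0; i < matrix.size(); ++i)
    {
        assert(matrix[i].size() == matrix.size()); // sketch assumes a square matrix
        matrix[i][i] += lambda;
    }
}
```

Off-diagonal elements are left untouched, so the call only shifts the diagonal of the matrix by lambda.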

void BALL::QSAR::Model::backTransformPrediction ( Eigen::VectorXd &  pred)
protected

transforms a prediction (obtained by Model.predict()) back to the original scale by applying the inverse of the transformation(s) that were applied to the activity values of the training data
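
The back-transformation can be pictured as follows, assuming the forward transformation is a z-score scaling y' = (y - mean) / stddev; this form is inferred from y_transformations_ storing a mean and a stddev per activity and is not confirmed by this page. ResponseTransformation and backTransformSketch are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout of the response transformations:
// one mean and one stddev per activity.
struct ResponseTransformation
{
    std::vector<double> mean;
    std::vector<double> stddev;
};

// Undoes an assumed z-score scaling y' = (y - mean) / stddev, in place.
void backTransformSketch(std::vector<double>& pred, const ResponseTransformation& t)
{
    assert(pred.size() == t.mean.size() && pred.size() == t.stddev.size());
    for (std::size_t i = 0; i < pred.size(); ++i)
    {
        pred[i] = pred[i] * t.stddev[i] + t.mean[i];
    }
}
```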

virtual double BALL::QSAR::Model::calculateStdErr ( )
inlinevirtual

Reimplemented in BALL::QSAR::GPModel.

Definition at line 93 of file Model.h.

void BALL::QSAR::Model::copyData ( const Model &  m)

copies the data (descriptor matrix, names of substances and descriptors) and the IDs of the selected descriptors from m

void BALL::QSAR::Model::copyDescriptorIDs ( const Model &  m)

copies the IDs of the selected descriptors from m

void BALL::QSAR::Model::deleteDescriptorIDs ( )

removes all entries from descriptor_IDs_

std::multiset<unsigned int>* BALL::QSAR::Model::getDescriptorIDs ( )

returns a pointer to the descriptor IDs of this model

const Eigen::MatrixXd* BALL::QSAR::Model::getDescriptorMatrix ( )

returns a const pointer to the descriptor matrix of this model

const vector<string>* BALL::QSAR::Model::getDescriptorNames ( )

returns a const pointer to the names of the descriptors of this model

const Eigen::MatrixXd BALL::QSAR::Model::getDescriptorTransformations ( )

returns descriptor transformations

virtual vector<double> BALL::QSAR::Model::getParameters ( ) const
virtual
const vector<string>* BALL::QSAR::Model::getSubstanceNames ( )

returns a const pointer to the names of the substances of this model

Eigen::VectorXd BALL::QSAR::Model::getSubstanceVector ( const vector< double > &  substance,
bool  transform 
)
protected

returns a vector containing only the values of those descriptors that have been selected for this model

Parameters
substance  a vector of all descriptor values for the substance to be predicted
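
A minimal sketch of the selection step, using std::vector in place of the Eigen types. It relies on the convention documented for descriptor_IDs_ that an empty ID set means no feature selection was done. The function name selectDescriptorsSketch is hypothetical, and the optional transformation of the values is left out here.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Picks the values of the selected descriptors out of a full descriptor vector.
// An empty ID set is treated as "no feature selection", so all values are kept.
std::vector<double> selectDescriptorsSketch(const std::vector<double>& substance,
                                            const std::multiset<unsigned int>& descriptor_ids)
{
    if (descriptor_ids.empty())
    {
        return substance;
    }
    std::vector<double> selected;
    selected.reserve(descriptor_ids.size());
    for (unsigned int id : descriptor_ids) // a multiset iterates in ascending order
    {
        assert(id < substance.size()); // every ID must refer to an existing descriptor
        selected.push_back(substance[id]);
    }
    return selected;
}
```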
Eigen::VectorXd BALL::QSAR::Model::getSubstanceVector ( const Eigen::VectorXd &  substance,
bool  transform 
)
protected
const string* BALL::QSAR::Model::getType ( )

returns the type of the current model, e.g. "MLR", "PLS", ...

void BALL::QSAR::Model::getUnnormalizedFeatureValue ( int  compound,
int  feature,
double &  return_value 
)

Fetches the un-normalized value for the specified feature of the desired compound (instance) from the data that this Model currently contains. This method is needed for visualization purposes only.

void BALL::QSAR::Model::getUnnormalizedResponseValue ( int  compound,
int  response,
double &  return_value 
)

Fetches the un-normalized value for the specified response of the desired compound (instance) from the data that this Model currently contains. This method is needed for visualization purposes only.

const Eigen::MatrixXd* BALL::QSAR::Model::getY ( )

returns a const pointer to the activity values of this model

const Eigen::MatrixXd BALL::QSAR::Model::getYTransformations ( )

virtual void BALL::QSAR::Model::operator= ( const Model &  m)
virtual

assignment operator; gives this model the same specifications as the given one (same model and kernel parameters). If the given model has been trained, the training result is copied as well.
Note that the input data that m has read into m.descriptor_matrix_ and m.Y_ is NOT copied to this model, since the input data is not part of the specification of a model. If copying the input data is nevertheless desired, call copyData() afterwards.

Reimplemented in BALL::QSAR::KernelModel.
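
The split between copying the specification and copying the input data can be pictured with a heavily simplified stand-in class. ModelSketch and all of its members are hypothetical; the real class holds Eigen matrices and more state, and its operator= returns void rather than a reference.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for a model.
struct ModelSketch
{
    std::vector<double> parameters;                      // model specification
    std::vector<double> training_result;                 // e.g. fitted coefficients
    std::vector<std::vector<double>> descriptor_matrix;  // input data
    std::vector<std::string> substance_names;            // input data

    // Copies specification and training result, but deliberately not the input data.
    ModelSketch& operator=(const ModelSketch& m)
    {
        parameters = m.parameters;
        training_result = m.training_result;
        return *this;
    }

    // Copies the input data separately, mirroring the role of copyData().
    void copyData(const ModelSketch& m)
    {
        descriptor_matrix = m.descriptor_matrix;
        substance_names = m.substance_names;
    }
};
```

In this sketch, after b = a the target b has the parameters and training result of a but still no descriptor matrix; a following b.copyData(a) brings the input data over as well.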

virtual bool BALL::QSAR::Model::optimizeParameters ( int  ,
int   
)
inlinevirtual

optimizes the parameters of the model (not the number of features), e.g. the number of latent variables in the case of a PLS model or the kernel width in the case of an automated lazy learning model.
The number of selected features (=descriptors) is NOT changed by this method. Use class FeatureSelection to do that.

Returns
1 if parameters were optimized using cross-validation. The best Q2 value is assumed to be saved in ModelValidation.Q2
0 if the model has no parameters to be optimized, so that no cross-validation was done.

Reimplemented in BALL::QSAR::KPLSModel, BALL::QSAR::PLSModel, BALL::QSAR::OPLSModel, BALL::QSAR::ALLModel, and BALL::QSAR::KNNModel.

Definition at line 89 of file Model.h.

bool BALL::QSAR::Model::optimizeParameters ( int  k)
virtual Eigen::VectorXd BALL::QSAR::Model::predict ( const vector< double > &  substance,
bool  transform 
)
pure virtual

Predicts the activities of a given substance

Parameters
substance  the substance whose activity is to be predicted, in the form of a vector containing the values for all descriptors (if necessary, relevant descriptors will be selected automatically)
transform  determines whether the values for each descriptor of the given substance should be transformed before prediction of activity.
If (transform==1): each descriptor value is transformed according to the centering of the respective column of QSARData.descriptor_matrix used to train this model.
If the substance to be predicted is part of the same input data (e.g. same SD-file) as the training data (as is the case during cross validation), transform should therefore be set to 0.
Returns
a vector containing one value for each predicted activity

Implemented in BALL::QSAR::KernelModel, BALL::QSAR::GPModel, BALL::QSAR::NBModel, BALL::QSAR::SNBModel, BALL::QSAR::LDAModel, BALL::QSAR::LinearModel, BALL::QSAR::LogitModel, and BALL::QSAR::ALLModel.
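
The effect of the transform flag can be sketched as below. The z-score form (x - mean) / stddev is an assumption based on descriptor_transformations_ storing the mean and stddev of each descriptor; the exact formula used by BALL is not stated on this page. transformSubstanceSketch is a hypothetical name, and zero stddev values are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Applies an assumed z-score transformation to the descriptor values of a new
// substance, using the per-descriptor mean and stddev of the training data.
// With transform == false the values are returned unchanged.
std::vector<double> transformSubstanceSketch(const std::vector<double>& substance,
                                             const std::vector<double>& mean,
                                             const std::vector<double>& stddev,
                                             bool transform)
{
    assert(substance.size() == mean.size() && substance.size() == stddev.size());
    if (!transform)
    {
        return substance;
    }
    std::vector<double> result(substance.size());
    for (std::size_t i = 0; i < substance.size(); ++i)
    {
        result[i] = (substance[i] - mean[i]) / stddev[i];
    }
    return result;
}
```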

void BALL::QSAR::Model::readDescriptorInformation ( )
protected

reads selected descriptors, their names and the information about their transformations (mean and stddev of each descriptor). This function is used after feature selection to read information about the selected features

virtual void BALL::QSAR::Model::readDescriptorInformationFromFile ( std::ifstream &  in,
int  no_descriptors,
bool  transformation 
)
protectedvirtual
virtual void BALL::QSAR::Model::readFromFile ( string  filename)
pure virtual
void BALL::QSAR::Model::readMatrix ( Eigen::MatrixXd &  mat,
std::ifstream &  in,
unsigned int  lines,
unsigned int  col 
)
protected

reconstructs an Eigen::MatrixXd from a given input stream after resizing the given Eigen::MatrixXd as specified
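
A dependency-free sketch of the resize-then-read pattern follows. The on-disk format of BALL model files is not described on this page, so whitespace-separated numeric cells are purely an assumption, as are the names Matrix and readMatrixSketch and the choice to throw on truncated input.

```cpp
#include <cassert>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying the function without a file
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for Eigen::MatrixXd: a row-major matrix of doubles.
using Matrix = std::vector<std::vector<double>>;

// Resizes mat to lines x col and fills it row by row from the stream.
// Assumes whitespace-separated numeric cells; throws if the stream runs short.
void readMatrixSketch(Matrix& mat, std::istream& in, unsigned int lines, unsigned int col)
{
    mat.assign(lines, std::vector<double>(col, 0.0));
    for (unsigned int i = 0; i < lines; ++i)
    {
        for (unsigned int j = 0; j < col; ++j)
        {
            if (!(in >> mat[i][j]))
            {
                throw std::runtime_error("matrix data ended prematurely");
            }
        }
    }
    assert(mat.size() == lines); // the matrix now has the requested number of rows
}
```

Taking a std::istream instead of a std::ifstream lets the same function be exercised with an in-memory std::istringstream.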

void BALL::QSAR::Model::readModelParametersFromFile ( std::ifstream &  in)
protected
void BALL::QSAR::Model::readResponseTransformationFromFile ( std::ifstream &  in,
int  no_y 
)
protected
void BALL::QSAR::Model::readTrainingData ( )

copies the data for the relevant descriptors from the bound QSARData object into this model and updates Model.descriptor_transformations_ and Model.y_transformations_.
If no explicit feature selection was done, i.e. if descriptor_IDs_ is empty, all data is fetched.
If feature selection was done, i.e. if descriptor_IDs_ is not empty, only the columns of the relevant descriptors are fetched.

void BALL::QSAR::Model::readVector ( Eigen::RowVectorXd &  vec,
std::ifstream &  in,
unsigned int  no_cells,
bool  column_vector 
)
protected
virtual void BALL::QSAR::Model::saveDescriptorInformationToFile ( std::ofstream &  out)
protectedvirtual

overridden by class RegressionModel, whose member function can also save coefficients and coefficient errors

Reimplemented in BALL::QSAR::RegressionModel.

void BALL::QSAR::Model::saveModelParametersToFile ( std::ofstream &  out)
protected
void BALL::QSAR::Model::saveResponseTransformationToFile ( std::ofstream &  out)
protected
virtual void BALL::QSAR::Model::saveToFile ( string  filename)
pure virtual
void BALL::QSAR::Model::setDataSource ( const QSARData *  q)
void BALL::QSAR::Model::setDescriptorIDs ( const std::multiset< unsigned int > &  sl)

manually specify a set of descriptors

virtual void BALL::QSAR::Model::setParameters ( vector< double > &  )
inlinevirtual
virtual void BALL::QSAR::Model::train ( )
pure virtual

Friends And Related Function Documentation

friend class ClassificationValidation
friend

Definition at line 237 of file Model.h.

friend class FeatureSelection
friend

Definition at line 240 of file Model.h.

friend class KPCRModel
friend

Definition at line 239 of file Model.h.

friend class PCRModel
friend

Definition at line 238 of file Model.h.

friend class RegressionValidation
friend

Definition at line 236 of file Model.h.

friend class Validation
friend

Definition at line 235 of file Model.h.

Member Data Documentation

const QSARData* BALL::QSAR::Model::data

pointer to the input data class for this model

Definition at line 147 of file Model.h.

int BALL::QSAR::Model::default_no_opt_steps_
protected

The default number of steps for model parameter optimization.
It can be adjusted by the different types of models.
Standard default value is 30.

Definition at line 159 of file Model.h.

std::multiset<unsigned int> BALL::QSAR::Model::descriptor_IDs_
protected

list containing the IDs of the selected descriptors (=features); with IDs >= 0
If this list is empty, it is assumed that no feature selection was done, i.e. that all descriptors are to be considered for cross-validation and prediction of activity.
If it is not empty, only the descriptors in this list are used for cross-validation and prediction of activity.

Definition at line 232 of file Model.h.

Eigen::MatrixXd BALL::QSAR::Model::descriptor_matrix_
protected

matrix containing the values of each descriptor for each substance

Definition at line 206 of file Model.h.

vector<string> BALL::QSAR::Model::descriptor_names_
protected

names of all descriptors

Definition at line 212 of file Model.h.

Eigen::MatrixXd BALL::QSAR::Model::descriptor_transformations_
protected

2xm dimensional matrix (m=no of descriptors) containing mean and stddev of each selected descriptor.
The content of this matrix is updated only by Model.readTrainingData()

Definition at line 216 of file Model.h.
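
To make the 2xm layout concrete, the sketch below computes the mean and standard deviation of each descriptor column. Which row holds which statistic and whether BALL divides by n or n - 1 are not stated on this page; both choices here are assumptions, and Matrix and columnStatisticsSketch are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Eigen::MatrixXd: rows are substances, columns are descriptors.
using Matrix = std::vector<std::vector<double>>;

// Returns a 2 x m matrix: row 0 holds the column means, row 1 the column
// standard deviations (sample version, dividing by n - 1; an assumption).
Matrix columnStatisticsSketch(const Matrix& descriptors)
{
    assert(descriptors.size() >= 2); // n - 1 requires at least two substances
    const std::size_t n = descriptors.size();
    const std::size_t m = descriptors[0].size();
    Matrix stats(2, std::vector<double>(m, 0.0));

    // First pass: column sums, then means.
    for (const std::vector<double>& row : descriptors)
    {
        assert(row.size() == m);
        for (std::size_t j = 0; j < m; ++j)
        {
            stats[0][j] += row[j];
        }
    }
    for (std::size_t j = 0; j < m; ++j)
    {
        stats[0][j] /= static_cast<double>(n);
    }

    // Second pass: summed squared deviations, then standard deviations.
    for (const std::vector<double>& row : descriptors)
    {
        for (std::size_t j = 0; j < m; ++j)
        {
            const double d = row[j] - stats[0][j];
            stats[1][j] += d * d;
        }
    }
    for (std::size_t j = 0; j < m; ++j)
    {
        stats[1][j] = std::sqrt(stats[1][j] / static_cast<double>(n - 1));
    }
    return stats;
}
```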

Validation* BALL::QSAR::Model::model_val

a ModelValidation object that is used to validate this model and that will contain the results of the validations

Definition at line 150 of file Model.h.

vector<string> BALL::QSAR::Model::substance_names_
protected

names of all substances

Definition at line 209 of file Model.h.

String BALL::QSAR::Model::type_
protected

The type of model, e.g. "MLR", "GP", ...

Definition at line 227 of file Model.h.

Eigen::MatrixXd BALL::QSAR::Model::Y_
protected

Matrix containing the experimentally determined results (active/non-active) for each substance.
Each column contains the values for one activity.

Definition at line 224 of file Model.h.

Eigen::MatrixXd BALL::QSAR::Model::y_transformations_
protected

2xc dimensional matrix (c=no of activities) containing mean and stddev of each activity.
The content of this matrix is updated only by Model.readTrainingData()

Definition at line 220 of file Model.h.