OpenMS
ModificationsDB Class Reference

Database which holds all residue modifications from UniMod.

#include <OpenMS/CHEMISTRY/ModificationsDB.h>


Public Member Functions

Size getNumberOfModifications () const
 Returns the number of modifications read from the unimod.xml file.

const ResidueModification * getModification (Size index) const
 Returns the modification with the given index. note: out-of-bounds check is only performed in debug mode.

void searchModifications (std::set< const ResidueModification * > &mods, const String &mod_name, const String &residue="", ResidueModification::TermSpecificity term_spec=ResidueModification::NUMBER_OF_TERM_SPECIFICITY) const
 Collects all modifications which have the given name as synonym.

const ResidueModification * searchModification (const ResidueModification &mod_in) const
 Returns a pointer to an exact match of the given modification if present in the DB.

const ResidueModification * searchModificationsFast (const String &mod_name, bool &multiple_matches, const String &residue="", ResidueModification::TermSpecificity term_spec=ResidueModification::NUMBER_OF_TERM_SPECIFICITY) const
 Returns the modification which has the given name as synonym (fast version)

const ResidueModification * getModification (const String &mod_name, const String &residue="", ResidueModification::TermSpecificity term_spec=ResidueModification::NUMBER_OF_TERM_SPECIFICITY) const
 Returns the modification with the given name.

bool has (const String &modification) const
 Returns true if the modification exists.

const ResidueModification * addModification (std::unique_ptr< ResidueModification > new_mod)
 Add a new modification to ModificationsDB. If the modification already exists (based on its fullID) it is not added.

const ResidueModification * addModification (const ResidueModification &new_mod)
 Add a new modification to ModificationsDB. If the modification already exists (based on its fullID) it is not added. A copy will be made on the heap and added to the ModificationsDB otherwise.

Size findModificationIndex (const String &mod_name) const
 Returns the index of the modification in the mods_ vector; a unique name must be given.

void searchModificationsByDiffMonoMass (std::vector< String > &mods, double mass, double max_error, const String &residue="", ResidueModification::TermSpecificity term_spec=ResidueModification::NUMBER_OF_TERM_SPECIFICITY)
 Collects all modifications with delta mass inside a tolerance window.

void searchModificationsByDiffMonoMass (std::vector< const ResidueModification * > &mods, double mass, double max_error, const String &residue="", ResidueModification::TermSpecificity term_spec=ResidueModification::NUMBER_OF_TERM_SPECIFICITY)

void searchModificationsByDiffMonoMassSorted (std::vector< String > &mods, double mass, double max_error, const String &residue="", ResidueModification::TermSpecificity term_spec=ResidueModification::NUMBER_OF_TERM_SPECIFICITY)
 Collects all modifications with delta mass inside a tolerance window and adds them sorted by mass difference.

void searchModificationsByDiffMonoMassSorted (std::vector< const ResidueModification * > &mods, double mass, double max_error, const String &residue="", ResidueModification::TermSpecificity term_spec=ResidueModification::NUMBER_OF_TERM_SPECIFICITY)

const ResidueModification * getBestModificationByDiffMonoMass (double mass, double max_error, const String &residue="", ResidueModification::TermSpecificity term_spec=ResidueModification::NUMBER_OF_TERM_SPECIFICITY)
 Returns the best matching modification for the given delta mass and residue.

void getAllSearchModifications (std::vector< String > &modifications) const
 Collects all modifications that can be used for identification searches.

void writeTSV (const String &filename)
 Writes tab separated entries: FullId,FullName,Origin,AA,TerminusSpecificity,DiffMonoMass (including header) to TSV file.
 

Static Public Member Functions

static ModificationsDB * getInstance ()
 Returns a pointer to the modifications DB (singleton)

static ModificationsDB * initializeModificationsDB (OpenMS::String unimod_file="CHEMISTRY/unimod.xml", OpenMS::String custommod_file="CHEMISTRY/custom_mods.xml", OpenMS::String psimod_file="CHEMISTRY/PSI-MOD.obo", OpenMS::String xlmod_file="CHEMISTRY/XLMOD.obo")
 Initializes the modification DB with non-default modification files (can only be done once)

static bool isInstantiated ()
 Check whether ModificationsDB was instantiated before.


Protected Member Functions

bool residuesMatch_ (const char residue, const ResidueModification *curr_mod) const
 Helper function to check if a residue matches the origin for a modification.


Protected Attributes

std::vector< ResidueModification * > mods_
 Stores the modifications.

std::unordered_map< String, std::set< const ResidueModification * > > modification_names_
 Stores the mappings of (unique) names to the modifications.


Static Protected Attributes

static bool is_instantiated_
 Stores whether ModificationsDB was instantiated before.
 

Private Member Functions

Constructors and Destructors
Parameters
unimod_file  Path to the Unimod XML file
psimod_file  Path to the PSI-MOD OBO file
xlmod_file  Path to the XLMOD OBO file
 ModificationsDB (const OpenMS::String &unimod_file="CHEMISTRY/unimod.xml", const OpenMS::String &custommod_file="CHEMISTRY/custom_mods.xml", const OpenMS::String &psimod_file="CHEMISTRY/PSI-MOD.obo", const OpenMS::String &xlmod_file="CHEMISTRY/XLMOD.obo")

 ModificationsDB (const ModificationsDB &residue_db)
 Copy constructor.

virtual ~ModificationsDB ()
 Destructor.

Assignment
ModificationsDB & operator= (const ModificationsDB &aa)
 Assignment operator.

const ResidueModification * addNewModification_ (const ResidueModification &new_mod)
 Add a new modification to ModificationsDB without checking if it was inside already.

void readFromOBOFile (const String &filename)
 Adds modifications from a given file in OBO format.

void readFromUnimodXMLFile (const String &filename)
 Adds modifications from a given file in Unimod XML format.
 

Friends

class CrossLinksDB
 
class Residue
 
class AASequence
 

Detailed Description

Database which holds all residue modifications from UniMod.

This singleton class serves as a storage of the available modifications represented by UniMod (www.unimod.org). The modifications are identified by their name and possibly other IDs from UniMod or the PSI-MOD ontology. Modifications can have different specificities, e.g. they can occur only at the termini, anywhere or only at specific amino acids.

The modifications are defined in share/OpenMS/CHEMISTRY/unimod.xml and in share/OpenMS/CHEMISTRY/PSI-MOD.obo. The unimod file can be directly downloaded from unimod.org and replaced if the modifications change.

To add a new modification that is not contained in UniMod, follow the procedure described on the unimod.org website and then download the file from unimod.org. The same can be done to add support for the modifications to search engines, e.g. Mascot.

In some scenarios, it might be useful to define different modification databases. This can be done by providing a path through initializeModificationsDB(). However, this must be done before the first call to getInstance().
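The ordering constraint can be illustrated with a minimal, self-contained sketch. It is not the OpenMS implementation: ModDbSketch, initialize(), sourceFile() and create_() are hypothetical names, and throwing std::logic_error on a late initialization is an assumption made purely for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a database singleton; NOT the OpenMS class.
class ModDbSketch
{
public:
  // Returns the single instance, creating it with the default file on first use.
  static ModDbSketch* getInstance()
  {
    return create_("CHEMISTRY/unimod.xml");
  }

  // Creates the instance with a custom file. Must run before the first
  // getInstance(); rejecting late calls with an exception is an assumption.
  static ModDbSketch* initialize(const std::string& unimod_file)
  {
    if (is_instantiated_)
    {
      throw std::logic_error("database was already instantiated");
    }
    return create_(unimod_file);
  }

  static bool isInstantiated() { return is_instantiated_; }

  const std::string& sourceFile() const { return unimod_file_; }

private:
  explicit ModDbSketch(std::string unimod_file) : unimod_file_(std::move(unimod_file)) {}

  // The function-local static is constructed exactly once; arguments of
  // later calls are ignored.
  static ModDbSketch* create_(const std::string& unimod_file)
  {
    static ModDbSketch db(unimod_file);
    is_instantiated_ = true;
    return &db;
  }

  std::string unimod_file_;
  static bool is_instantiated_;
};

bool ModDbSketch::is_instantiated_ = false;
```

In this sketch, calling initialize() first and getInstance() afterwards yields the same object, while an initialize() call after the instance exists is rejected.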

Constructor & Destructor Documentation

◆ ModificationsDB() [1/2]

ModificationsDB ( const OpenMS::String &  unimod_file = "CHEMISTRY/unimod.xml",
const OpenMS::String &  custommod_file = "CHEMISTRY/custom_mods.xml",
const OpenMS::String &  psimod_file = "CHEMISTRY/PSI-MOD.obo",
const OpenMS::String &  xlmod_file = "CHEMISTRY/XLMOD.obo" 
)
explicit private

◆ ModificationsDB() [2/2]

ModificationsDB ( const ModificationsDB &  residue_db)
private

Copy constructor.

◆ ~ModificationsDB()

virtual ~ModificationsDB ( )
private virtual

Destructor.

Member Function Documentation

◆ addModification() [1/2]

const ResidueModification* addModification ( const ResidueModification &  new_mod)

Add a new modification to ModificationsDB. If the modification already exists (based on its fullID) it is not added. A copy will be made on the heap and added to the ModificationsDB otherwise.

Returns
a pointer to the modification in the ModificationsDB (which can differ from input if mod was already present).
Parameters
new_mod  The new modification object. A copy will be made on the heap and added to the ModificationsDB if not already present.

◆ addModification() [2/2]

const ResidueModification* addModification ( std::unique_ptr< ResidueModification >  new_mod)

Add a new modification to ModificationsDB. If the modification already exists (based on its fullID) it is not added.

Returns
a pointer to the modification in the ModificationsDB (which can differ from input if mod was already present).
Parameters
new_mod  Owning pointer, which transfers ownership to ModificationsDB (mod might get deleted if already present!)
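The ownership rules of both addModification() overloads can be sketched with standard smart pointers. ModSketch and ModStoreSketch are hypothetical stand-ins rather than OpenMS types, and detecting duplicates through a lookup keyed by a full_id string is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified modification record (not OpenMS::ResidueModification).
struct ModSketch
{
  std::string full_id;
  double diff_mono_mass;
};

// Hypothetical container that owns its entries and rejects duplicate full IDs.
class ModStoreSketch
{
public:
  // Takes ownership of new_mod. If an entry with the same full_id already
  // exists, new_mod is destroyed when this function returns and a pointer to
  // the existing entry is handed back instead.
  const ModSketch* add(std::unique_ptr<ModSketch> new_mod)
  {
    assert(new_mod != nullptr);
    auto it = by_id_.find(new_mod->full_id);
    if (it != by_id_.end())
    {
      return it->second;
    }
    const ModSketch* stored = new_mod.get();
    by_id_.emplace(stored->full_id, stored);
    mods_.push_back(std::move(new_mod));
    return stored;
  }

  // Copying overload: the caller keeps its object, the store holds a heap copy.
  const ModSketch* add(const ModSketch& new_mod)
  {
    return add(std::make_unique<ModSketch>(new_mod));
  }

  std::size_t size() const { return mods_.size(); }

private:
  std::vector<std::unique_ptr<ModSketch>> mods_;
  std::unordered_map<std::string, const ModSketch*> by_id_;
};
```

Because the returned pointer may refer to a previously stored entry, callers of either overload should keep using the return value rather than the object they passed in.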

◆ addNewModification_()

const ResidueModification* addNewModification_ ( const ResidueModification &  new_mod)
private

Add a new modification to ModificationsDB without checking if it was inside already.

Parameters
new_mod  A copy will be made on the heap and added to the ModificationsDB if not already present.

◆ findModificationIndex()

Size findModificationIndex ( const String &  mod_name) const

Returns the index of the modification in the mods_ vector; a unique name must be given.

Returns numeric_limits<Size>::max() if not exactly one matching modification was found, or if no matching residue or modification was found.

Exceptions
Exception::ElementNotFound  if not exactly one matching modification was found.

◆ getAllSearchModifications()

void getAllSearchModifications ( std::vector< String > &  modifications) const

Collects all modifications that can be used for identification searches.

◆ getBestModificationByDiffMonoMass()

const ResidueModification* getBestModificationByDiffMonoMass ( double  mass,
double  max_error,
const String &  residue = "",
ResidueModification::TermSpecificity  term_spec = ResidueModification::NUMBER_OF_TERM_SPECIFICITY 
)

Returns the best matching modification for the given delta mass and residue.

Query the modifications DB to get the best matching modification with the given delta mass at the given residue (NULL pointer means no result, maybe the maximal error tolerance needs to be increased). Possible input for CAM modification would be a delta mass of 57 and a residue of "C".

Note
If there are multiple possible matches with equal masses, it will choose the first match which defaults to the first matching UniMod entry.
Parameters
mass  The monoisotopic mass of the residue including the mass of the modification
max_error  The maximal mass error in the modification search
residue  The residue at which the modification occurs
term_spec  Only modifications with matching term specificity are considered.
Returns
A pointer to the best matching modification (or NULL if none was found)
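A minimal sketch of such a lookup follows. It assumes that mass is compared directly against each entry's delta mass and that ties are resolved in storage order. ModSketch and bestByDiffMonoMass() are hypothetical names, the residue is reduced to a one-letter origin, and the second entry used in the usage note is invented to demonstrate tie handling.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical, simplified modification record (not OpenMS::ResidueModification).
struct ModSketch
{
  std::string id;
  char origin;
  double diff_mono_mass;
};

// Returns the entry whose delta mass is closest to `mass` within `max_error`,
// or nullptr if no entry is in range. An empty residue means "any residue".
const ModSketch* bestByDiffMonoMass(const std::vector<ModSketch>& mods,
                                    double mass,
                                    double max_error,
                                    const std::string& residue = "")
{
  const ModSketch* best = nullptr;
  double best_error = max_error;
  for (const ModSketch& m : mods)
  {
    if (!residue.empty() && residue[0] != m.origin)
    {
      continue; // residue constraint not met
    }
    const double error = std::fabs(m.diff_mono_mass - mass);
    // The strict '<' keeps the first of several equally close candidates.
    if (error <= max_error && (best == nullptr || error < best_error))
    {
      best = &m;
      best_error = error;
    }
  }
  return best;
}
```

Querying a delta mass of 57.0 at residue "C" with a tolerance of 0.05 would pick a carbamidomethyl-like entry (delta mass 57.021464); with a tolerance of 0.01 the function returns nullptr, which is the case where the tolerance may need to be increased.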

◆ getInstance()

static ModificationsDB* getInstance ( )
static

Returns a pointer to the modifications DB (singleton)

◆ getModification() [1/2]

const ResidueModification* getModification ( const String &  mod_name,
const String &  residue = "",
ResidueModification::TermSpecificity  term_spec = ResidueModification::NUMBER_OF_TERM_SPECIFICITY 
) const

Returns the modification with the given name.

If residue is set, only modifications with matching residue of origin are considered. If term_spec is set, only modifications with matching term specificity are considered.

If more than one matching modification is found, the first one is returned with a warning.

Note
Never returns a null pointer; throws an exception instead.
Exceptions
Exception::ElementNotFound  if no modification named mod_name exists (via searchModifications())
Exception::InvalidValue  if no matching modification exists
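The lookup contract (never null, exceptions on failure) might look as follows in simplified form. All names are hypothetical; std::out_of_range and std::invalid_argument merely stand in for Exception::ElementNotFound and Exception::InvalidValue, and "first" is taken to mean first in storage order, which is an assumption.

```cpp
#include <cassert>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified modification record (not OpenMS::ResidueModification).
struct ModSketch
{
  std::string id;
  std::string name;
  char origin;
};

// Returns a non-null pointer or throws. An empty residue means "not set".
const ModSketch* getByName(const std::vector<ModSketch>& db,
                           const std::string& mod_name,
                           const std::string& residue = "")
{
  // Step 1: collect everything carrying the requested name.
  std::vector<const ModSketch*> by_name;
  for (const ModSketch& m : db)
  {
    if (m.name == mod_name) by_name.push_back(&m);
  }
  if (by_name.empty())
  {
    throw std::out_of_range("no modification named '" + mod_name + "'");
  }

  // Step 2: apply the optional residue constraint.
  std::vector<const ModSketch*> hits;
  for (const ModSketch* m : by_name)
  {
    if (residue.empty() || residue[0] == m->origin) hits.push_back(m);
  }
  if (hits.empty())
  {
    throw std::invalid_argument("no matching modification for residue '" + residue + "'");
  }

  // Step 3: an ambiguous result is reported, then the first hit is returned.
  if (hits.size() > 1)
  {
    std::cerr << "Warning: " << hits.size() << " modifications match '"
              << mod_name << "', returning the first\n";
  }
  return hits.front();
}
```

Callers of a function with this contract need a try/catch rather than a null check.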

◆ getModification() [2/2]

const ResidueModification* getModification ( Size  index) const

Returns the modification with the given index. note: out-of-bounds check is only performed in debug mode.

◆ getNumberOfModifications()

Size getNumberOfModifications ( ) const

Returns the number of modifications read from the unimod.xml file.

◆ has()

bool has ( const String &  modification) const

Returns true if the modification exists.

◆ initializeModificationsDB()

static ModificationsDB* initializeModificationsDB ( OpenMS::String  unimod_file = "CHEMISTRY/unimod.xml",
OpenMS::String  custommod_file = "CHEMISTRY/custom_mods.xml",
OpenMS::String  psimod_file = "CHEMISTRY/PSI-MOD.obo",
OpenMS::String  xlmod_file = "CHEMISTRY/XLMOD.obo" 
)
static

Initializes the modification DB with non-default modification files (can only be done once)

◆ isInstantiated()

static bool isInstantiated ( )
static

Check whether ModificationsDB was instantiated before.

◆ operator=()

ModificationsDB& operator= ( const ModificationsDB &  aa)
private

Assignment operator.

◆ readFromOBOFile()

void readFromOBOFile ( const String &  filename)
private

Adds modifications from a given file in OBO format.

Exceptions
Exception::ParseError  if the file cannot be parsed correctly

◆ readFromUnimodXMLFile()

void readFromUnimodXMLFile ( const String &  filename)
private

Adds modifications from a given file in Unimod XML format.

◆ residuesMatch_()

bool residuesMatch_ ( const char  residue,
const ResidueModification *  curr_mod 
) const
protected

Helper function to check if a residue matches the origin for a modification.

Special cases are handled as follows:

  • if the origin of the modification is not "X" (everything), then the residue either needs to match the origin exactly or it must be one of "X", ".", or "?"
  • if the origin of the modification is "X" (can match any amino acid), then any residue should match – except if the modification is user-defined and maps to an unknown amino acid (indicated by "X")

Underlying logic to determine whether a given residue matches the modification: if the modification does not have origin of "X" (everything) then it is sufficient to check that the residue matches the origin
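The two rules above can be written down as a small predicate. This is only one reading of the description, not the OpenMS source: ModSketch and residuesMatchSketch() are hypothetical, and the handling of the user-defined case (an origin of "X" is treated as a literal unknown residue that only matches residue "X") is an assumption.

```cpp
#include <cassert>

// Hypothetical, simplified modification record (not OpenMS::ResidueModification).
struct ModSketch
{
  char origin;       // one-letter residue of origin, 'X' meaning "everything"
  bool user_defined; // true for modifications not taken from a database file
};

bool residuesMatchSketch(char residue, const ModSketch& mod)
{
  if (mod.origin != 'X')
  {
    // Specific origin: exact match, or the residue is one of the wildcards.
    return residue == mod.origin || residue == 'X' || residue == '.' || residue == '?';
  }
  if (mod.user_defined)
  {
    // Assumption: here 'X' denotes an actual unknown amino acid rather than
    // "any amino acid", so only residue 'X' matches.
    return residue == 'X';
  }
  // Origin 'X' on a regular modification matches any residue.
  return true;
}
```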

◆ searchModification()

const ResidueModification* searchModification ( const ResidueModification &  mod_in) const

Returns a pointer to an exact match of the given modification if present in the DB.

This should be used if e.g. only a stack copy of the modification is available but you need a pointer to the modification in the database.

Returns
The matching modification given the constraints. Returns nullptr if no modification exists that is an exact match according to the equals operator.

◆ searchModifications()

void searchModifications ( std::set< const ResidueModification * > &  mods,
const String &  mod_name,
const String &  residue = "",
ResidueModification::TermSpecificity  term_spec = ResidueModification::NUMBER_OF_TERM_SPECIFICITY 
) const

Collects all modifications which have the given name as synonym.

Todo:
use set as return value. Would be more efficient in pyopenms

If residue is set, only modifications with matching residue of origin are considered. If term_spec is set, only modifications with matching term specificity are considered. The resulting set of modifications will be empty if no modification exists that fulfills the criteria.

◆ searchModificationsByDiffMonoMass() [1/2]

void searchModificationsByDiffMonoMass ( std::vector< const ResidueModification * > &  mods,
double  mass,
double  max_error,
const String &  residue = "",
ResidueModification::TermSpecificity  term_spec = ResidueModification::NUMBER_OF_TERM_SPECIFICITY 
)

◆ searchModificationsByDiffMonoMass() [2/2]

void searchModificationsByDiffMonoMass ( std::vector< String > &  mods,
double  mass,
double  max_error,
const String &  residue = "",
ResidueModification::TermSpecificity  term_spec = ResidueModification::NUMBER_OF_TERM_SPECIFICITY 
)

Collects all modifications with delta mass inside a tolerance window.

Warning
This function adds the results in the order of appearance in the DB, not considering proximity in mass. Use searchModificationsByDiffMonoMassSorted for this.

If residue is set, only modifications with matching residue of origin are considered. If term_spec is set, only modifications with matching term specificity are considered.

◆ searchModificationsByDiffMonoMassSorted() [1/2]

void searchModificationsByDiffMonoMassSorted ( std::vector< const ResidueModification * > &  mods,
double  mass,
double  max_error,
const String &  residue = "",
ResidueModification::TermSpecificity  term_spec = ResidueModification::NUMBER_OF_TERM_SPECIFICITY 
)

◆ searchModificationsByDiffMonoMassSorted() [2/2]

void searchModificationsByDiffMonoMassSorted ( std::vector< String > &  mods,
double  mass,
double  max_error,
const String &  residue = "",
ResidueModification::TermSpecificity  term_spec = ResidueModification::NUMBER_OF_TERM_SPECIFICITY 
)

Collects all modifications with delta mass inside a tolerance window and adds them sorted by mass difference.

If residue is set, only modifications with matching residue of origin are considered. If term_spec is set, only modifications with matching term specificity are considered.
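The difference between the plain and the sorted variant can be shown with a short sketch. It is illustrative only: ModSketch and both free functions are hypothetical, results are returned by value rather than through an output parameter, the residue and term_spec filters are omitted, and the use of a stable sort on the absolute mass difference is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical, simplified modification record (not OpenMS::ResidueModification).
struct ModSketch
{
  std::string id;
  double diff_mono_mass;
};

// Collects the IDs of all entries within the tolerance window, in storage order.
std::vector<std::string> searchByDiffMonoMass(const std::vector<ModSketch>& db,
                                              double mass, double max_error)
{
  std::vector<std::string> hits;
  for (const ModSketch& m : db)
  {
    if (std::fabs(m.diff_mono_mass - mass) <= max_error) hits.push_back(m.id);
  }
  return hits;
}

// Same window, but the closest entries come first.
std::vector<std::string> searchByDiffMonoMassSorted(const std::vector<ModSketch>& db,
                                                    double mass, double max_error)
{
  std::vector<const ModSketch*> in_window;
  for (const ModSketch& m : db)
  {
    if (std::fabs(m.diff_mono_mass - mass) <= max_error) in_window.push_back(&m);
  }
  // stable_sort keeps storage order among entries with equal mass difference.
  std::stable_sort(in_window.begin(), in_window.end(),
                   [mass](const ModSketch* a, const ModSketch* b)
                   {
                     return std::fabs(a->diff_mono_mass - mass) < std::fabs(b->diff_mono_mass - mass);
                   });
  std::vector<std::string> hits;
  for (const ModSketch* m : in_window) hits.push_back(m->id);
  return hits;
}
```

With hypothetical entries A (42.5), B (42.0), C (41.25) and D (50.0), a query for mass 42.0 with a tolerance of 1.0 yields A, B, C from the plain variant and B, A, C from the sorted one.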

◆ searchModificationsFast()

const ResidueModification* searchModificationsFast ( const String &  mod_name,
bool &  multiple_matches,
const String &  residue = "",
ResidueModification::TermSpecificity  term_spec = ResidueModification::NUMBER_OF_TERM_SPECIFICITY 
) const

Returns the modification which has the given name as synonym (fast version)

Unlike searchModifications(), this function returns only one occurrence of the modification (the last one). The caller must therefore check multiple_matches to ensure that only a single modification was found.

If residue is set, only modifications with matching residue of origin are considered. If term_spec is set, only modifications with matching term specificity are considered.

Returns
The matching modification given the constraints. Returns nullptr if no modification exists that fulfills the criteria. If multiple modifications are found, the multiple_matches flag will be set.
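The "last occurrence plus flag" behaviour can be sketched as follows. ModSketch and searchFastSketch() are hypothetical, the linear scan is a simplification (a real fast lookup would presumably use a name index), and the term_spec filter is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified modification record (not OpenMS::ResidueModification).
struct ModSketch
{
  std::string id;
  std::vector<std::string> synonyms;
  char origin;
};

// Returns the last entry that carries mod_name as a synonym, or nullptr.
// multiple_matches is set to true if more than one entry qualified.
const ModSketch* searchFastSketch(const std::vector<ModSketch>& db,
                                  const std::string& mod_name,
                                  bool& multiple_matches,
                                  const std::string& residue = "")
{
  const ModSketch* hit = nullptr;
  multiple_matches = false;
  for (const ModSketch& m : db)
  {
    if (!residue.empty() && residue[0] != m.origin)
    {
      continue; // residue constraint not met
    }
    if (std::find(m.synonyms.begin(), m.synonyms.end(), mod_name) == m.synonyms.end())
    {
      continue; // name is not a synonym of this entry
    }
    if (hit != nullptr)
    {
      multiple_matches = true;
    }
    hit = &m; // later matches overwrite earlier ones
  }
  return hit;
}
```

A caller would typically treat a set multiple_matches flag as "ambiguous name" and either narrow the query with a residue or fall back to a search that returns all matches.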

◆ writeTSV()

void writeTSV ( const String &  filename)

Writes tab separated entries: FullId,FullName,Origin,AA,TerminusSpecificity,DiffMonoMass (including header) to TSV file.

Friends And Related Function Documentation

◆ AASequence

friend class AASequence
friend

◆ CrossLinksDB

friend class CrossLinksDB
friend

◆ Residue

friend class Residue
friend

Member Data Documentation

◆ is_instantiated_

bool is_instantiated_
static protected

Stores whether ModificationsDB was instantiated before.

◆ modification_names_

std::unordered_map<String, std::set<const ResidueModification*> > modification_names_
protected

Stores the mappings of (unique) names to the modifications.

◆ mods_

std::vector<ResidueModification*> mods_
protected

Stores the modifications.