OpenMS
SpectrumMetaDataLookup Class Reference

Helper class for looking up spectrum meta data.

#include <OpenMS/METADATA/SpectrumMetaDataLookup.h>


Classes

struct SpectrumMetaData
 Meta data of a spectrum.
 

Public Types

typedef unsigned char MetaDataFlags
 Bit mask for which meta data to extract from a spectrum.
 

Public Member Functions

 SpectrumMetaDataLookup ()
 Constructor.

 ~SpectrumMetaDataLookup () override
 Destructor.

template<typename SpectrumContainer >
void readSpectra (const SpectrumContainer &spectra, const String &scan_regexp=default_scan_regexp, bool get_precursor_rt=false)
 Read spectra and store their meta data.

void setSpectraDataRef (const String &spectra_data)
 Set spectra_data to the origin of the read SpectrumContainer (i.e. the file name).

void getSpectrumMetaData (Size index, SpectrumMetaData &meta) const
 Look up meta data of a spectrum.

void getSpectrumMetaData (const String &spectrum_ref, SpectrumMetaData &meta, MetaDataFlags flags=MDF_ALL) const
 Extract meta data via a spectrum reference.
 
- Public Member Functions inherited from SpectrumLookup
 SpectrumLookup ()
 Constructor.

virtual ~SpectrumLookup ()
 Destructor.

bool empty () const
 Check if any spectra were set.

template<typename SpectrumContainer >
void readSpectra (const SpectrumContainer &spectra, const String &scan_regexp=default_scan_regexp)
 Read and index spectra for later look-up.

Size findByRT (double rt) const
 Look up spectrum by retention time (RT).

Size findByNativeID (const String &native_id) const
 Look up spectrum by native ID.

Size findByIndex (Size index, bool count_from_one=false) const
 Look up spectrum by index (position in the vector of spectra).

Size findByScanNumber (Size scan_number) const
 Look up spectrum by scan number (extracted from the native ID).

Size findByReference (const String &spectrum_ref) const
 Look up spectrum by reference.

void addReferenceFormat (const String &regexp)
 Register a possible format for a spectrum reference.
 

Static Public Member Functions

static void getSpectrumMetaData (const MSSpectrum &spectrum, SpectrumMetaData &meta, const boost::regex &scan_regexp=boost::regex(), const std::map< Size, double > &precursor_rts=(std::map< Size, double >()))
 Extract meta data from a spectrum.

static bool addMissingRTsToPeptideIDs (std::vector< PeptideIdentification > &peptides, const String &filename, bool stop_on_error=false)
 Add missing retention time values to peptide identifications based on raw data.

static bool addMissingSpectrumReferences (std::vector< PeptideIdentification > &peptides, const String &filename, bool stop_on_error=false, bool override_spectra_data=false, bool override_spectra_references=false, std::vector< ProteinIdentification > proteins=std::vector< ProteinIdentification >())
 Add missing "spectrum_reference"s to peptide identifications based on raw data.
 
- Static Public Member Functions inherited from SpectrumLookup
static Int extractScanNumber (const String &native_id, const boost::regex &scan_regexp, bool no_error=false)
 Extract the scan number from the native ID of a spectrum.

static Int extractScanNumber (const String &native_id, const String &native_id_type_accession)

static std::string getRegExFromNativeID (const String &native_id)
 Determine the RegEx string to extract scan/index number from native IDs. Can be used for extractScanNumber.

static bool isNativeID (const String &id)
 Simple prefix check if a spectrum identifier id is a nativeID from a vendor file.
 

Static Public Attributes

static const MetaDataFlags MDF_RT = 1
 
static const MetaDataFlags MDF_PRECURSORRT = 2
 
static const MetaDataFlags MDF_PRECURSORMZ = 4
 
static const MetaDataFlags MDF_PRECURSORCHARGE = 8
 
static const MetaDataFlags MDF_MSLEVEL = 16
 
static const MetaDataFlags MDF_SCANNUMBER = 32
 
static const MetaDataFlags MDF_NATIVEID = 64
 
static const MetaDataFlags MDF_ALL = 127
 
- Static Public Attributes inherited from SpectrumLookup
static const String default_scan_regexp
 Default regular expression for extracting scan numbers from spectrum native IDs.
 

Protected Attributes

std::vector< SpectrumMetaData > metadata_
 Meta data for spectra.
 
String spectra_data_ref
 
- Protected Attributes inherited from SpectrumLookup
Size n_spectra_
 Number of spectra.

boost::regex scan_regexp_
 Regular expression to extract scan numbers.

std::vector< String > regexp_name_list_
 Named groups in vector format.

std::map< double, Size > rts_
 Mapping: RT -> spectrum index.

std::map< String, Size > ids_
 Mapping: native ID -> spectrum index.

std::map< Size, Size > scans_
 Mapping: scan number -> spectrum index.
 

Private Member Functions

 SpectrumMetaDataLookup (const SpectrumMetaDataLookup &)
 Copy constructor (not implemented)

SpectrumMetaDataLookup & operator= (const SpectrumMetaDataLookup &)
 Assignment operator (not implemented)
 

Additional Inherited Members

- Public Attributes inherited from SpectrumLookup
std::vector< boost::regex > reference_formats
 Possible formats of spectrum references, defined as regular expressions.

double rt_tolerance
 Tolerance for look-up by retention time.

- Protected Member Functions inherited from SpectrumLookup
void addEntry_ (Size index, double rt, Int scan_number, const String &native_id)
 Add a look-up entry for a spectrum.

Size findByRegExpMatch_ (const String &spectrum_ref, const String &regexp, const boost::smatch &match) const
 Look up spectrum by regular expression match.

void setScanRegExp_ (const String &scan_regexp)
 Set the regular expression for extracting scan numbers from spectrum native IDs.

- Static Protected Attributes inherited from SpectrumLookup
static const String regexp_names_
 Named groups recognized in regular expression.
 

Detailed Description

Helper class for looking up spectrum meta data.

The class deals with meta data of spectra and provides functions for the extraction and look-up of this data.

A common use case for this functionality is importing peptide/protein identification results from search engine-specific file formats, where some meta information may have to be looked up in the raw data (primarily retention times). One example of this is in the function addMissingRTsToPeptideIDs().

Meta data of a spectrum is stored in SpectrumMetaDataLookup::SpectrumMetaData structures. To control which meta data to extract or look up, flags (SpectrumMetaDataLookup::MetaDataFlags) are used. Meta data can be extracted from spectra or from spectrum reference strings. The format of a spectrum reference is defined via a regular expression containing named groups (format "(?<GROUP>...)") for the different data items. The table below illustrates the different meta data types and how they are represented.

SpectrumMetaData member  MetaDataFlags flag   Reg. exp. group  Comment (*: undefined for MS1 spectra)
rt                       MDF_RT               RT               Retention time of the spectrum
precursor_rt             MDF_PRECURSORRT      PRECRT           Retention time of the precursor spectrum*
precursor_mz             MDF_PRECURSORMZ      MZ               Mass-to-charge ratio of the precursor ion*
precursor_charge         MDF_PRECURSORCHARGE  CHARGE           Charge of the precursor ion*
ms_level                 MDF_MSLEVEL          LEVEL            MS level (1 for survey scan, 2 for fragment scan, etc.)
scan_number              MDF_SCANNUMBER       SCAN             Scan number (extracted from the native ID)
native_id                MDF_NATIVEID         ID               Native ID of the spectrum
-                        MDF_ALL              -                Shortcut for "all flags set"
-                        -                    INDEX0           Only for look-up: index (vector pos.) counting from 0
-                        -                    INDEX1           Only for look-up: index (vector pos.) counting from 1

See also
OpenMS::SpectrumLookup
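
The flags combine as an ordinary bit mask. The following is a minimal stand-alone sketch of that idea, not the OpenMS header itself: the constant values are copied from this page, while the helpers hasFlags() and flagNames() are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-alone mirror of the flag values listed on this page (not the OpenMS header).
typedef unsigned char MetaDataFlags;

const MetaDataFlags MDF_RT = 1;
const MetaDataFlags MDF_PRECURSORRT = 2;
const MetaDataFlags MDF_PRECURSORMZ = 4;
const MetaDataFlags MDF_PRECURSORCHARGE = 8;
const MetaDataFlags MDF_MSLEVEL = 16;
const MetaDataFlags MDF_SCANNUMBER = 32;
const MetaDataFlags MDF_NATIVEID = 64;
const MetaDataFlags MDF_ALL = 127;

// Hypothetical helper: true if every bit of `wanted` is set in `flags`.
inline bool hasFlags(MetaDataFlags flags, MetaDataFlags wanted)
{
  return (flags & wanted) == wanted;
}

// Hypothetical helper: names of the SpectrumMetaData members selected by `flags`.
inline std::vector<std::string> flagNames(MetaDataFlags flags)
{
  std::vector<std::string> names;
  if (flags & MDF_RT) names.push_back("rt");
  if (flags & MDF_PRECURSORRT) names.push_back("precursor_rt");
  if (flags & MDF_PRECURSORMZ) names.push_back("precursor_mz");
  if (flags & MDF_PRECURSORCHARGE) names.push_back("precursor_charge");
  if (flags & MDF_MSLEVEL) names.push_back("ms_level");
  if (flags & MDF_SCANNUMBER) names.push_back("scan_number");
  if (flags & MDF_NATIVEID) names.push_back("native_id");
  return names;
}
```

Combining MDF_RT and MDF_PRECURSORMZ with a bitwise OR selects exactly those two items; MDF_ALL is the OR of all seven single flags.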

Member Typedef Documentation

◆ MetaDataFlags

typedef unsigned char MetaDataFlags

Bit mask for which meta data to extract from a spectrum.

Constructor & Destructor Documentation

◆ SpectrumMetaDataLookup() [1/2]

Constructor.

◆ ~SpectrumMetaDataLookup()

~SpectrumMetaDataLookup ( )
inline override

Destructor.

◆ SpectrumMetaDataLookup() [2/2]

Copy constructor (not implemented)

Member Function Documentation

◆ addMissingRTsToPeptideIDs()

static bool addMissingRTsToPeptideIDs ( std::vector< PeptideIdentification > & peptides,
const String & filename,
bool stop_on_error = false
)
static

Add missing retention time values to peptide identifications based on raw data.

Parameters
peptides       Peptide IDs with or without RT values
filename       Name of a raw data file (e.g. mzML) for looking up RTs
stop_on_error  Stop when an ID could not be matched to a spectrum (or keep going)?
Returns
True if all peptide IDs could be annotated successfully (including if all already had RT values), false otherwise.

Look-up works by matching the "spectrum_reference" (meta value) of a peptide ID to the native ID of a spectrum. Only peptide IDs without RT (where PeptideIdentification::getRT() returns "NaN") are looked up; the RT is set to that of the corresponding spectrum.
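
The look-up rule described above can be sketched without OpenMS as follows. This is an illustration under stated assumptions, not the library's implementation: PeptideIdSketch, missingRT() and addMissingRTsSketch() are hypothetical names, and a plain map from native ID to RT stands in for the raw data file.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a peptide identification: only the two
// fields relevant to the look-up described above.
struct PeptideIdSketch
{
  double rt;                       // NaN means "no RT yet"
  std::string spectrum_reference;  // expected to equal a spectrum's native ID
};

// Hypothetical helper: the value used to mark a missing RT.
inline double missingRT()
{
  return std::numeric_limits<double>::quiet_NaN();
}

// Fill in missing RTs from a native ID -> RT table (stand-in for the raw data).
// Returns true only if every peptide ID ends up with an RT.
bool addMissingRTsSketch(std::vector<PeptideIdSketch>& peptides,
                         const std::map<std::string, double>& rt_by_native_id,
                         bool stop_on_error = false)
{
  bool success = true;
  for (std::size_t i = 0; i < peptides.size(); ++i)
  {
    if (!std::isnan(peptides[i].rt)) continue; // RT already present: no look-up

    std::map<std::string, double>::const_iterator pos =
      rt_by_native_id.find(peptides[i].spectrum_reference);
    if (pos == rt_by_native_id.end())
    {
      success = false;           // this ID could not be matched
      if (stop_on_error) break;  // give up on the first failure
      continue;                  // otherwise keep going
    }
    peptides[i].rt = pos->second;
  }
  return success;
}
```

IDs that already carry an RT are left untouched, which is why a list in which every ID has an RT yields true without any look-up.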

◆ addMissingSpectrumReferences()

static bool addMissingSpectrumReferences ( std::vector< PeptideIdentification > & peptides,
const String & filename,
bool stop_on_error = false,
bool override_spectra_data = false,
bool override_spectra_references = false,
std::vector< ProteinIdentification > proteins = std::vector< ProteinIdentification >()
)
static

Add missing "spectrum_reference"s to peptide identifications based on raw data.

Parameters
peptides                     Peptide IDs with or without spectrum_reference
filename                     Name of the mz_file from which to draw spectrum_references
stop_on_error                Stop when an ID could not be matched to a spectrum (or keep going)?
override_spectra_data        Whether the given ProteinIdentifications should be updated with new "spectra_data" values from SpectrumMetaDataLookup
override_spectra_references  Whether the given PeptideIdentifications with an existing spectrum_reference should be updated from SpectrumMetaDataLookup
proteins                     Protein IDs corresponding to the peptide IDs
Returns
True if all peptide IDs could be annotated successfully (including if all already had "spectrum_reference" values), false otherwise.

Look-up works by matching the RT of a peptide identification to the given spectra. The native ID of the matched spectrum is annotated to the identification. All spectrum_references are updated/added.
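
One plausible way to match an RT against a set of spectra is a nearest-neighbour search within a tolerance over an RT -> index map, similar in spirit to the rts_ and rt_tolerance members listed on this page. The sketch below is an assumption about how such a match could work, not a description of the library's code; findClosestRT() and the use of std::out_of_range are hypothetical choices.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <stdexcept>

// Return the index of the spectrum whose RT is closest to `rt`,
// provided it lies within `tolerance`. `rts` maps RT -> spectrum index.
// Throws std::out_of_range if no spectrum is close enough.
std::size_t findClosestRT(const std::map<double, std::size_t>& rts,
                          double rt, double tolerance)
{
  assert(tolerance >= 0.0);
  typedef std::map<double, std::size_t>::const_iterator Iter;

  Iter upper = rts.lower_bound(rt); // first entry with RT >= rt
  Iter best = rts.end();
  if (upper != rts.end()) best = upper;
  if (upper != rts.begin())
  {
    Iter lower = upper;
    --lower; // last entry with RT < rt
    if (best == rts.end() ||
        std::fabs(lower->first - rt) < std::fabs(best->first - rt))
    {
      best = lower;
    }
  }
  if (best == rts.end() || std::fabs(best->first - rt) > tolerance)
  {
    throw std::out_of_range("no spectrum within RT tolerance");
  }
  return best->second;
}
```

Only the two neighbours of `rt` in the sorted map need to be inspected, so the search costs one logarithmic look-up regardless of the number of spectra.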

◆ getSpectrumMetaData() [1/3]

static void getSpectrumMetaData ( const MSSpectrum & spectrum,
SpectrumMetaData & meta,
const boost::regex & scan_regexp = boost::regex(),
const std::map< Size, double > & precursor_rts = (std::map< Size, double >())
)
static

Extract meta data from a spectrum.

Parameters
spectrum       Spectrum input
meta           Meta data output
scan_regexp    Regular expression for extracting scan number from spectrum native ID
precursor_rts  RTs of potential precursor spectra of different MS levels

Scan number and precursor RT, respectively, are only extracted if scan_regexp/precursor_rts are not empty.

◆ getSpectrumMetaData() [2/3]

void getSpectrumMetaData ( const String & spectrum_ref,
SpectrumMetaData & meta,
MetaDataFlags flags = MDF_ALL
) const

Extract meta data via a spectrum reference.

Parameters
spectrum_ref  Spectrum reference to parse
meta          Meta data output
flags         What meta data to extract
Exceptions
Exception::ElementNotFound  if a spectrum look-up was necessary, but no matching spectrum was found

This function is a combination of getSpectrumMetaData() and SpectrumLookup::findByReference(). However, the spectrum is only looked up if necessary, i.e. if the required meta data - as defined by flags - cannot be extracted from the spectrum reference itself.

◆ getSpectrumMetaData() [3/3]

void getSpectrumMetaData ( Size index,
SpectrumMetaData & meta
) const

Look up meta data of a spectrum.

Parameters
index  Index of the spectrum
meta   Meta data output

◆ operator=()

SpectrumMetaDataLookup& operator= ( const SpectrumMetaDataLookup & )
private

Assignment operator (not implemented)

◆ readSpectra()

void readSpectra ( const SpectrumContainer & spectra,
const String & scan_regexp = default_scan_regexp,
bool get_precursor_rt = false
)
inline

Read spectra and store their meta data.

Template Parameters
SpectrumContainer  Spectrum container class, must support size and operator[]
Parameters
spectra           Container of spectra
scan_regexp       Regular expression for matching scan numbers in spectrum native IDs (must contain the named group "?<SCAN>")
get_precursor_rt  Assign precursor retention times? (This relies on all precursor spectra being present and in the right order.)
Exceptions
Exception::IllegalArgument  if scan_regexp does not contain "?<SCAN>" (and is not empty)

References SpectrumMetaDataLookup::SpectrumMetaData::ms_level, SpectrumMetaDataLookup::SpectrumMetaData::native_id, SpectrumMetaDataLookup::SpectrumMetaData::rt, and SpectrumMetaDataLookup::SpectrumMetaData::scan_number.
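
The remark that precursor RT assignment relies on all precursor spectra being present and in the right order can be illustrated with a stand-alone sketch. It assumes, without confirmation from this page, that the precursor of a spectrum is the most recent preceding spectrum one MS level lower; SpectrumSketch and assignPrecursorRTs() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <map>
#include <vector>

// Hypothetical stand-in for a spectrum: MS level and retention time only.
struct SpectrumSketch
{
  std::size_t ms_level;
  double rt;
};

// Walk the spectra in acquisition order and, for each spectrum above MS
// level 1, report the RT of the most recent spectrum one level lower.
// Entries without a known precursor stay NaN.
std::vector<double> assignPrecursorRTs(const std::vector<SpectrumSketch>& spectra)
{
  const double missing = std::numeric_limits<double>::quiet_NaN();
  std::map<std::size_t, double> last_rt_by_level; // MS level -> RT last seen
  std::vector<double> precursor_rts(spectra.size(), missing);

  for (std::size_t i = 0; i < spectra.size(); ++i)
  {
    const SpectrumSketch& s = spectra[i];
    if (s.ms_level > 1)
    {
      std::map<std::size_t, double>::const_iterator pos =
        last_rt_by_level.find(s.ms_level - 1);
      if (pos != last_rt_by_level.end()) precursor_rts[i] = pos->second;
    }
    last_rt_by_level[s.ms_level] = s.rt; // becomes a candidate precursor
  }
  return precursor_rts;
}
```

If an MS1 spectrum is missing or the spectra are out of order, a fragment spectrum under this scheme silently inherits the RT of the wrong survey scan, which is the failure mode the parameter description warns about.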

◆ setSpectraDataRef()

void setSpectraDataRef ( const String & spectra_data )
inline

Set spectra_data to the origin of the read SpectrumContainer (i.e. the file name).

Parameters
spectra_data  The name (and path) of the origin of the read SpectrumContainer

Member Data Documentation

◆ MDF_ALL

const MetaDataFlags MDF_ALL = 127
static

◆ MDF_MSLEVEL

const MetaDataFlags MDF_MSLEVEL = 16
static

◆ MDF_NATIVEID

const MetaDataFlags MDF_NATIVEID = 64
static

◆ MDF_PRECURSORCHARGE

const MetaDataFlags MDF_PRECURSORCHARGE = 8
static

◆ MDF_PRECURSORMZ

const MetaDataFlags MDF_PRECURSORMZ = 4
static

◆ MDF_PRECURSORRT

const MetaDataFlags MDF_PRECURSORRT = 2
static

◆ MDF_RT

const MetaDataFlags MDF_RT = 1
static

Possible meta data to extract from a spectrum. Note that the static variables need to be put on separate lines due to a compiler bug in VS.

◆ MDF_SCANNUMBER

const MetaDataFlags MDF_SCANNUMBER = 32
static

◆ metadata_

std::vector<SpectrumMetaData> metadata_
protected

Meta data for spectra.

◆ spectra_data_ref

String spectra_data_ref
protected