OpenMS
SpectrumLookup Class Reference

Helper class for looking up spectra based on different attributes. More...

#include <OpenMS/METADATA/SpectrumLookup.h>

Public Member Functions

 SpectrumLookup ()
 Constructor.

virtual ~SpectrumLookup ()
 Destructor.

bool empty () const
 Check if any spectra were set.

template<typename SpectrumContainer >
void readSpectra (const SpectrumContainer &spectra, const String &scan_regexp=default_scan_regexp)
 Read and index spectra for later look-up.

Size findByRT (double rt) const
 Look up spectrum by retention time (RT).

Size findByNativeID (const String &native_id) const
 Look up spectrum by native ID.

Size findByIndex (Size index, bool count_from_one=false) const
 Look up spectrum by index (position in the vector of spectra).

Size findByScanNumber (Size scan_number) const
 Look up spectrum by scan number (extracted from the native ID).

Size findByReference (const String &spectrum_ref) const
 Look up spectrum by reference.

void addReferenceFormat (const String &regexp)
 Register a possible format for a spectrum reference.
 

Static Public Member Functions

static Int extractScanNumber (const String &native_id, const boost::regex &scan_regexp, bool no_error=false)
 Extract the scan number from the native ID of a spectrum.

static Int extractScanNumber (const String &native_id, const String &native_id_type_accession)

static std::string getRegExFromNativeID (const String &id)
 Determine the RegEx string to extract scan/index number from native IDs. Can be used for extractScanNumber.

static bool isNativeID (const String &id)
 Simple prefix check if a spectrum identifier id is a nativeID from a vendor file.
 

Public Attributes

std::vector< boost::regex > reference_formats
 Possible formats of spectrum references, defined as regular expressions.

double rt_tolerance
 Tolerance for look-up by retention time.
 

Static Public Attributes

static const String & default_scan_regexp
 Default regular expression for extracting scan numbers from spectrum native IDs.
 

Protected Member Functions

void addEntry_ (Size index, double rt, Int scan_number, const String &native_id)
 Add a look-up entry for a spectrum.

Size findByRegExpMatch_ (const String &spectrum_ref, const String &regexp, const boost::smatch &match) const
 Look up spectrum by regular expression match.

void setScanRegExp_ (const String &scan_regexp)
 Set the regular expression for extracting scan numbers from spectrum native IDs.
 

Protected Attributes

Size n_spectra_
 Number of spectra.

boost::regex scan_regexp_
 Regular expression to extract scan numbers.

std::vector< String > regexp_name_list_
 Named groups in vector format.

std::map< double, Size > rts_
 Mapping: RT -> spectrum index.

std::map< String, Size > ids_
 Mapping: native ID -> spectrum index.

std::map< Size, Size > scans_
 Mapping: scan number -> spectrum index.
 

Static Protected Attributes

static const String & regexp_names_
 Named groups recognized in regular expression.
 

Private Member Functions

 SpectrumLookup (const SpectrumLookup &)
 Copy constructor (not implemented).

SpectrumLookup & operator= (const SpectrumLookup &)
 Assignment operator (not implemented).
 

Detailed Description

Helper class for looking up spectra based on different attributes.

This class provides functions for looking up spectra that are stored in a vector (e.g. MSExperiment::getSpectra()) by index, retention time, native ID, scan number (extracted from the native ID), or by a reference string containing any of the previous information ("spectrum reference").

Spectrum reference formats
Formats for spectrum references are defined by regular expressions that must contain certain fields (named groups, i.e. "(?<GROUP>...)") referring to usable information. The following named groups are recognized and can be used to look up spectra:
  • INDEX0: spectrum index, i.e. position in the vector of spectra, counting from zero
  • INDEX1: spectrum index, i.e. position in the vector of spectra, counting from one
  • ID: spectrum native ID
  • SCAN: scan number (extracted from the native ID)
  • RT: retention time
For example, if the format of a spectrum reference is "scan=123", where 123 is the scan number, the expression "scan=(?<SCAN>\\d+)" can be used to extract that number, allowing look-up of the corresponding spectrum.
Reference formats are registered via addReferenceFormat(). Several possible formats can be added and will be tried in order by the function findByReference().
See also
SpectrumMetaDataLookup

Constructor & Destructor Documentation

◆ SpectrumLookup() [1/2]

SpectrumLookup ( )

Constructor.

◆ ~SpectrumLookup()

virtual ~SpectrumLookup ( )
virtual

Destructor.

◆ SpectrumLookup() [2/2]

SpectrumLookup ( const SpectrumLookup & )
private

Copy constructor (not implemented).

Member Function Documentation

◆ addEntry_()

void addEntry_ ( Size  index,
double  rt,
Int  scan_number,
const String &  native_id 
)
protected

Add a look-up entry for a spectrum.

Parameters
index        Spectrum index (position in the vector)
rt           Retention time
scan_number  Scan number
native_id    Native ID

◆ addReferenceFormat()

void addReferenceFormat ( const String &  regexp )

Register a possible format for a spectrum reference.

Parameters
regexp  Regular expression defining the format
Exceptions
Exception::IllegalArgument  if regexp does not contain any of the recognized named groups

The regular expression defining the reference format must contain one or more of the recognized named groups defined in SpectrumLookup::regexp_names_.
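
The requirement above can be illustrated with a small standalone check. This is a sketch under assumptions, not the OpenMS implementation: hasRecognizedGroup is a hypothetical helper, the group names are the five listed in the detailed description, and a plain substring test stands in for whatever check the class actually performs. A caller would throw where this helper returns false.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of OpenMS): report whether the expression
// mentions at least one of the named groups from the detailed description.
bool hasRecognizedGroup(const std::string& regexp)
{
  static const std::vector<std::string> names = {"INDEX0", "INDEX1", "ID", "SCAN", "RT"};
  for (const std::string& name : names)
  {
    // A named group is written "(?<NAME>...)", so look for "?<NAME>".
    if (regexp.find("?<" + name + ">") != std::string::npos)
    {
      return true;
    }
  }
  return false;
}
```

With this sketch, "scan=(?<SCAN>\\d+)" passes, while "scan=(\\d+)" is rejected because its group is unnamed.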

◆ empty()

bool empty ( ) const

Check if any spectra were set.

◆ extractScanNumber() [1/2]

static Int extractScanNumber ( const String &  native_id,
const boost::regex &  scan_regexp,
bool  no_error = false 
)
static

Extract the scan number from the native ID of a spectrum.

Parameters
native_id    Spectrum native ID
scan_regexp  Regular expression to use (must contain the named group "?<SCAN>")
no_error     Suppress the exception on failure
Exceptions
Exception::ParseError  if the scan number could not be extracted (unless no_error is set)
Returns
Scan number of the spectrum (or -1 on failure to extract)

Referenced by TOPPFLASHDeconv::main_().
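
The two failure modes described above (exception versus a return value of -1) can be sketched with the standard library alone. Everything here is an assumption made for illustration: extractScanSketch is a hypothetical function, std::regex is used in place of boost::regex, capture group 1 takes the place of the named group SCAN (std::regex has no named groups), and std::runtime_error takes the place of Exception::ParseError.

```cpp
#include <cassert>
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the documented behaviour (not OpenMS code).
int extractScanSketch(const std::string& native_id, const std::regex& scan_regexp,
                      bool no_error = false)
{
  // The pattern must define at least one capture group to hold the number.
  assert(scan_regexp.mark_count() >= 1);
  std::smatch match;
  if (std::regex_search(native_id, match, scan_regexp) && match[1].matched)
  {
    return std::stoi(match[1].str());
  }
  if (no_error)
  {
    return -1; // failure is reported through the return value
  }
  throw std::runtime_error("no scan number in native ID '" + native_id + "'");
}
```

Passing no_error = true suits batch processing where some native IDs are expected to lack a scan number; the caller then has to test for -1.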

◆ extractScanNumber() [2/2]

static Int extractScanNumber ( const String &  native_id,
const String &  native_id_type_accession 
)
static

◆ findByIndex()

Size findByIndex ( Size  index,
bool  count_from_one = false 
) const

Look up spectrum by index (position in the vector of spectra).

Parameters
index           Index to look up
count_from_one  Do indexes start counting at one (default: zero)?
Exceptions
Exception::ElementNotFound  if no matching spectrum was found
Returns
Index of the spectrum that matched
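
The effect of count_from_one can be sketched as a simple bounds check. This is an illustration under assumptions rather than the OpenMS code: resolveIndex is a hypothetical function, the returned value is assumed to be the zero-based position in the vector, and std::out_of_range stands in for Exception::ElementNotFound.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical sketch (not OpenMS code): map a user-supplied index to a
// zero-based position among n_spectra spectra.
std::size_t resolveIndex(std::size_t index, std::size_t n_spectra, bool count_from_one = false)
{
  // With one-based counting, 0 is invalid and n_spectra is the last spectrum.
  const std::size_t first = count_from_one ? 1 : 0;
  if (index < first || index - first >= n_spectra)
  {
    throw std::out_of_range("no spectrum at the requested index");
  }
  return index - first;
}
```

Checking index < first before subtracting avoids unsigned wrap-around when a one-based look-up is called with 0.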

◆ findByNativeID()

Size findByNativeID ( const String &  native_id ) const

Look up spectrum by native ID.

Parameters
native_id  Native ID to look up
Exceptions
Exception::ElementNotFound  if no matching spectrum was found
Returns
Index of the spectrum that matched

◆ findByReference()

Size findByReference ( const String &  spectrum_ref ) const

Look up spectrum by reference.

Parameters
spectrum_ref  Spectrum reference to parse
Exceptions
Exception::ElementNotFound  if no matching spectrum was found
Exception::ParseError       if the reference could not be parsed (no reference format matched)
Returns
Index of the spectrum that matched

The regular expressions in SpectrumLookup::reference_formats are matched against the spectrum reference in order. The first one that matches is used to look up the spectrum.
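
The "first matching format wins" rule can be sketched as follows. This is a simplified illustration, not the OpenMS implementation: ReferenceFormat and parseReference are hypothetical names, each format carries an explicit field label because std::regex does not support named groups, std::runtime_error stands in for Exception::ParseError, and the use of a substring search rather than a full match is an assumption.

```cpp
#include <regex>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical format description: the label replaces the named group, and
// capture group 1 of the pattern holds the value.
struct ReferenceFormat
{
  std::string field; // e.g. "SCAN" or "INDEX0"
  std::regex pattern;
};

// Return the field label and captured text of the first format that matches.
std::pair<std::string, std::string> parseReference(const std::string& spectrum_ref,
                                                   const std::vector<ReferenceFormat>& formats)
{
  for (const ReferenceFormat& format : formats)
  {
    std::smatch match;
    if (std::regex_search(spectrum_ref, match, format.pattern) && match.size() > 1)
    {
      return {format.field, match[1].str()}; // later formats are not tried
    }
  }
  throw std::runtime_error("spectrum reference '" + spectrum_ref + "' matched no format");
}
```

Because order matters, specific formats should be registered before permissive ones; a catch-all such as "(\\d+)" placed first would shadow every format after it.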

◆ findByRegExpMatch_()

Size findByRegExpMatch_ ( const String &  spectrum_ref,
const String &  regexp,
const boost::smatch &  match 
) const
protected

Look up spectrum by regular expression match.

Parameters
spectrum_ref  Spectrum reference that was parsed
regexp        Regular expression used for parsing
match         Regular expression match
Exceptions
Exception::ElementNotFound  if no matching spectrum was found
Returns
Index of the spectrum that matched

◆ findByRT()

Size findByRT ( double  rt) const

Look up spectrum by retention time (RT).

Parameters
rt  Retention time to look up
Exceptions
Exception::ElementNotFound  if no matching spectrum was found
Returns
Index of the spectrum that matched

There is a tolerance for matching of RT values defined by SpectrumLookup::rt_tolerance. The spectrum with the closest match within that tolerance is returned (if any).
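
A closest-match search within a tolerance can be sketched on an ordered map like rts_. The function below is an illustration under assumptions, not the OpenMS code: closestByRT is a hypothetical name, the tolerance is assumed to be inclusive, ties go to the larger RT, and std::out_of_range stands in for Exception::ElementNotFound.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <map>
#include <stdexcept>

// Hypothetical sketch (not OpenMS code): index of the spectrum whose RT is
// closest to rt, accepted only if it lies within rt_tolerance.
std::size_t closestByRT(const std::map<double, std::size_t>& rts, double rt, double rt_tolerance)
{
  assert(rt_tolerance >= 0.0);
  // First entry with RT >= rt; the closest entry is this one or its predecessor.
  std::map<double, std::size_t>::const_iterator upper = rts.lower_bound(rt);
  std::map<double, std::size_t>::const_iterator best = upper;
  if (upper != rts.begin())
  {
    std::map<double, std::size_t>::const_iterator lower = std::prev(upper);
    if (best == rts.end() || std::fabs(lower->first - rt) < std::fabs(best->first - rt))
    {
      best = lower;
    }
  }
  if (best == rts.end() || std::fabs(best->first - rt) > rt_tolerance)
  {
    throw std::out_of_range("no spectrum within the RT tolerance");
  }
  return best->second;
}
```

Only the two neighbours of rt in the ordered map need to be compared, so the look-up stays logarithmic in the number of spectra.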

◆ findByScanNumber()

Size findByScanNumber ( Size  scan_number) const

Look up spectrum by scan number (extracted from the native ID).

Parameters
scan_number  Scan number to look up
Exceptions
Exception::ElementNotFound  if no matching spectrum was found
Returns
Index of the spectrum that matched

◆ getRegExFromNativeID()

static std::string getRegExFromNativeID ( const String &  id )
static

Determine the RegEx string to extract scan/index number from native IDs. Can be used for extractScanNumber.

Parameters
id  Native ID of a spectrum
Returns
RegEx string

◆ isNativeID()

static bool isNativeID ( const String &  id )
static

Simple prefix check if a spectrum identifier id is a nativeID from a vendor file.

◆ operator=()

SpectrumLookup& operator= ( const SpectrumLookup & )
private

Assignment operator (not implemented).

◆ readSpectra()

void readSpectra ( const SpectrumContainer &  spectra,
const String &  scan_regexp = default_scan_regexp 
)
inline

Read and index spectra for later look-up.

Template Parameters
SpectrumContainer  Spectrum container class; must support size() and operator[]
Parameters
spectra      Container of spectra
scan_regexp  Regular expression for matching scan numbers in spectrum native IDs (must contain the named group "?<SCAN>")
Exceptions
Exception::IllegalArgument  if scan_regexp does not contain "?<SCAN>" (and is not empty)

Spectra are indexed by retention time, native ID and scan number. In all cases it is expected that the value for each spectrum will be unique. Setting scan_regexp to the empty string ("") disables extraction of scan numbers; look-ups by scan number will fail in that case.
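
The indexing step can be pictured as filling three maps in one pass over the container. The sketch below is standalone and makes several assumptions: SpectrumStub, LookupTables and indexSpectra are hypothetical names, the default pattern "scan=(\\d+)" is a placeholder rather than the value of default_scan_regexp, capture group 1 replaces the named group SCAN (std::regex has no named groups), and duplicates are counted while the first entry is kept, which may differ from how the real class reacts.

```cpp
#include <cstddef>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical minimal spectrum type; real containers hold MSSpectrum objects.
struct SpectrumStub
{
  double rt;
  std::string native_id;
};

struct LookupTables
{
  std::map<double, std::size_t> rts;      // RT -> spectrum index
  std::map<std::string, std::size_t> ids; // native ID -> spectrum index
  std::map<long, std::size_t> scans;      // scan number -> spectrum index
  std::size_t duplicates = 0;             // keys that were already present
};

// Build the three indexes; an empty scan_regexp skips scan-number extraction.
template <typename SpectrumContainer>
LookupTables indexSpectra(const SpectrumContainer& spectra,
                          const std::string& scan_regexp = "scan=(\\d+)")
{
  LookupTables tables;
  const bool use_scans = !scan_regexp.empty();
  std::regex scan_re;
  if (use_scans) scan_re.assign(scan_regexp);

  for (std::size_t i = 0; i < spectra.size(); ++i)
  {
    const SpectrumStub& spectrum = spectra[i];
    if (!tables.rts.emplace(spectrum.rt, i).second) ++tables.duplicates;
    if (!tables.ids.emplace(spectrum.native_id, i).second) ++tables.duplicates;
    std::smatch match;
    if (use_scans && std::regex_search(spectrum.native_id, match, scan_re) && match.size() > 1)
    {
      const long scan = std::stol(match[1].str());
      if (!tables.scans.emplace(scan, i).second) ++tables.duplicates;
    }
  }
  return tables;
}
```

Calling indexSpectra(spectra, "") leaves the scan map empty, which mirrors the statement that look-ups by scan number fail when extraction is disabled.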

References SpectrumSettings::getNativeID(), MSSpectrum::getRT(), and OPENMS_LOG_WARN.

◆ setScanRegExp_()

void setScanRegExp_ ( const String &  scan_regexp )
protected

Set the regular expression for extracting scan numbers from spectrum native IDs.

Parameters
scan_regexp  Regular expression to use (must contain the named group "?<SCAN>")

Member Data Documentation

◆ default_scan_regexp

const String& default_scan_regexp
static

Default regular expression for extracting scan numbers from spectrum native IDs.

◆ ids_

std::map<String, Size> ids_
protected

Mapping: native ID -> spectrum index.

◆ n_spectra_

Size n_spectra_
protected

Number of spectra.

◆ reference_formats

std::vector<boost::regex> reference_formats

Possible formats of spectrum references, defined as regular expressions.

◆ regexp_name_list_

std::vector<String> regexp_name_list_
protected

Named groups in vector format.

◆ regexp_names_

const String& regexp_names_
static protected

Named groups recognized in regular expression.

◆ rt_tolerance

double rt_tolerance

Tolerance for look-up by retention time.

◆ rts_

std::map<double, Size> rts_
protected

Mapping: RT -> spectrum index.

◆ scan_regexp_

boost::regex scan_regexp_
protected

Regular expression to extract scan numbers.

◆ scans_

std::map<Size, Size> scans_
protected

Mapping: scan number -> spectrum index.