OpenMS
SpectrumNativeIDParser Class Reference

Parser for extracting scan numbers from spectrum native IDs.

#include <OpenMS/METADATA/SpectrumNativeIDParser.h>

Static Public Member Functions

static Int extractScanNumber (const String &native_id, const boost::regex &scan_regexp, bool no_error=false)
 Extract the scan number from the native ID of a spectrum using a regular expression.
 
static Int extractScanNumber (const String &native_id, const String &native_id_type_accession)
 Extract the scan number from the native ID using a CV accession.
 
static std::string getRegExFromNativeID (const String &native_id)
 Determine the regular expression to extract scan/index numbers from native IDs.
 
static bool isNativeID (const String &id)
 Check if a spectrum identifier is a native ID from a vendor file.
 

Detailed Description

Parser for extracting scan numbers from spectrum native IDs.

This class provides static functions for parsing native ID strings from various mass spectrometry file formats. Native IDs are vendor-specific identifiers that encode information such as scan numbers, file indices, and experiment numbers.

Supported Native ID Formats

The parser supports the following native ID formats based on CV (Controlled Vocabulary) accessions from the PSI-MS ontology:

CV Accession  Vendor/Format       Native ID Pattern                       Example
MS:1000768    Thermo              scan=NUMBER                             scan=42
MS:1000769    Waters              function=X process=Y scan=NUMBER        function=2 process=1 scan=100
MS:1000770    WIFF (AB Sciex)     sample=X period=Y cycle=Z experiment=W  sample=1 period=1 cycle=42 experiment=1
MS:1000771    Bruker/Agilent      scan=NUMBER                             scan=42
MS:1000772    Bruker BAF          scan=NUMBER                             scan=42
MS:1000773    Bruker FID          file=NUMBER                             file=42
MS:1000774    Index-based         index=NUMBER                            index=42 (returns 43)
MS:1000775    Single peak list    file=NUMBER                             file=42
MS:1000776    Thermo/Bruker TDF   scan=NUMBER                             scan=42
MS:1000777    Generic spectrum    spectrum=NUMBER                         spectrum=42
MS:1001508    Agilent MassHunter  scanId=NUMBER                           scanId=42
MS:1001530    Numeric-only        NUMBER                                  42

Usage Examples
// Extract scan number using CV accession
Int scan = SpectrumNativeIDParser::extractScanNumber("scan=42", "MS:1000768"); // returns 42
// Get regex pattern from native ID format
String regex = SpectrumNativeIDParser::getRegExFromNativeID("scan=123"); // returns "scan=(?<GROUP>\d+)"
// Check if string is a native ID
bool is_native = SpectrumNativeIDParser::isNativeID("scan=123"); // returns true
See also
SpectrumLookup

Member Function Documentation

◆ extractScanNumber() [1/2]

static Int extractScanNumber (const String &native_id, const boost::regex &scan_regexp, bool no_error=false)

Extract the scan number from the native ID of a spectrum using a regular expression.

Parameters
[in] native_id    Spectrum native ID string
[in] scan_regexp  Regular expression to use (must contain the named group "?<SCAN>")
[in] no_error     Suppress the exception on failure and return -1 instead
Exceptions
Exception::ParseError  if the scan number could not be extracted (unless no_error is set)
Returns
Scan number of the spectrum (or -1 on failure to extract)
Note
The regular expression must contain a capture group, and the last matching subgroup is used as the scan number.
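
The behaviour described above (take the last matching subgroup, then either throw or return -1) can be illustrated with a small standalone sketch. It is not the OpenMS implementation: the function name extractScanNumberSketch is hypothetical, std::regex stands in for boost::regex, and std::runtime_error stands in for Exception::ParseError. Because std::regex has no named groups, the sketch assumes the pattern's first capture group holds the number.

```cpp
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the regex-based overload (not OpenMS code).
// Assumption: capture group 1 of scan_regexp holds the scan number.
int extractScanNumberSketch(const std::string& native_id,
                            const std::regex& scan_regexp,
                            bool no_error = false)
{
  // Visit every match of capture group 1 and remember the last one.
  std::sregex_token_iterator it(native_id.begin(), native_id.end(), scan_regexp, 1);
  const std::sregex_token_iterator end;
  std::string last;
  for (; it != end; ++it)
  {
    last = it->str();
  }

  if (!last.empty())
  {
    try
    {
      return std::stoi(last);
    }
    catch (const std::exception&)
    {
      // Not a usable integer: fall through to the failure handling below.
    }
  }

  if (no_error)
  {
    return -1; // failure reported through the return value
  }
  // Stand-in for Exception::ParseError.
  throw std::runtime_error("could not extract scan number from '" + native_id + "'");
}
```

With this sketch, a call such as extractScanNumberSketch("index=5", std::regex("scan=(\\d+)"), true) reports failure through the return value instead of an exception.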

◆ extractScanNumber() [2/2]

static Int extractScanNumber (const String &native_id, const String &native_id_type_accession)

Extract the scan number from the native ID using a CV accession.

Parameters
[in] native_id                 Spectrum native ID string
[in] native_id_type_accession  CV accession specifying the native ID format (e.g., "MS:1000768" for Thermo, "MS:1000770" for WIFF)
Returns
Scan number of the spectrum (or -1 on failure to extract)
Note
For WIFF files (MS:1000770), the return value is computed as cycle * 1000 + experiment.
For index-based native IDs (MS:1000774), the return value is index + 1 for pepXML compatibility.
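
The two special cases in the note can be worked through with a short standalone sketch. The helper names wiffScanNumberSketch and indexScanNumberSketch are hypothetical and the parsing uses std::regex for illustration only; how OpenMS parses these strings internally is not shown in this reference. Only the arithmetic (cycle * 1000 + experiment, and index + 1) is taken from the note.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: WIFF-style native ID -> cycle * 1000 + experiment.
// Returns -1 if the expected fields are missing.
int wiffScanNumberSketch(const std::string& native_id)
{
  static const std::regex re("cycle=(\\d+).*experiment=(\\d+)");
  std::smatch m;
  if (!std::regex_search(native_id, m, re))
  {
    return -1;
  }
  return std::stoi(m[1].str()) * 1000 + std::stoi(m[2].str());
}

// Hypothetical helper: index-based native ID -> index + 1.
// Returns -1 if no index field is present.
int indexScanNumberSketch(const std::string& native_id)
{
  static const std::regex re("index=(\\d+)");
  std::smatch m;
  if (!std::regex_search(native_id, m, re))
  {
    return -1;
  }
  return std::stoi(m[1].str()) + 1;
}
```

For the table's WIFF example, "sample=1 period=1 cycle=42 experiment=1", the arithmetic gives 42 * 1000 + 1 = 42001, and "index=42" gives 43.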

◆ getRegExFromNativeID()

static std::string getRegExFromNativeID (const String &native_id)

Determine the regular expression to extract scan/index numbers from native IDs.

Parameters
[in] native_id  A native ID string to analyze
Returns
Regular expression string with the named group (?<GROUP>\d+) that matches the scan or index number in the native ID

This function examines the prefix of the native ID to determine the appropriate regular expression pattern:

  • scan=, controllerType=, function= → scan=(?<GROUP>\d+)
  • index= → index=(?<GROUP>\d+)
  • scanId=, scanID= → scanId=(?<GROUP>\d+) or scanID=(?<GROUP>\d+)
  • spectrum= → spectrum=(?<GROUP>\d+)
  • file= → file=(?<GROUP>\d+)
  • Plain number → (?<GROUP>\d+)
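
The prefix-to-pattern mapping listed above can be sketched as a simple dispatch. This is an illustration rather than the OpenMS source: the function name regexForNativeIdSketch and the helper startsWithSketch are hypothetical, and returning an empty string for an unrecognized input is an assumption, since the documentation does not say what happens in that case.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: true if s begins with prefix.
static bool startsWithSketch(const std::string& s, const std::string& prefix)
{
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical dispatch mirroring the documented prefix list (not OpenMS code).
std::string regexForNativeIdSketch(const std::string& native_id)
{
  const std::string group = "(?<GROUP>\\d+)";

  if (startsWithSketch(native_id, "scan=") ||
      startsWithSketch(native_id, "controllerType=") ||
      startsWithSketch(native_id, "function="))
  {
    return "scan=" + group;
  }
  if (startsWithSketch(native_id, "index=")) return "index=" + group;
  if (startsWithSketch(native_id, "scanId=")) return "scanId=" + group;
  if (startsWithSketch(native_id, "scanID=")) return "scanID=" + group;
  if (startsWithSketch(native_id, "spectrum=")) return "spectrum=" + group;
  if (startsWithSketch(native_id, "file=")) return "file=" + group;

  // Plain number: non-empty and digits only.
  const bool all_digits = !native_id.empty() &&
    std::all_of(native_id.begin(), native_id.end(),
                [](unsigned char c) { return std::isdigit(c) != 0; });
  if (all_digits) return group;

  return ""; // assumption: behaviour for unrecognized input is not documented
}
```

The sketch only builds the pattern strings. The named-group syntax (?<GROUP>...) is understood by boost::regex but not by std::regex, so the returned patterns would need a Boost regex object to be compiled.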

◆ isNativeID()

static bool isNativeID (const String &id)

Check if a spectrum identifier is a native ID from a vendor file.

Parameters
[in] id  Spectrum identifier string to check
Returns
True if the string matches a known native ID prefix pattern

Recognized prefixes: scan=, scanId=, scanID=, controllerType=, function=, sample=, index=, spectrum=, file=
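
A prefix check over the list above could look like the following minimal sketch. The name isNativeIdSketch is hypothetical and the code only mirrors the documented prefix list; it makes no claim about additional cases the real function may handle.

```cpp
#include <array>
#include <string>

// Hypothetical prefix check based on the documented list (not OpenMS code).
bool isNativeIdSketch(const std::string& id)
{
  static const std::array<const char*, 9> prefixes = {
    "scan=", "scanId=", "scanID=", "controllerType=", "function=",
    "sample=", "index=", "spectrum=", "file="
  };
  for (const char* prefix : prefixes)
  {
    // rfind with position 0 only succeeds if id starts with prefix.
    if (id.rfind(prefix, 0) == 0)
    {
      return true;
    }
  }
  return false;
}
```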