OpenMS
OnDiscMSExperiment Class Reference

Representation of a mass spectrometry experiment on disk. More...

#include <OpenMS/KERNEL/OnDiscMSExperiment.h>

Public Member Functions

 OnDiscMSExperiment ()=default
 Constructor.
 
bool openFile (const String &filename, bool skipMetaData=false)
 Open a specific file on disk.
 
 OnDiscMSExperiment (const OnDiscMSExperiment &source)
 Copy constructor.
 
bool operator== (const OnDiscMSExperiment &rhs) const
 Equality operator.
 
bool operator!= (const OnDiscMSExperiment &rhs) const
 Inequality operator.
 
bool isSortedByRT () const
 Checks if all spectra are sorted with respect to ascending RT.
 
Size size () const
 alias for getNrSpectra
 
bool empty () const
 returns whether spectra are empty
 
Size getNrSpectra () const
 get the total number of spectra available
 
Size getNrChromatograms () const
 get the total number of chromatograms available
 
std::shared_ptr< const ExperimentalSettings > getExperimentalSettings () const
 returns the meta information of this experiment (const access)
 
std::shared_ptr< PeakMap > getMetaData () const
 
MSSpectrum operator[] (Size n)
 alias for getSpectrum
 
MSSpectrum getSpectrum (Size id)
 returns a single spectrum
 
OpenMS::Interfaces::SpectrumPtr getSpectrumById (Size id)
 returns a single spectrum (without applying PeakFileOptions filters)
 
MSChromatogram getChromatogram (Size id)
 returns a single chromatogram
 
MSChromatogram getChromatogramByNativeId (const std::string &id)
 returns a single chromatogram
 
MSSpectrum getSpectrumByNativeId (const std::string &id)
 returns a single spectrum
 
OpenMS::Interfaces::ChromatogramPtr getChromatogramById (Size id)
 returns a single chromatogram
 
void setSkipXMLChecks (bool skip)
 sets whether to skip some XML checks and be fast instead
 
PeakFileOptions & getOptions ()
 Mutable access to the options for loading/storing.
 
const PeakFileOptions & getOptions () const
 Non-mutable access to the options for loading/storing.
 
void setOptions (const PeakFileOptions &options)
 set options for loading/storing
 

Protected Attributes

String filename_
 The filename of the underlying data file.
 
Internal::IndexedMzMLHandler indexed_mzml_file_
 The index of the underlying data file.
 
std::shared_ptr< PeakMap > meta_ms_experiment_
 The meta-data.
 
std::unordered_map< std::string, Size > chromatograms_native_ids_
 Mapping of chromatogram native ids to offsets.
 
std::unordered_map< std::string, Size > spectra_native_ids_
 Mapping of spectra native ids to offsets.
 
PeakFileOptions options_
 Options for loading / storing.
 

Private Types

typedef ChromatogramPeak ChromatogramPeakT
 
typedef Peak1D PeakT
 

Private Member Functions

OnDiscMSExperiment & operator= (const OnDiscMSExperiment &)
 Private Assignment operator -> we cannot copy file streams in IndexedMzMLHandler.
 
void loadMetaData_ (const String &filename)
 
MSChromatogram getMetaChromatogramById_ (const std::string &id)
 
MSSpectrum getMetaSpectrumById_ (const std::string &id)
 

Detailed Description

Representation of a mass spectrometry experiment on disk.

This class allows random access to spectra and chromatograms in indexed mzML files without loading the entire file into memory.

Filtering with PeakFileOptions

PeakFileOptions can be used to filter data when retrieving spectra/chromatograms:

  • RT range, MS level, and precursor m/z range filters: Checked BEFORE loading peak data (skips I/O)
  • m/z range and intensity filters: Applied AFTER loading peak data
Note
Unlike in-memory loading (FileHandler), where filtered spectra are completely removed from the container, OnDiscMSExperiment preserves all indices. When a spectrum doesn't pass RT range or MS level filters, getSpectrum() returns the spectrum with metadata but no peaks (I/O is skipped). This preserves the index mapping to the file.
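
The two filter stages and the index-preserving behavior described above can be modeled in a few lines. This is only a stand-alone sketch: `Peak`, `Meta`, `Spectrum`, `Filters`, `passesMetaFilters` and `fetch` are hypothetical names invented for illustration, not OpenMS types, and the half-open range check is an assumption of the sketch rather than a statement about the library.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-ins for this sketch; these are NOT OpenMS types.
struct Peak { double mz; double intensity; };
struct Meta { double rt; int ms_level; };
struct Spectrum { Meta meta; std::vector<Peak> peaks; };

struct Filters
{
  bool has_rt_range = false;
  double rt_min = 0.0, rt_max = 0.0;  // treated as [rt_min, rt_max) in this sketch
  std::vector<int> ms_levels;         // empty means "accept every MS level"
  bool has_mz_range = false;
  double mz_min = 0.0, mz_max = 0.0;  // treated as [mz_min, mz_max) in this sketch
};

// Stage 1: decide from metadata alone, before any peak data is read.
bool passesMetaFilters(const Meta& m, const Filters& f)
{
  if (f.has_rt_range && (m.rt < f.rt_min || m.rt >= f.rt_max)) return false;
  if (!f.ms_levels.empty())
  {
    bool found = false;
    for (int level : f.ms_levels) found = found || (level == m.ms_level);
    if (!found) return false;
  }
  return true;
}

// Returns the spectrum at `index`; `load_peaks` models the expensive disk read.
Spectrum fetch(std::size_t index, const std::vector<Meta>& metas, const Filters& f,
               const std::function<std::vector<Peak>(std::size_t)>& load_peaks)
{
  Spectrum s;
  s.meta = metas[index];
  if (!passesMetaFilters(s.meta, f)) return s;  // metadata only, no I/O performed

  // Stage 2: peak-level filters run on the data after it has been loaded.
  for (const Peak& p : load_peaks(index))
  {
    if (f.has_mz_range && (p.mz < f.mz_min || p.mz >= f.mz_max)) continue;
    s.peaks.push_back(p);
  }
  return s;
}
```

Because a rejected spectrum still comes back at its original index with empty peaks, callers can keep iterating over `0 .. size()-1` and skip empty results.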

Example:

OnDiscMSExperiment exp;
exp.openFile("data.mzML"); // returns false if the file could not be parsed
exp.getOptions().setMSLevels({2}); // Only want MS2
exp.getOptions().setRTRange(DRange<1>(100, 200));
for (Size i = 0; i < exp.size(); ++i)
{
  MSSpectrum s = exp.getSpectrum(i);
  if (s.empty()) continue; // Filtered out, no I/O was performed
  // Process MS2 spectrum in RT range...
}
Note
This implementation is not thread-safe: it keeps a single internal file access pointer, which it moves whenever a specific data item is accessed. Provide a separate copy to each thread, e.g.
#pragma omp parallel for firstprivate(ondisc_map)
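
To illustrate why a per-thread copy is needed, the sketch below models a reader that owns a single cursor and uses the same `firstprivate` clause as above. `CursorReader` and `sumAll` are hypothetical names made up for this illustration; they do not exist in OpenMS, and the in-memory vector merely stands in for a file.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reader with one internal cursor; NOT an OpenMS type.
class CursorReader
{
public:
  explicit CursorReader(const std::vector<double>& data) : data_(&data), cursor_(0) {}

  double read(std::size_t i)
  {
    cursor_ = i;               // "seek": mutates state, so sharing one object across threads would race
    return (*data_)[cursor_];  // "read" at the current position
  }

private:
  const std::vector<double>* data_;
  std::size_t cursor_;
};

// Sums all items; the reader is taken by value and copied again per thread.
double sumAll(CursorReader reader, long n)
{
  double total = 0.0;
  // firstprivate gives every thread its own copy-constructed reader.
  // If OpenMP is not enabled, the pragma is ignored and the loop runs serially.
  #pragma omp parallel for firstprivate(reader) reduction(+:total)
  for (long i = 0; i < n; ++i)
  {
    total += reader.read(static_cast<std::size_t>(i));
  }
  return total;
}
```

Each thread moves only its own cursor, so no seek can interfere with another thread's read.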

Member Typedef Documentation

◆ ChromatogramPeakT

typedef ChromatogramPeak ChromatogramPeakT
private

◆ PeakT

typedef Peak1D PeakT
private

Constructor & Destructor Documentation

◆ OnDiscMSExperiment() [1/2]

OnDiscMSExperiment ( )
default

Constructor.

This initializes the object; use openFile to open a file.

◆ OnDiscMSExperiment() [2/2]

OnDiscMSExperiment ( const OnDiscMSExperiment & source)
inline

Copy constructor.

Member Function Documentation

◆ empty()

bool empty ( ) const
inline

returns whether spectra are empty

◆ getChromatogram()

MSChromatogram getChromatogram ( Size  id)

returns a single chromatogram

If PeakFileOptions has RT or intensity range set, the chromatogram will be filtered accordingly.

Parameters
[in]  id  The index of the chromatogram

◆ getChromatogramById()

OpenMS::Interfaces::ChromatogramPtr getChromatogramById ( Size  id)

returns a single chromatogram

◆ getChromatogramByNativeId()

MSChromatogram getChromatogramByNativeId ( const std::string &  id)

returns a single chromatogram

Parameters
[in]  id  The native identifier of the chromatogram

◆ getExperimentalSettings()

std::shared_ptr< const ExperimentalSettings > getExperimentalSettings ( ) const
inline

returns the meta information of this experiment (const access)

◆ getMetaChromatogramById_()

MSChromatogram getMetaChromatogramById_ ( const std::string &  id)
private

◆ getMetaData()

std::shared_ptr< PeakMap > getMetaData ( ) const
inline

◆ getMetaSpectrumById_()

MSSpectrum getMetaSpectrumById_ ( const std::string &  id)
private

◆ getNrChromatograms()

Size getNrChromatograms ( ) const
inline

get the total number of chromatograms available

◆ getNrSpectra()

Size getNrSpectra ( ) const
inline

get the total number of spectra available

◆ getOptions() [1/2]

PeakFileOptions & getOptions ( )

Mutable access to the options for loading/storing.

◆ getOptions() [2/2]

const PeakFileOptions & getOptions ( ) const

Non-mutable access to the options for loading/storing.

◆ getSpectrum()

MSSpectrum getSpectrum ( Size  id)

returns a single spectrum

If PeakFileOptions has m/z or intensity range set, the spectrum will be filtered accordingly.

Parameters
[in]  id  The index of the spectrum

◆ getSpectrumById()

OpenMS::Interfaces::SpectrumPtr getSpectrumById ( Size  id)
inline

returns a single spectrum (without applying PeakFileOptions filters)

◆ getSpectrumByNativeId()

MSSpectrum getSpectrumByNativeId ( const std::string &  id)

returns a single spectrum

Parameters
[in]  id  The native identifier of the spectrum
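
A lookup by native identifier can be pictured as a hash map from the native id string to a position, similar in shape to the `std::unordered_map< std::string, Size >` members of this class. The sketch below is a stand-alone illustration: `NativeIdIndex`, `buildNativeIdIndex` and `positionOf` are hypothetical names, and throwing `std::out_of_range` for an unknown id is a choice made for this sketch, not documented behavior of OnDiscMSExperiment.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical alias for this sketch; mirrors the shape of the native id maps.
typedef std::unordered_map<std::string, std::size_t> NativeIdIndex;

// Builds the map once from the native ids in file order.
NativeIdIndex buildNativeIdIndex(const std::vector<std::string>& native_ids)
{
  NativeIdIndex index;
  for (std::size_t i = 0; i < native_ids.size(); ++i)
  {
    index.emplace(native_ids[i], i);  // on duplicate ids the first occurrence is kept
  }
  return index;
}

// Resolves a native id to its position; the error handling is an assumption of this sketch.
std::size_t positionOf(const NativeIdIndex& index, const std::string& native_id)
{
  NativeIdIndex::const_iterator it = index.find(native_id);
  if (it == index.end())
  {
    throw std::out_of_range("unknown native id: " + native_id);
  }
  return it->second;
}
```

The resolved position could then be passed to an index-based accessor such as getSpectrum.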

◆ isSortedByRT()

bool isSortedByRT ( ) const
inline

Checks if all spectra are sorted with respect to ascending RT.

Note that only the RT order of the spectra is checked. Whether the peaks within each spectrum are sorted cannot be checked without loading all spectra.

◆ loadMetaData_()

void loadMetaData_ ( const String & filename)
private

◆ openFile()

bool openFile ( const String & filename,
bool  skipMetaData = false 
)

Open a specific file on disk.

This tries to read the indexed mzML by parsing the index and then reading the meta information into memory.

Returns
Whether the parsing of the file was successful (if false, the file most likely was not an indexed mzML file)

◆ operator!=()

bool operator!= ( const OnDiscMSExperiment & rhs) const
inline

Inequality operator.

References OpenMS::Internal::operator==().

◆ operator=()

OnDiscMSExperiment & operator= ( const OnDiscMSExperiment & )
private

Private Assignment operator -> we cannot copy file streams in IndexedMzMLHandler.

◆ operator==()

bool operator== ( const OnDiscMSExperiment & rhs) const
inline

Equality operator.

This only checks whether the underlying file is the same and the parsed meta-information is the same. Note that the file reader (e.g. the std::ifstream of the file) might be in a different state.

References OnDiscMSExperiment::filename_, and OnDiscMSExperiment::meta_ms_experiment_.
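
The comparison semantics described here (same file and same meta-information, with reader state ignored) might look roughly like the following. `MetaInfo` and `DiskExperiment` are hypothetical stand-ins, and comparing the pointed-to metadata rather than the pointers is an assumption of this sketch, not a confirmed detail of the OpenMS implementation.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical metadata type for this sketch; NOT an OpenMS type.
struct MetaInfo
{
  std::size_t spectrum_count;
  bool operator==(const MetaInfo& rhs) const { return spectrum_count == rhs.spectrum_count; }
};

// Hypothetical on-disk experiment holding a filename, shared metadata and reader state.
struct DiskExperiment
{
  std::string filename;
  std::shared_ptr<MetaInfo> meta;
  std::size_t read_position;  // reader state, deliberately ignored by operator==

  bool operator==(const DiskExperiment& rhs) const
  {
    if (filename != rhs.filename) return false;
    if (!meta || !rhs.meta) return meta == rhs.meta;  // equal only if both are empty
    return *meta == *rhs.meta;                        // compare content, not pointers
  }

  bool operator!=(const DiskExperiment& rhs) const { return !(*this == rhs); }
};
```

Two objects opened on the same file therefore compare equal even when their readers currently sit at different positions.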

◆ operator[]()

MSSpectrum operator[] ( Size  n)
inline

alias for getSpectrum

◆ setOptions()

void setOptions ( const PeakFileOptions & options)

set options for loading/storing

◆ setSkipXMLChecks()

void setSkipXMLChecks ( bool  skip)

sets whether to skip some XML checks and be fast instead

◆ size()

Size size ( ) const
inline

alias for getNrSpectra

Member Data Documentation

◆ chromatograms_native_ids_

std::unordered_map< std::string, Size > chromatograms_native_ids_
protected

Mapping of chromatogram native ids to offsets.

◆ filename_

String filename_
protected

The filename of the underlying data file.

Referenced by OnDiscMSExperiment::operator==().

◆ indexed_mzml_file_

Internal::IndexedMzMLHandler indexed_mzml_file_
protected

The index of the underlying data file.

◆ meta_ms_experiment_

std::shared_ptr<PeakMap> meta_ms_experiment_
protected

The meta-data.

Referenced by OnDiscMSExperiment::operator==().

◆ options_

PeakFileOptions options_
protected

Options for loading / storing.

◆ spectra_native_ids_

std::unordered_map< std::string, Size > spectra_native_ids_
protected

Mapping of spectra native ids to offsets.