OpenMS
FIAMSDataProcessor Class Reference

Data processing pipeline for one Flow Injection Analysis MS (FIA-MS) sample.

#include <OpenMS/ANALYSIS/ID/FIAMSDataProcessor.h>

Inherits DefaultParamHandler.

Public Member Functions

 FIAMSDataProcessor ()
 Construct with the FIA-MS default parameters.
 
 ~FIAMSDataProcessor () override=default
 Default destructor.
 
 FIAMSDataProcessor (const FIAMSDataProcessor &cp)=default
 Copy constructor.
 
FIAMSDataProcessor & operator= (const FIAMSDataProcessor &fdp)=default
 Copy assignment.
 
bool run (const MSExperiment &experiment, const float n_seconds, OpenMS::MzTab &output, const bool load_cached_spectrum=true)
 Run the full FIA-MS pipeline for one time window and (re)compute the accurate-mass-search result.
 
void cutForTime (const MSExperiment &experiment, const float n_seconds, std::vector< MSSpectrum > &output)
 Append spectra of experiment whose retention time is strictly less than n_seconds to output.
 
MSSpectrum mergeAlongTime (const std::vector< OpenMS::MSSpectrum > &input)
 Sum input across the time axis into a single spectrum using a per-band sliding bin size.
 
MSSpectrum extractPeaks (const MSSpectrum &input)
 Smooth input with the configured SavitzkyGolayFilter and pick peaks via PeakPickerHiRes.
 
FeatureMap convertToFeatureMap (const MSSpectrum &input)
 Wrap input's peaks as Feature objects in a new FeatureMap, tagging the "scan_polarity" meta value.
 
MSSpectrum trackNoise (const MSSpectrum &input)
 Build a parallel "noise level" spectrum: per-peak m/z carried over, intensity replaced by the local noise estimate.
 
void runAccurateMassSearch (FeatureMap &input, OpenMS::MzTab &output)
 Run AccurateMassSearchEngine on input and write the matches into output.
 
const std::vector< float > & getMZs ()
 Return the band centres used by mergeAlongTime.
 
const std::vector< float > & getBinSizes ()
 Return the per-band sliding bin sizes used by mergeAlongTime.
 
- Public Member Functions inherited from DefaultParamHandler
 DefaultParamHandler (const std::string &name)
 Constructor with name that is displayed in error messages.
 
 DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor.
 
virtual ~DefaultParamHandler ()
 Destructor.
 
DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator.
 
virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator.
 
void setParameters (const Param &param)
 Sets the parameters.
 
const Param & getParameters () const
 Non-mutable access to the parameters.
 
const Param & getDefaults () const
 Non-mutable access to the default parameters.
 
const std::string & getName () const
 Non-mutable access to the name.
 
void setName (const std::string &name)
 Mutable access to the name.
 
const std::vector< std::string > & getSubsections () const
 Non-mutable access to the registered subsections.
 

Protected Member Functions

void updateMembers_ () override
 Recompute the per-band mzs_ / bin_sizes_ caches and push the sgf:* parameters into the SavitzkyGolayFilter member.
 
- Protected Member Functions inherited from DefaultParamHandler
void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor.
 

Private Member Functions

void storeSpectrum_ (const MSSpectrum &input, const std::string &filename)
 Write input to filename as a single-spectrum mzML file via FileHandler.
 

Private Attributes

std::vector< float > mzs_
 Per-band m/z centres consumed by mergeAlongTime; populated by updateMembers_.
 
std::vector< float > bin_sizes_
 Per-band sliding bin sizes parallel to mzs_; populated by updateMembers_.
 
SavitzkyGolayFilter sgfilter_
 Smoothing filter used by extractPeaks; configured from sgf:* parameters.
 
PeakPickerHiRes picker_
 Peak picker used by extractPeaks; configured from the picker's own parameter section.
 

Additional Inherited Members

- Static Public Member Functions inherited from DefaultParamHandler
static void writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const std::string &key_prefix="")
 Writes all parameters to meta values.
 
- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters.
 
Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes!
 
std::vector< std::string > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes!
 
std::string error_name_
 Name that is displayed in error messages during the parameter checking.
 
bool check_defaults_
 If this member is set to false, no parameter checking is done.
 
bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when defaults are empty.
 

Detailed Description

Data processing pipeline for one Flow Injection Analysis MS (FIA-MS) sample.

FIA-MS omits chromatographic separation; the entire sample is delivered in a short time window. This class consumes one MSExperiment together with a target time cut-off and performs:

  1. truncate the experiment at n_seconds (drops spectra at RT >= n_seconds);
  2. sum the surviving spectra along the time axis with a per-band sliding bin size bin_size[i] = mzs_[i] / (resolution * 4);
  3. smooth the summed spectrum with SavitzkyGolayFilter;
  4. pick peaks with PeakPickerHiRes;
  5. run AccurateMassSearchEngine against the configured database / adduct list and write the result into the caller's MzTab object.

A cached picked spectrum is reused on subsequent runs of the same sample when load_cached_spectrum is true, skipping the cut/merge/pick steps.

Batches of FIA-MS samples are driven by FIAMSScheduler which loops over CSV rows and calls run once per ;-separated time window per row.

The workflow is inspired by Fuhrer et al. (https://pubs.acs.org/doi/10.1021/ac201267k); this is not an exact reimplementation.

Configuration is via the DefaultParamHandler parameter sections "" (root), "db:", "sgf:", and "sne:" — see the constructor for the full default map.

Constructor & Destructor Documentation

◆ FIAMSDataProcessor() [1/2]

Construct with the FIA-MS default parameters.

Registers the following parameters with their defaults:

  • filename = "fiams"
  • dir_output = ""
  • resolution = 120000
  • polarity in {positive, negative} (default positive)
  • max_mz = 1500
  • bin_step = 20
  • db:mapping = ["CHEMISTRY/HMDBMappingFile.tsv"]
  • db:struct = ["CHEMISTRY/HMDB2StructMapping.tsv"]
  • positive_adducts = "CHEMISTRY/PositiveAdducts.tsv" (advanced)
  • negative_adducts = "CHEMISTRY/NegativeAdducts.tsv" (advanced)
  • store_progress in {true, false} (default true)
  • sgf:frame_length = 11
  • sgf:polynomial_order = 4
  • sne:window = 10

◆ ~FIAMSDataProcessor()

~FIAMSDataProcessor ( )
override default

Default destructor.

◆ FIAMSDataProcessor() [2/2]

FIAMSDataProcessor ( const FIAMSDataProcessor & cp)
default

Copy constructor.

Member Function Documentation

◆ convertToFeatureMap()

FeatureMap convertToFeatureMap ( const MSSpectrum & input)

Wrap input's peaks as Feature objects in a new FeatureMap, tagging the "scan_polarity" meta value.

For every peak in input, emits one Feature with the peak's m/z and intensity and a "scan_polarity" meta value taken from the polarity parameter ("positive" or "negative"). The returned map is unsorted.

Parameters
[in] input  Picked spectrum.
Returns
FeatureMap with one feature per input peak; scan_polarity meta value set on every feature.
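
The wrapping step can be pictured with plain structs. This is a minimal standalone sketch, not OpenMS code: PeakStub, FeatureStub, and toFeatures are hypothetical stand-ins for the peak type, Feature, and convertToFeatureMap, and the meta value is modelled as a string map.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for a picked peak and a Feature.
struct PeakStub { double mz; float intensity; };
struct FeatureStub
{
  double mz;
  float intensity;
  std::map<std::string, std::string> meta;
};

// One feature per peak, in input order; every feature carries the polarity tag.
std::vector<FeatureStub> toFeatures(const std::vector<PeakStub>& peaks, const std::string& polarity)
{
  std::vector<FeatureStub> features;
  features.reserve(peaks.size());
  for (const PeakStub& p : peaks)
  {
    FeatureStub f;
    f.mz = p.mz;
    f.intensity = p.intensity;
    f.meta["scan_polarity"] = polarity; // "positive" or "negative"
    features.push_back(f);
  }
  return features;
}
```

No sorting is applied, which matches the note that the returned map is unsorted.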

◆ cutForTime()

void cutForTime ( const MSExperiment & experiment,
const float  n_seconds,
std::vector< MSSpectrum > &  output 
)

Append spectra of experiment whose retention time is strictly less than n_seconds to output.

output is appended to (existing entries are preserved). The RT comparison is strict (<), so spectra at exactly n_seconds are dropped.

Parameters
[in] experiment  Source experiment.
[in] n_seconds  Upper retention-time bound (exclusive).
[in,out] output  Receives the retained spectra.
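
The append-and-strict-bound behaviour can be illustrated without OpenMS. In this sketch SpectrumStub and cutForTimeSketch are hypothetical names; only the retention time is modelled.

```cpp
#include <vector>

// Hypothetical stand-in for a spectrum: only the retention time matters here.
struct SpectrumStub { float rt; };

// Appends every spectrum with rt < n_seconds to output.
// Entries already in output are left untouched.
void cutForTimeSketch(const std::vector<SpectrumStub>& experiment,
                      float n_seconds,
                      std::vector<SpectrumStub>& output)
{
  for (const SpectrumStub& s : experiment)
  {
    if (s.rt < n_seconds) // strict: a spectrum at exactly n_seconds is dropped
    {
      output.push_back(s);
    }
  }
}
```

With retention times 10, 30, and 45 and n_seconds = 30, only the spectrum at 10 is appended. Callers who reuse one output vector across calls would need to clear it themselves.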

◆ extractPeaks()

MSSpectrum extractPeaks ( const MSSpectrum & input)

Smooth input with the configured SavitzkyGolayFilter and pick peaks via PeakPickerHiRes.

The filter and picker are class members configured through sgf:* and the picker's own parameter section; their state is set up by the constructor and refreshed by updateMembers_ when sgf:* parameters change. input is not modified — a local copy is filtered in place before picking.

Parameters
[in] input  Spectrum to smooth and pick (typically the output of mergeAlongTime).
Returns
Spectrum of picked peaks.

◆ getBinSizes()

const std::vector< float > & getBinSizes ( )

Return the per-band sliding bin sizes used by mergeAlongTime.

Each bin size is computed as bin_size[i] = mzs_[i] / (resolution * 4).

Returns
Const reference to the vector of sliding bin sizes, parallel to getMZs.
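
As a worked instance of the formula, the sketch below evaluates bin_size[i] = mzs_[i] / (resolution * 4) for arbitrary band values. binSizesFor is a hypothetical helper, not part of OpenMS, and the input values are chosen only for illustration.

```cpp
#include <cassert>
#include <vector>

// Evaluates bin_size[i] = mzs[i] / (resolution * 4) for each band value.
// Hypothetical helper for illustration only.
std::vector<float> binSizesFor(const std::vector<float>& mzs, float resolution)
{
  assert(resolution > 0.0f);
  std::vector<float> sizes;
  sizes.reserve(mzs.size());
  for (float mz : mzs)
  {
    sizes.push_back(mz / (resolution * 4.0f));
  }
  return sizes;
}
```

At the default resolution of 120000, a band value of 480 gives 480 / 480000 = 0.001, so the bin size grows linearly with m/z.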

◆ getMZs()

const std::vector< float > & getMZs ( )

Return the band centres used by mergeAlongTime.

The list contains {bin_step, 2*bin_step, ..., (n-1)*bin_step} where n = max_mz / bin_step.

Returns
Const reference to the vector of band centres. Refreshed by updateMembers_ when parameters change.

◆ mergeAlongTime()

MSSpectrum mergeAlongTime ( const std::vector< OpenMS::MSSpectrum > &  input)

Sum input across the time axis into a single spectrum using a per-band sliding bin size.

For each consecutive pair (mzs_[i], mzs_[i+1]), calls SpectrumAddition::addUpSpectra(input, bin_sizes_[i], false) and keeps only the peaks whose m/z falls in [mzs_[i], mzs_[i+1]). Iteration stops one short of the end (i < mzs_.size() - 1), so the upper-most band is not emitted — configure max_mz to extend past your highest expected peak. The returned spectrum is sorted by m/z position.

Parameters
[in] input  Spectra to sum (typically the output of cutForTime).
Returns
Single summed spectrum, sorted by m/z.
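
The half-open band rule and the "one short of the end" loop bound can be checked with a small standalone helper. bandIndexFor is a hypothetical function written for this illustration; it does not call SpectrumAddition and only answers which band, if any, would keep a peak at a given m/z.

```cpp
#include <cstddef>
#include <vector>

// Returns the index i of the band [mzs[i], mzs[i+1]) that contains mz,
// or -1 when no band covers it. Hypothetical helper for illustration.
int bandIndexFor(const std::vector<float>& mzs, float mz)
{
  // "i + 1 < size()" expresses the documented "i < size() - 1" bound
  // without unsigned underflow when mzs is empty.
  for (std::size_t i = 0; i + 1 < mzs.size(); ++i)
  {
    if (mz >= mzs[i] && mz < mzs[i + 1])
    {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

For band values {20, 40, 60}, a peak at 40 lands in band 1, while peaks at 60 or below 20 fall in no band and would not appear in the merged spectrum.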

◆ operator=()

FIAMSDataProcessor & operator= ( const FIAMSDataProcessor & fdp)
default

Copy assignment.

◆ run()

bool run ( const MSExperiment & experiment,
const float  n_seconds,
OpenMS::MzTab & output,
const bool  load_cached_spectrum = true 
)

Run the full FIA-MS pipeline for one time window and (re)compute the accurate-mass-search result.

Pipeline:

  1. Compute the cached picked-spectrum path as {dir_output}/{filename}_picked_{int(n_seconds)}.mzML.
  2. If load_cached_spectrum is true and the path exists, load it via FileHandler with MZML pinned and take its first spectrum as the picked spectrum (skip steps 3-4).
  3. Otherwise: call cutForTime, then mergeAlongTime, then extractPeaks.
  4. If store_progress is "true" (and we recomputed in step 3), store the merged spectrum to {dir_output}/{filename}_merged_{int(n_seconds)}.mzML and the picked spectrum to the cached path.
  5. Estimate signal-to-noise via trackNoise and convert the picked spectrum to a FeatureMap via convertToFeatureMap.
  6. Always store the signal-to-noise spectrum to {dir_output}/{filename}_signal_to_noise_{int(n_seconds)}.mzML — note that this is independent of the store_progress setting.
  7. Call runAccurateMassSearch and write {dir_output}/{filename}_{int(n_seconds)}.mzTab via MzTabFile::store.

Progress is logged via OPENMS_LOG_INFO (cache load start/finish or calculation start/finish).
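
The interaction between the cache, load_cached_spectrum, and store_progress can be summarised as a small decision function. This is a sketch of the documented control flow only: RunPlan and planRun are hypothetical names, and the strings are labels for the output files rather than real paths.

```cpp
#include <string>
#include <vector>

// Hypothetical summary of one run: whether the cache was used and which
// output files would be written.
struct RunPlan
{
  bool from_cache;
  std::vector<std::string> files_written;
};

RunPlan planRun(bool cache_exists, bool load_cached_spectrum, bool store_progress)
{
  RunPlan plan;
  plan.from_cache = cache_exists && load_cached_spectrum;
  if (!plan.from_cache && store_progress)
  {
    // Only a recomputation can produce the intermediate spectra.
    plan.files_written.push_back("merged");
    plan.files_written.push_back("picked");
  }
  // Written on every run, regardless of store_progress.
  plan.files_written.push_back("signal_to_noise");
  plan.files_written.push_back("mzTab");
  return plan;
}
```

One consequence is that with store_progress set to false no picked-spectrum file is ever written, so a later run finds no cache and recomputes.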

Parameters
[in] experiment  Input MS experiment (centroided MS1 is assumed).
[in] n_seconds  Time cut-off (seconds). Cast to int when forming the per-run filename suffix, so sub-second offsets collapse.
[out] output  Receives the accurate-mass-search results from runAccurateMassSearch.
[in] load_cached_spectrum  If true and the cached picked-spectrum file exists, reuse it instead of re-running the cut / merge / pick steps. Default true.
Returns
true when the picked spectrum was loaded from the cached file, false when it was recomputed.
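
The effect of the int cast on the cache file name can be seen in a short standalone sketch. pickedSpectrumPath is a hypothetical helper that follows the documented pattern {dir_output}/{filename}_picked_{int(n_seconds)}.mzML; it is not the class's own code.

```cpp
#include <string>

// Builds the cached picked-spectrum path from the documented pattern.
// Hypothetical helper for illustration only.
std::string pickedSpectrumPath(const std::string& dir_output,
                               const std::string& filename,
                               float n_seconds)
{
  // The float-to-int cast truncates toward zero, so 30.2 and 30.9 share a suffix.
  const std::string postfix = std::to_string(static_cast<int>(n_seconds));
  return dir_output + "/" + filename + "_picked_" + postfix + ".mzML";
}
```

Two time windows that differ only in their fractional part therefore map to the same cache file, and the second run would reuse the spectrum picked for the first when load_cached_spectrum is true.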

◆ runAccurateMassSearch()

void runAccurateMassSearch ( FeatureMap & input,
OpenMS::MzTab & output 
)

Run AccurateMassSearchEngine on input and write the matches into output.

Configures AccurateMassSearchEngine with:

  • ionization_mode = "auto" (polarity is detected from the feature map, not taken from the class's polarity parameter);
  • mass_error_value = 1e6 / (resolution * 2) ppm;
  • db:mapping / db:struct / positive_adducts / negative_adducts forwarded from the class parameters;
  • keep_unidentified_masses = "false", so only identified masses are reported.

Then calls init followed by run.

Parameters
[in,out] input  Feature map to search; consumed by AccurateMassSearchEngine::run.
[out] output  Accurate-mass-search results in MzTab form.
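
To give a feel for the tolerance that the mass_error_value formula produces, the sketch below evaluates 1e6 / (resolution * 2) and converts the result to an absolute m/z window. massErrorPpm and ppmToDelta are hypothetical helpers; the ppm-to-absolute conversion is the usual mz * ppm * 1e-6.

```cpp
#include <cassert>

// mass_error_value in ppm as documented: 1e6 / (resolution * 2).
// Hypothetical helper for illustration only.
double massErrorPpm(double resolution)
{
  assert(resolution > 0.0);
  return 1e6 / (resolution * 2.0);
}

// Converts a ppm tolerance to an absolute m/z difference at a given m/z.
double ppmToDelta(double mz, double ppm)
{
  return mz * ppm * 1e-6;
}
```

At the default resolution of 120000 this gives about 4.17 ppm, which corresponds to roughly 0.0021 at m/z 500.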

◆ storeSpectrum_()

void storeSpectrum_ ( const MSSpectrum & input,
const std::string &  filename 
)
private

Write input to filename as a single-spectrum mzML file via FileHandler.

Used by run to persist the merged spectrum, picked spectrum, and signal-to-noise spectrum.

Parameters
[in] input  Spectrum to store.
[in] filename  Output path for the mzML file.
Exceptions
Exception  Propagates exceptions from FileHandler::storeExperiment on I/O failure (no in-class guard).

◆ trackNoise()

MSSpectrum trackNoise ( const MSSpectrum & input)

Build a parallel "noise level" spectrum: per-peak m/z carried over, intensity replaced by the local noise estimate.

Wraps SignalToNoiseEstimatorMedianRapid with window sne:window. Returns an empty spectrum when input is empty. The output's intensity field holds the noise estimate at the corresponding m/z (not a S/N ratio).

Parameters
[in] input  Picked spectrum.
Returns
Spectrum with the same m/z values as input, intensities replaced by per-peak noise estimates.

◆ updateMembers_()

void updateMembers_ ( )
override protected virtual

Recompute the per-band mzs_ / bin_sizes_ caches and push the sgf:* parameters into the SavitzkyGolayFilter member.

Invoked by DefaultParamHandler when any parameter changes (notably max_mz, bin_step, resolution, sgf:frame_length, sgf:polynomial_order).

Reimplemented from DefaultParamHandler.

Member Data Documentation

◆ bin_sizes_

std::vector<float> bin_sizes_
private

Per-band sliding bin sizes parallel to mzs_; populated by updateMembers_.

◆ mzs_

std::vector<float> mzs_
private

Per-band m/z centres consumed by mergeAlongTime; populated by updateMembers_.

◆ picker_

PeakPickerHiRes picker_
private

Peak picker used by extractPeaks; configured from the picker's own parameter section.

◆ sgfilter_

SavitzkyGolayFilter sgfilter_
private

Smoothing filter used by extractPeaks; configured from sgf:* parameters.