OpenMS
TransitionParquetFile Class Reference

Read OpenSwath Parquet library input (.oswpq) into LightTargetedExperiment.

#include <OpenMS/ANALYSIS/OPENSWATH/TransitionParquetFile.h>

Public Member Functions

 TransitionParquetFile ()=default
 Default constructor (the class is stateless; API consists of the two convert* methods)
 
 ~TransitionParquetFile ()=default
 Default destructor.
 
void convertParquetToTargetedExperiment (const std::string &oswpq_dir, OpenSwath::LightTargetedExperiment &targeted_exp) const
 Read a .oswpq library and populate an OpenSwath::LightTargetedExperiment.
 
void convertLightTargetedExperimentToParquet (const std::string &oswpq_path, const OpenSwath::LightTargetedExperiment &targeted_exp) const
 Write an OpenSwath::LightTargetedExperiment to a .oswpq library.
 

Detailed Description

Read OpenSwath Parquet library input (.oswpq) into LightTargetedExperiment.

The Parquet library format is a directory container with separate tables for precursors and transitions. The reader materializes all rows into OpenSwath::LightTargetedExperiment.

The container layout is:

<library>.oswpq
└── library/
    ├── metadata.json
    ├── precursors.parquet
    └── transitions.parquet
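
As a rough illustration of the layout above, the following sketch checks an already-extracted .oswpq directory for the three expected entries using only the C++17 standard library. Both functions, missingOswpqEntries and createPlaceholderLayout, are hypothetical helpers written for this example; they are not part of OpenMS and do not reflect how TransitionParquetFile locates its inputs internally.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not part of OpenMS): returns the names of the expected
// entries that are absent from <root>/library/.
std::vector<std::string> missingOswpqEntries(const fs::path& root)
{
  assert(!root.empty()); // precondition: a path must be supplied
  const std::vector<std::string> expected = {
    "metadata.json", "precursors.parquet", "transitions.parquet"};

  std::vector<std::string> missing;
  for (const std::string& name : expected)
  {
    if (!fs::is_regular_file(root / "library" / name))
    {
      missing.push_back(name);
    }
  }
  return missing;
}

// Hypothetical helper for experimentation: creates library/ with empty
// placeholder files. The placeholders are NOT valid Parquet or JSON content.
void createPlaceholderLayout(const fs::path& root,
                             const std::vector<std::string>& names)
{
  fs::create_directories(root / "library");
  for (const std::string& name : names)
  {
    std::ofstream out(root / "library" / name); // creates an empty file
  }
}
```

Calling missingOswpqEntries on a directory that lacks transitions.parquet returns a one-element list naming that file; an empty result means all three entries are present. The check covers only the extracted-directory form, not a zipped archive.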

The metadata file contains an OpenMS metadata block and QC counts. The new canonical layout (matching OpenSwathOSWParquetWriter) looks like:

{
  "openms": {
    "schema_version": 1,
    "generator": "TransitionParquetFile",
    "openms_version": "<version>",
    "build_time": "<build_time>",
    "tool": {"name": "OpenSwathWorkflow", "version": "<version>"},
    "counts": {
      "proteins": {"total": 0, "target": 0, "decoy": 0},
      "peptides": {"total": 0, "target": 0, "decoy": 0},
      "precursors": {"total": 0, "target": 0, "decoy": 0},
      "compounds": {"total": 0, "target": 0, "decoy": 0},
      "transitions": {"total": 0, "target": 0, "decoy": 0}
    },
    "fragment_type_counts": {
      "target": {"b": 0, "y": 0, "other": 0},
      "decoy": {"b": 0, "y": 0, "other": 0}
    },
    "charge_counts": {
      "precursor": {"target": {"2": 0, "3": 0}, "decoy": {"2": 0, "3": 0}},
      "transition": {"target": {"1": 0, "2": 0}, "decoy": {"1": 0, "2": 0}}
    }
  }
}

Required columns for precursors.parquet:

  • precursor_id (int64)
  • precursor_mz (float64)
  • charge (int32)
  • library_rt (float64)

Optional columns:

  • library_drift_time (float64)
  • traml_id (string)
  • decoy (bool)
  • modified_sequence (string)
  • unmodified_sequence (string)
  • protein_accessions (string or list<string>)

Required columns for transitions.parquet:

  • transition_id (int64)
  • precursor_id (int64)
  • product_mz (float64)
  • charge (int32)
  • type (string)
  • ordinal (int32)
  • detecting (bool)
  • identifying (bool)
  • quantifying (bool)
  • library_intensity (float64)
  • decoy (bool)

Optional columns:

  • traml_id (string)
  • annotation (string)

Constructor & Destructor Documentation

◆ TransitionParquetFile()

TransitionParquetFile ( )
default

Default constructor (the class is stateless; API consists of the two convert* methods)

◆ ~TransitionParquetFile()

~TransitionParquetFile ( )
default

Default destructor.

Member Function Documentation

◆ convertLightTargetedExperimentToParquet()

void convertLightTargetedExperimentToParquet ( const std::string &  oswpq_path,
                                               const OpenSwath::LightTargetedExperiment &  targeted_exp
                                             ) const

Write an OpenSwath::LightTargetedExperiment to a .oswpq library.

Output target depends on oswpq_path: if it points at an existing directory, the parquet files are written directly under <oswpq_path>/library/. Otherwise the writer stages the layout in a temporary directory, then assembles a zip archive at oswpq_path via a .tmp staging archive that is renamed into place once complete.
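
The destination rule and the write-then-rename step described above can be sketched with the C++17 standard library. This is an illustration only, under stated assumptions: chooseOutputMode and writeViaTmpAndRename are hypothetical names, the temporary file is assumed to be the final path with ".tmp" appended, and a plain string payload stands in for the zip archive that the real writer assembles.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

namespace fs = std::filesystem;

enum class OutputMode { Directory, ZipArchive };

// Hypothetical helper: an existing directory selects directory output,
// anything else (including a path that does not exist yet) selects the archive.
OutputMode chooseOutputMode(const fs::path& oswpq_path)
{
  return fs::is_directory(oswpq_path) ? OutputMode::Directory
                                      : OutputMode::ZipArchive;
}

// Hypothetical helper: write to "<final_path>.tmp" first and rename it into
// place only after the write succeeded, so readers never see a partial file.
// The ".tmp" suffix placement is an assumption of this sketch.
void writeViaTmpAndRename(const fs::path& final_path, const std::string& payload)
{
  assert(!final_path.empty()); // precondition: a destination must be supplied
  fs::path tmp_path = final_path;
  tmp_path += ".tmp";

  std::ofstream out(tmp_path, std::ios::binary | std::ios::trunc);
  out << payload;
  out.close();
  if (out.fail())
  {
    fs::remove(tmp_path);
    throw std::runtime_error("could not write " + tmp_path.string());
  }
  fs::rename(tmp_path, final_path); // replaces final_path in one step
}
```

Renaming a completed temporary file means that an interrupted write leaves at most a stray .tmp file behind rather than a truncated library at the destination path.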

Parameters
  [in]  oswpq_path    Destination: an existing directory or a zip-file path.
  [in]  targeted_exp  Library to serialise.
Exceptions
  Exception::FileNotWritable     If a generated file inside the staging area cannot be written.
  Exception::MissingInformation  If required per-row data (e.g. a transition lacking a peptide ref) is missing from targeted_exp.
  Exception::InvalidValue        If row-level invariants are violated (e.g. duplicate precursor ids, schema-incompatible values).

◆ convertParquetToTargetedExperiment()

void convertParquetToTargetedExperiment ( const std::string &  oswpq_dir,
                                          OpenSwath::LightTargetedExperiment &  targeted_exp
                                        ) const

Read a .oswpq library and populate an OpenSwath::LightTargetedExperiment.

Opens library/precursors.parquet and library/transitions.parquet from oswpq_dir (zip archive or extracted directory), validates each table against the OSWPrecursorSchema / OSWTransitionSchema in subset mode (extra columns are tolerated), and materialises the rows into targeted_exp.
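
To make "subset mode" concrete, here is a minimal standard-library sketch of a check in which every required column must be present with the expected type and any additional columns are ignored. ColumnSpec, validateSubset and requiredPrecursorColumns are hypothetical names for this example, and comparing types as plain strings is an assumption; the actual OSWPrecursorSchema / OSWTransitionSchema validation may work differently.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one required column.
struct ColumnSpec
{
  std::string name;
  std::string type;
};

// Subset-mode check: reports required columns that are missing or have a
// different type. Columns in `actual` that are not required are ignored.
std::vector<std::string> validateSubset(
  const std::vector<ColumnSpec>& required,
  const std::map<std::string, std::string>& actual) // column name -> type
{
  std::vector<std::string> problems;
  for (const ColumnSpec& col : required)
  {
    const auto it = actual.find(col.name);
    if (it == actual.end())
    {
      problems.push_back("missing column: " + col.name);
    }
    else if (it->second != col.type)
    {
      problems.push_back("type mismatch for " + col.name + ": expected " +
                         col.type + ", found " + it->second);
    }
  }
  return problems;
}

// Required precursors.parquet columns as listed in this reference.
const std::vector<ColumnSpec>& requiredPrecursorColumns()
{
  static const std::vector<ColumnSpec> cols = {
    {"precursor_id", "int64"},
    {"precursor_mz", "float64"},
    {"charge", "int32"},
    {"library_rt", "float64"}};
  return cols;
}
```

A table that carries the four required columns plus an unrelated extra column yields an empty problem list, while a table without precursor_mz yields one "missing column" entry.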

targeted_exp is reset to an empty OpenSwath::LightTargetedExperiment at the start of the call — pre-existing contents are discarded rather than appended to.

Parameters
  [in]   oswpq_dir     Path to a .oswpq archive or to an already-extracted directory containing library/.
  [out]  targeted_exp  Populated targeted experiment; cleared before being filled.
Exceptions
  Exception::MissingInformation  If a required parquet entry (precursors / transitions) cannot be located inside oswpq_dir.
  Exception::InvalidValue        If a loaded parquet table fails schema validation against the precursor / transition schema, or if any other required row-level field is missing during materialisation.