OpenMS
ConsensusMapArrowIO Class Reference

Import and export ConsensusMap data to/from Apache Arrow format.

#include <OpenMS/FORMAT/ConsensusMapArrowIO.h>

Static Public Member Functions

static std::shared_ptr< arrow::Table > exportFeaturesToArrow (const ConsensusMap &cmap)
 Export consensus features to Apache Arrow Table.
 
static std::shared_ptr< arrow::Table > exportPSMsToArrow (const ConsensusMap &cmap)
 Export peptide spectrum matches (PSMs) associated with consensus features to Apache Arrow Table.
 
static bool exportToParquet (const ConsensusMap &cmap, const String &directory, const ParquetWriteConfig &config=ParquetWriteConfig{})
 Export ConsensusMap to a directory of Parquet files.
 
static bool importFeaturesFromArrow (const std::shared_ptr< arrow::Table > &table, ConsensusMap &cmap)
 Import consensus features from Apache Arrow Table.
 
static bool importPSMsFromArrow (const std::shared_ptr< arrow::Table > &table, ConsensusMap &cmap)
 Import PSMs from Apache Arrow Table.
 
static bool importFromParquet (const String &directory, ConsensusMap &cmap)
 Import ConsensusMap from a directory of Parquet files.
 

Detailed Description

Import and export ConsensusMap data to/from Apache Arrow format.

This class provides static methods to export and import ConsensusMap data to/from Apache Arrow Tables and Parquet files. Separate tables are provided for consensus features (with their handles and metadata) and for peptide spectrum matches (PSMs) associated with features.

Experimental classes:
This API is experimental and may change in future versions.

Member Function Documentation

◆ exportFeaturesToArrow()

static std::shared_ptr< arrow::Table > exportFeaturesToArrow ( const ConsensusMap & cmap )
static

Export consensus features to Apache Arrow Table.

Each ConsensusFeature becomes one row with RT, MZ, intensity, charge, quality, width, nested FeatureHandles, and metadata columns.

Parameters
[in]  cmap  The ConsensusMap to export
Returns
Shared pointer to Arrow Table, or nullptr on error

◆ exportPSMsToArrow()

static std::shared_ptr< arrow::Table > exportPSMsToArrow ( const ConsensusMap & cmap )
static

Export peptide spectrum matches (PSMs) associated with consensus features to Apache Arrow Table.

Each PeptideHit from each PeptideIdentification (both feature-level and unassigned) becomes one row.

Parameters
[in]  cmap  The ConsensusMap whose identifications to export
Returns
Shared pointer to Arrow Table, or nullptr on error

◆ exportToParquet()

static bool exportToParquet ( const ConsensusMap & cmap,
const String & directory,
const ParquetWriteConfig & config = ParquetWriteConfig{}
)
static

Export ConsensusMap to a directory of Parquet files.

Writes five Parquet files: consensus_features.parquet, psms.parquet, proteins.parquet, protein_groups.parquet, and search_params.parquet into the specified directory. Protein-level data is delegated to ProteinIdentificationArrowIO. ConsensusMap-level metadata (column headers, experiment type, DocumentIdentifier, DataProcessing) is stored as file-level key-value metadata in consensus_features.parquet.

Parameters
[in]  cmap  The ConsensusMap to export
[in]  directory  Output directory path
[in]  config  Parquet writing options
Returns
true on success, false on error

◆ importFeaturesFromArrow()

static bool importFeaturesFromArrow ( const std::shared_ptr< arrow::Table > & table,
ConsensusMap & cmap
)
static

Import consensus features from Apache Arrow Table.

Each row becomes a ConsensusFeature with RT, MZ, intensity, charge, quality, width, FeatureHandles, and metadata populated.

Parameters
[in]  table  Arrow Table with consensus feature data
[out]  cmap  ConsensusMap to populate
Returns
true on success, false on error

◆ importFromParquet()

static bool importFromParquet ( const String & directory,
ConsensusMap & cmap
)
static

Import ConsensusMap from a directory of Parquet files.

Reads five Parquet files (consensus_features.parquet, psms.parquet, proteins.parquet, protein_groups.parquet, search_params.parquet) from the specified directory and reconstructs a complete ConsensusMap including FeatureHandles, PSM linkage, protein identifications, and ConsensusMap-level metadata.

Parameters
[in]  directory  Input directory path containing Parquet files
[out]  cmap  ConsensusMap to populate
Returns
true on success, false on error
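
Because importFromParquet() reports failure only as a bare false, it can help to check the directory layout before calling it. The following is a minimal sketch using only the C++17 standard library. The helper missingParquetFiles() and the constant kExpectedParquetFiles are hypothetical names introduced for illustration and are not part of OpenMS; the five file names are the ones listed in the description above. Whether importFromParquet() actually requires all five files to be present is not stated here, so treat the check as a diagnostic aid rather than a precondition.

```cpp
#include <array>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

// File names as listed in the importFromParquet() description.
inline constexpr std::array<const char*, 5> kExpectedParquetFiles = {
  "consensus_features.parquet",
  "psms.parquet",
  "proteins.parquet",
  "protein_groups.parquet",
  "search_params.parquet"
};

// Hypothetical helper (not an OpenMS function): returns the names of the
// expected Parquet files that are not present as regular files in `directory`.
std::vector<std::string> missingParquetFiles(const std::filesystem::path& directory)
{
  std::vector<std::string> missing;
  for (const char* name : kExpectedParquetFiles)
  {
    std::error_code ec; // use the non-throwing overload; errors count as "missing"
    if (!std::filesystem::is_regular_file(directory / name, ec))
    {
      missing.emplace_back(name);
    }
  }
  return missing;
}
```

A caller could log the returned names and skip the import when the list is non-empty, which gives a more specific error message than the boolean return value alone.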

◆ importPSMsFromArrow()

static bool importPSMsFromArrow ( const std::shared_ptr< arrow::Table > & table,
ConsensusMap & cmap
)
static

Import PSMs from Apache Arrow Table.

Reconstructs PeptideIdentifications and PeptideHits from the table and assigns them to the appropriate consensus features or as unassigned.

Parameters
[in]  table  Arrow Table with PSM data
[out]  cmap  ConsensusMap to populate
Returns
true on success, false on error