OpenMS
FeatureMapArrowIO Class Reference

Import and export FeatureMap data to/from Apache Arrow format.

#include <OpenMS/FORMAT/FeatureMapArrowIO.h>

Static Public Member Functions

static std::shared_ptr< arrow::Table > exportFeaturesToArrow (const FeatureMap &feature_map)
 Export features to Apache Arrow Table.
 
static std::shared_ptr< arrow::Table > exportPSMsToArrow (const FeatureMap &feature_map)
 Export peptide spectrum matches (PSMs) associated with features to Apache Arrow Table.
 
static bool exportToParquet (const FeatureMap &feature_map, const String &directory, const ParquetWriteConfig &config=ParquetWriteConfig{})
 Export FeatureMap to a directory of Parquet files.
 
static bool importFeaturesFromArrow (const std::shared_ptr< arrow::Table > &table, FeatureMap &feature_map)
 Import features from Apache Arrow Table.
 
static bool importPSMsFromArrow (const std::shared_ptr< arrow::Table > &table, FeatureMap &feature_map)
 Import PSMs from Apache Arrow Table.
 
static bool importFromParquet (const String &directory, FeatureMap &feature_map)
 Import FeatureMap from a directory of Parquet files.
 

Detailed Description

Import and export FeatureMap data to/from Apache Arrow format.

This class provides static methods to export and import FeatureMap data to/from Apache Arrow Tables and Parquet files. Separate tables are provided for features (with their geometry and metadata) and for peptide spectrum matches (PSMs) associated with features.

Experimental classes:
This API is experimental and may change in future versions.
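
As a rough illustration of how the Parquet export and import calls fit together, the following sketch round-trips a FeatureMap through a directory. The helper name roundTripViaParquet is hypothetical. IO and Map are template parameters only so that the sketch compiles without OpenMS installed; in real use they would presumably be OpenMS::FeatureMapArrowIO and OpenMS::FeatureMap. Passing a std::string where the API expects a String assumes the usual implicit conversion.

```cpp
#include <string>

// Hypothetical helper: round-trips a feature map through a Parquet directory.
// IO and Map are template parameters so this sketch compiles without OpenMS;
// in real use they would be OpenMS::FeatureMapArrowIO and OpenMS::FeatureMap
// (after including <OpenMS/FORMAT/FeatureMapArrowIO.h>).
template <class IO, class Map>
bool roundTripViaParquet(const Map& input, const std::string& directory, Map& output)
{
  // exportToParquet returns false on error, so stop before reading back.
  if (!IO::exportToParquet(input, directory))
  {
    return false;
  }
  // importFromParquet populates `output` and also returns false on error.
  return IO::importFromParquet(directory, output);
}
```

With OpenMS available, a call might look like roundTripViaParquet<OpenMS::FeatureMapArrowIO, OpenMS::FeatureMap>(map, "out_dir", restored). Both steps report failure only through their bool return value, so the helper checks each one.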

Member Function Documentation

◆ exportFeaturesToArrow()

static std::shared_ptr< arrow::Table > exportFeaturesToArrow (const FeatureMap &feature_map)

Export features to Apache Arrow Table.

Each Feature becomes one row with RT, MZ, intensity, charge, quality, convex hull geometry, and metadata columns.

Parameters
[in] feature_map: The FeatureMap to export
Returns
Shared pointer to Arrow Table, or nullptr on error

◆ exportPSMsToArrow()

static std::shared_ptr< arrow::Table > exportPSMsToArrow (const FeatureMap &feature_map)

Export peptide spectrum matches (PSMs) associated with features to Apache Arrow Table.

Each PeptideHit from each PeptideIdentification (both feature-level and unassigned) becomes one row.

Parameters
[in] feature_map: The FeatureMap whose identifications to export
Returns
Shared pointer to Arrow Table, or nullptr on error

◆ exportToParquet()

static bool exportToParquet (const FeatureMap &feature_map, const String &directory, const ParquetWriteConfig &config=ParquetWriteConfig{})

Export FeatureMap to a directory of Parquet files.

Writes five Parquet files: features.parquet, psms.parquet, proteins.parquet, protein_groups.parquet, and search_params.parquet into the specified directory. Protein-level data is delegated to ProteinIdentificationArrowIO. FeatureMap-level metadata (DocumentIdentifier, DataProcessing) is stored as file-level key-value metadata in features.parquet.

Parameters
[in] feature_map: The FeatureMap to export
[in] directory: Output directory path
[in] config: Parquet writing options
Returns
true on success, false on error
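
The directory layout described above can be sketched with the standard library alone. The helper below builds the full path of each of the five documented files inside a given directory, which may be useful for logging or cleanup after an export. The names kParquetFileNames and expectedParquetFiles are hypothetical and not part of OpenMS.

```cpp
#include <array>
#include <filesystem>
#include <string_view>
#include <vector>

// File names listed in the exportToParquet() description.
constexpr std::array<std::string_view, 5> kParquetFileNames = {
  "features.parquet", "psms.parquet", "proteins.parquet",
  "protein_groups.parquet", "search_params.parquet"};

// Hypothetical helper: builds the full path of each expected output file
// inside `directory`. It does not touch the file system.
std::vector<std::filesystem::path> expectedParquetFiles(const std::filesystem::path& directory)
{
  std::vector<std::filesystem::path> paths;
  paths.reserve(kParquetFileNames.size());
  for (std::string_view name : kParquetFileNames)
  {
    paths.push_back(directory / std::filesystem::path(name));
  }
  return paths;
}
```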

◆ importFeaturesFromArrow()

static bool importFeaturesFromArrow (const std::shared_ptr< arrow::Table > &table, FeatureMap &feature_map)

Import features from Apache Arrow Table.

Each row becomes a Feature with RT, MZ, intensity, charge, quality, convex hull geometry, and metadata populated.

Parameters
[in] table: Arrow Table with feature data
[out] feature_map: FeatureMap to populate
Returns
true on success, false on error

◆ importFromParquet()

static bool importFromParquet (const String &directory, FeatureMap &feature_map)

Import FeatureMap from a directory of Parquet files.

Reads five Parquet files (features.parquet, psms.parquet, proteins.parquet, protein_groups.parquet, search_params.parquet) from the specified directory and reconstructs a complete FeatureMap including feature hierarchy, PSM linkage, protein identifications, and FeatureMap-level metadata.

Parameters
[in] directory: Input directory path containing Parquet files
[out] feature_map: FeatureMap to populate
Returns
true on success, false on error

◆ importPSMsFromArrow()

static bool importPSMsFromArrow (const std::shared_ptr< arrow::Table > &table, FeatureMap &feature_map)

Import PSMs from Apache Arrow Table.

Reconstructs PeptideIdentifications and PeptideHits from the table and assigns them to the appropriate features or as unassigned.

Parameters
[in] table: Arrow Table with PSM data
[out] feature_map: FeatureMap to populate
Returns
true on success, false on error