OpenMS
PSMArrowIO Class Reference

Read and write OpenMS identification data as a parquet bundle (.idparquet).

#include <OpenMS/FORMAT/PSMArrowIO.h>

Static Public Member Functions

static bool exportToParquet (const std::vector< ProteinIdentification > &protein_identifications, const PeptideIdentificationList &peptide_identifications, const std::string &dir, bool export_all_psms=true, const ParquetWriteConfig &config=ParquetWriteConfig{})
 Export protein and peptide identifications to an idparquet directory bundle.
 
static bool importFromParquet (const std::string &dir, std::vector< ProteinIdentification > &protein_identifications, PeptideIdentificationList &peptide_identifications)
 Import protein and peptide identifications from an idparquet directory bundle.
 

Detailed Description

Read and write OpenMS identification data as a parquet bundle (.idparquet).

An idparquet bundle is a directory containing four parquet files: psms.parquet, proteins.parquet, protein_groups.parquet, and search_params.parquet.

All four files are required for a valid bundle on read.

Member Function Documentation

◆ exportToParquet()

static bool exportToParquet ( const std::vector< ProteinIdentification > &  protein_identifications,
const PeptideIdentificationList &  peptide_identifications,
const std::string &  dir,
bool  export_all_psms = true,
const ParquetWriteConfig &  config = ParquetWriteConfig{} 
)
static

Export protein and peptide identifications to an idparquet directory bundle.

Writes (and overwrites) the four canonical files inside dir. Other files in dir are left untouched. If dir does not exist it is created (only dir itself; the parent must already exist). If dir exists as a regular file, returns false.
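
The directory rules above can be mirrored in a small pre-flight check before calling exportToParquet(). The following is a minimal sketch using only std::filesystem (C++17). The names prepareBundleDir and exampleDirRules are hypothetical and not part of OpenMS, and the sketch does not claim to reproduce how PSMArrowIO handles the directory internally.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical pre-flight helper, not part of OpenMS. Returns true if `dir`
// already is a directory or could be created as one.
bool prepareBundleDir(const std::string& dir)
{
  std::error_code ec;
  const fs::path p(dir);

  if (fs::is_directory(p, ec)) return true;  // existing directory is reused
  if (fs::exists(p, ec)) return false;       // e.g. a regular file: refuse

  // create_directory() creates only the last path component, so it fails
  // (and sets ec) when the parent directory does not exist.
  fs::create_directory(p, ec);
  return !ec;
}

// Exercises the helper in a scratch directory under the system temp path.
bool exampleDirRules()
{
  const fs::path base = fs::temp_directory_path() / "idparquet_dir_rules_example";
  fs::remove_all(base);
  fs::create_directory(base);

  const bool created = prepareBundleDir((base / "run1.idparquet").string());
  const bool reused = prepareBundleDir((base / "run1.idparquet").string());

  {
    std::ofstream out(base / "plain_file");  // a regular file, not a directory
    out << "placeholder";
  }
  const bool file_rejected = !prepareBundleDir((base / "plain_file").string());

  // Parent "missing" does not exist, so nothing is created.
  const bool orphan_rejected =
    !prepareBundleDir((base / "missing" / "run2.idparquet").string());

  fs::remove_all(base);
  const bool ok = created && reused && file_rejected && orphan_rejected;
  assert(ok && "directory rules sketch misbehaved");
  return ok;
}
```

The sketch uses create_directory() rather than create_directories() because only the former matches the documented behavior of creating dir itself while requiring the parent to exist.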

Parameters
[in]  protein_identifications  Protein identifications (run-level metadata, hits, groups, search params)
[in]  peptide_identifications  Peptide identifications (PSMs)
[in]  dir  Output directory path
[in]  export_all_psms  If true, write all hits per PSM; if false, write only the best hit per spectrum (default: true, because the bundle is intended for lossless round-trip)
[in]  config  Parquet writer configuration
Returns
true on success, false on error (errors are logged)

◆ importFromParquet()

static bool importFromParquet ( const std::string &  dir,
std::vector< ProteinIdentification > &  protein_identifications,
PeptideIdentificationList &  peptide_identifications 
)
static

Import protein and peptide identifications from an idparquet directory bundle.

All four canonical files (psms.parquet, proteins.parquet, protein_groups.parquet, search_params.parquet) must be present in dir. Missing any one is an error.
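
Because importFromParquet() only reports failure through its return value and the log, it can help to check up front which of the four files is missing. The following is a minimal sketch using only std::filesystem (C++17). The names kBundleFiles, missingBundleFiles and exampleBundleCheck are hypothetical and not part of OpenMS, and the check looks at file presence only, not at file contents.

```cpp
#include <array>
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// The four canonical file names listed in the documentation above.
const std::array<const char*, 4> kBundleFiles = {
  "psms.parquet", "proteins.parquet", "protein_groups.parquet", "search_params.parquet"};

// Hypothetical helper, not part of OpenMS: returns the canonical files that
// are not present as regular files in `dir`. An empty result means the
// bundle is complete as far as file presence is concerned.
std::vector<std::string> missingBundleFiles(const std::string& dir)
{
  std::vector<std::string> missing;
  for (const char* name : kBundleFiles)
  {
    std::error_code ec;
    if (!fs::is_regular_file(fs::path(dir) / name, ec)) missing.emplace_back(name);
  }
  return missing;
}

// Exercises the helper on a scratch directory under the system temp path.
bool exampleBundleCheck()
{
  const fs::path dir = fs::temp_directory_path() / "idparquet_bundle_check_example";
  fs::remove_all(dir);
  fs::create_directory(dir);

  // Empty placeholders stand in for real parquet files; only presence is checked.
  for (const char* name : kBundleFiles) { std::ofstream touch(dir / name); }
  const bool complete = missingBundleFiles(dir.string()).empty();

  // Removing any one file makes the bundle incomplete.
  fs::remove(dir / "protein_groups.parquet");
  const std::vector<std::string> missing = missingBundleFiles(dir.string());
  const bool one_missing = missing.size() == 1 && missing[0] == "protein_groups.parquet";

  fs::remove_all(dir);
  const bool ok = complete && one_missing;
  assert(ok && "bundle check sketch misbehaved");
  return ok;
}
```

A caller could log the returned names before calling importFromParquet(), which makes it easier to tell a missing file apart from other import errors.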

Parameters
[in]  dir  Input directory path
[out]  protein_identifications  Populated from the three protein-side parquet files
[out]  peptide_identifications  Populated from psms.parquet
Returns
true on success, false on error (errors are logged)