OpenMS
QPXFile Class Reference

Export PSM (Peptide Spectrum Match) data to Apache Arrow format following QPX PSM schema.

#include <OpenMS/FORMAT/QPXFile.h>

Static Public Member Functions

static std::shared_ptr< arrow::Table > exportToArrow (const std::vector< ProteinIdentification > &protein_identifications, const PeptideIdentificationList &peptide_identifications, bool export_all_psms=false)
 Export PSMs to Arrow table using PSMSchema for lossless round-trips.
 
static std::shared_ptr< arrow::Table > exportPSMsToQPXArrow (const std::vector< ProteinIdentification > &protein_identifications, const PeptideIdentificationList &peptide_identifications, bool export_all_psms=false)
 Export PSMs to QPX Parquet eXchange format Arrow table (QPXPSMSchema).
 
static bool exportToParquet (const std::vector< ProteinIdentification > &protein_identifications, const PeptideIdentificationList &peptide_identifications, const String &filename, bool export_all_psms=false, const ParquetWriteConfig &config=ParquetWriteConfig{})
 Export PSM data to Parquet file.
 

Detailed Description

Export PSM (Peptide Spectrum Match) data to Apache Arrow format following QPX PSM schema.

This class provides static methods to export PeptideIdentification/ProteinIdentification data to Apache Arrow Tables and Parquet files. The schema follows the QPX (Quantitative Proteomics Exchange) PSM format.

Experimental classes:
This API is experimental and may change in future versions.

Member Function Documentation

◆ exportPSMsToQPXArrow()

static std::shared_ptr< arrow::Table > exportPSMsToQPXArrow ( const std::vector< ProteinIdentification > &  protein_identifications,
const PeptideIdentificationList &  peptide_identifications,
bool  export_all_psms = false 
)

Export PSMs to QPX Parquet eXchange format Arrow table (QPXPSMSchema).

Unlike exportToArrow(), which produces a PSMSchema table for lossless round-trips, this method produces a QPXPSMSchema table optimized for cross-tool exchange (quantms format).

Parameters
protein_identifications    Protein identifications (for file name lookup)
peptide_identifications    Peptide identifications to export
export_all_psms            If true, export all PSM hits; if false, only best hit per spectrum
Returns
Arrow table with QPXPSMSchema columns, or nullptr on failure
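
The effect of export_all_psms can be pictured with a small, self-contained sketch. It does not use OpenMS or Arrow: Hit, SpectrumId, Row, and selectRows are hypothetical stand-ins introduced only for illustration, and the sketch assumes that the hits of each spectrum are already sorted best-first and that a spectrum without hits contributes no rows. The actual row selection inside QPXFile may differ.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT OpenMS types.
struct Hit
{
  std::string sequence;
  double score;
};

struct SpectrumId
{
  std::string spectrum_ref;
  std::vector<Hit> hits; // assumption: sorted best-first
};

// One output row per exported hit (hypothetical columns).
struct Row
{
  std::string spectrum_ref;
  std::string sequence;
  double score;
};

// Mimics the documented flag: all hits if export_all_psms is true,
// otherwise only the first (best) hit of each spectrum.
std::vector<Row> selectRows(const std::vector<SpectrumId>& ids, bool export_all_psms = false)
{
  std::vector<Row> rows;
  for (const SpectrumId& id : ids)
  {
    if (id.hits.empty())
    {
      continue; // assumption: a spectrum without hits yields no rows
    }
    const std::size_t n = export_all_psms ? id.hits.size() : 1;
    for (std::size_t i = 0; i < n; ++i)
    {
      rows.push_back({id.spectrum_ref, id.hits[i].sequence, id.hits[i].score});
    }
  }
  return rows;
}
```

With two spectra that carry two hits and one hit respectively, selectRows(ids) yields two rows and selectRows(ids, true) yields three, which is the difference in table size a caller should expect when toggling the flag.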

◆ exportToArrow()

static std::shared_ptr< arrow::Table > exportToArrow ( const std::vector< ProteinIdentification > &  protein_identifications,
const PeptideIdentificationList &  peptide_identifications,
bool  export_all_psms = false 
)

Export PSMs to Arrow table using PSMSchema for lossless round-trips.

Produces a table with PSMSchema columns (score, score_type, rank, etc.) suitable for FeatureMapArrowIO and ConsensusMapArrowIO round-trips. For QPX exchange format output, use exportPSMsToQPXArrow() instead.

◆ exportToParquet()

static bool exportToParquet ( const std::vector< ProteinIdentification > &  protein_identifications,
const PeptideIdentificationList &  peptide_identifications,
const String &  filename,
bool  export_all_psms = false,
const ParquetWriteConfig &  config = ParquetWriteConfig{} 
)

Export PSM data to Parquet file.

Parameters
[in]  protein_identifications    Vector of protein identifications
[in]  peptide_identifications    List of peptide identifications
[in]  filename                   Output file path
[in]  export_all_psms            If true, export all hits per spectrum (default: false, only best hit)
[in]  config                     Parquet writing options
Returns
true on success, false on error