OpenMS
MSNumpressCoder Class Reference

Class to encode and decode data with MSNumpress.

#include <OpenMS/FORMAT/MSNumpressCoder.h>

Classes

struct  NumpressConfig
 Configuration class for MSNumpress.
 

Public Types

enum  NumpressCompression {
  NONE , LINEAR , PIC , SLOF ,
  SIZE_OF_NUMPRESSCOMPRESSION
}
 Names of compression schemes.
 

Public Member Functions

 MSNumpressCoder ()
 Default constructor.
 
virtual ~MSNumpressCoder ()
 Destructor.
 
void encodeNP (const std::vector< double > &in, String &result, bool zlib_compression, const NumpressConfig &config)
 Encodes a vector of floating point numbers into a Base64 string using numpress.
 
void encodeNP (const std::vector< float > &in, String &result, bool zlib_compression, const NumpressConfig &config)
 Overload of encodeNP for float input (the values are converted to double first).
 
void decodeNP (const String &in, std::vector< double > &out, bool zlib_compression, const NumpressConfig &config)
 Decodes a Base64 string to a vector of floating point numbers using numpress.
 
void encodeNPRaw (const std::vector< double > &in, String &result, const NumpressConfig &config)
 Encode the data vector "in" to a raw byte array.
 
void decodeNPRaw (const std::string &in, std::vector< double > &out, const NumpressConfig &config)
 Decode the raw byte array "in" to the result vector "out".
 

Static Public Attributes

static const std::string NamesOfNumpressCompression [SIZE_OF_NUMPRESSCOMPRESSION]
 

Private Member Functions

void decodeNPInternal_ (const unsigned char *in, size_t in_size, std::vector< double > &out, const NumpressConfig &config)
 

Detailed Description

Class to encode and decode data with MSNumpress.

MSNumpress supports three encoding schemata:

  • Linear (MS:1002312, MS-Numpress linear prediction compression)
  • Pic (MS:1002313, MS-Numpress positive integer compression)
  • Slof (MS:1002314, MS-Numpress short logged float compression)

Note that the linear compression scheme only makes sense for monotonically increasing data (such as retention time and m/z) that is often equally spaced. Pic compression only makes sense for positive integers as all data will be rounded to the nearest integer. Slof makes sense for all other data (such as non-integer intensity values).
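The reason the linear scheme suits equally spaced, increasing data can be sketched in a few lines. The function below is a simplified illustration of linear prediction on fixed-point integers, not the MS-Numpress byte format: each value is predicted by extending the line through the two preceding values, and only the residual would need to be stored. The name linearResiduals and the fixed_point parameter are assumptions made for this sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Residuals of a two-point linear extrapolation on fixed-point integers.
// fixed_point is a hypothetical scaling factor chosen for this sketch.
std::vector<long long> linearResiduals(const std::vector<double>& data, double fixed_point)
{
  std::vector<long long> ints;
  ints.reserve(data.size());
  for (double x : data)
  {
    ints.push_back(std::llround(x * fixed_point)); // quantize to fixed point
  }

  std::vector<long long> residuals;
  for (std::size_t i = 2; i < ints.size(); ++i)
  {
    // Predict the next value by continuing the line through the previous two.
    const long long predicted = ints[i - 1] + (ints[i - 1] - ints[i - 2]);
    residuals.push_back(ints[i] - predicted);
  }
  return residuals;
}
```

For perfectly equally spaced input such as {100.0, 100.5, 101.0, 101.5} every residual is zero, which is what makes such data cheap to store; irregular data yields large residuals and little gain.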

For more information on the compression schemata, see

Teleman J et al, "Numerical compression schemes for proteomics mass spectrometry data." Mol Cell Proteomics. 2014 Jun;13(6):1537-42. doi: 10.1074/mcp.O114.037879.
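The lossy nature of Pic and Slof can be illustrated with the arithmetic alone. The sketch below is an approximation of the idea behind the two schemes, not the actual MS-Numpress encoder: Pic rounds to the nearest integer, and Slof stores log(x + 1) scaled by a fixed point in a 16-bit integer. The function names and the fixed_point parameter are assumptions introduced for illustration, and the caller is assumed to pick a fixed point that keeps the scaled logarithm within 16 bits.

```cpp
#include <cmath>
#include <cstdint>

// Pic idea: every value is rounded to the nearest integer.
long long picRound(double x)
{
  return std::llround(x);
}

// Slof idea: store log(x + 1), scaled by a fixed point, in 16 bits.
// Assumes x >= 0 and log(x + 1) * fixed_point < 65535.
std::uint16_t slofEncode(double x, double fixed_point)
{
  return static_cast<std::uint16_t>(std::log(x + 1.0) * fixed_point + 0.5);
}

// Inverse of slofEncode, up to the quantization error.
double slofDecode(std::uint16_t v, double fixed_point)
{
  return std::exp(static_cast<double>(v) / fixed_point) - 1.0;
}
```

With this arithmetic a value such as 3.6 becomes 4 under Pic, while Slof keeps a small relative error over a wide dynamic range, which matches the advice to use Slof for non-integer intensities.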

Member Enumeration Documentation

◆ NumpressCompression

Names of compression schemes.

Enumerator
NONE 

No compression is applied.

LINEAR 

Linear (MS:1002312, MS-Numpress linear prediction compression)

PIC 

Pic (MS:1002313, MS-Numpress positive integer compression)

SLOF 

Slof (MS:1002314, MS-Numpress short logged float compression)

SIZE_OF_NUMPRESSCOMPRESSION 

Number of compression schemes; used as the array size of NamesOfNumpressCompression.

Constructor & Destructor Documentation

◆ MSNumpressCoder()

MSNumpressCoder ( )
inline

Default constructor.

◆ ~MSNumpressCoder()

virtual ~MSNumpressCoder ( )
inline virtual

Destructor.

Member Function Documentation

◆ decodeNP()

void decodeNP ( const String &  in,
std::vector< double > &  out,
bool  zlib_compression,
const NumpressConfig &  config 
)

Decodes a Base64 string to a vector of floating point numbers using numpress.

This code is obtained from the ProteoWizard implementation ./pwiz/pwiz/data/msdata/BinaryDataEncoder.cpp (adapted by Hannes Roest).

This function first decodes the input Base64 string (with optional zlib decompression after decoding) and then applies numpress decoding to the data.

Parameters
  in                The Base64-encoded string
  out               The resulting vector of doubles
  zlib_compression  Whether to apply zlib decompression before numpress decompression
  config            The numpress configuration defining the compression strategy
Exceptions
  Exception::ConversionError  if the string cannot be converted
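The first decoding stage, turning Base64 text back into bytes and rejecting input that cannot be converted, might look roughly like the stand-alone sketch below. It is not the OpenMS implementation: base64DecodeSketch is a hypothetical helper, std::invalid_argument stands in for Exception::ConversionError, and the zlib and numpress stages are left out.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode standard Base64 text into raw bytes; throws on malformed input.
std::string base64DecodeSketch(const std::string& in)
{
  static const std::string alphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  if (in.size() % 4 != 0)
  {
    throw std::invalid_argument("Base64 length must be a multiple of 4");
  }
  std::string out;
  for (std::size_t i = 0; i < in.size(); i += 4)
  {
    unsigned int block = 0; // collects 4 x 6 bits = 24 bits
    int padding = 0;
    for (std::size_t j = 0; j < 4; ++j)
    {
      const char c = in[i + j];
      block <<= 6;
      if (c == '=')
      {
        // Padding is only valid in the last two positions of the final group.
        if (i + 4 != in.size() || j < 2)
        {
          throw std::invalid_argument("misplaced Base64 padding");
        }
        ++padding;
      }
      else
      {
        const std::size_t pos = alphabet.find(c);
        if (padding > 0 || pos == std::string::npos)
        {
          throw std::invalid_argument("invalid Base64 character");
        }
        block |= static_cast<unsigned int>(pos);
      }
    }
    out.push_back(static_cast<char>((block >> 16) & 0xFF));
    if (padding < 2) out.push_back(static_cast<char>((block >> 8) & 0xFF));
    if (padding < 1) out.push_back(static_cast<char>(block & 0xFF));
  }
  return out;
}
```

The returned std::string is a byte container and may hold zero bytes; in the real pipeline those bytes would next be zlib-decompressed (if requested) and then numpress-decoded.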

◆ decodeNPInternal_()

void decodeNPInternal_ ( const unsigned char *  in,
size_t  in_size,
std::vector< double > &  out,
const NumpressConfig &  config 
)
private

◆ decodeNPRaw()

void decodeNPRaw ( const std::string &  in,
std::vector< double > &  out,
const NumpressConfig &  config 
)

Decode the raw byte array "in" to the result vector "out".

Note
The string in should *only* contain the data and _no_ extra null terminating byte.

This performs the raw numpress decoding on a raw byte array (not Base64 encoded). Therefore the input string is likely *unsafe* to handle and is basically a byte container.

Please use the safe versions above unless you only have the raw byte arrays.

Parameters
  in      The raw byte array (not Base64 encoded)
  out     The resulting vector of doubles
  config  The numpress configuration defining the compression strategy
Exceptions
  Exception::ConversionError  if the string cannot be converted
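Because the input must hold exactly the data bytes and no trailing null terminator, it helps to build the std::string from a pointer and an explicit length. The helper below is a hypothetical convenience written for this sketch and is not part of OpenMS.

```cpp
#include <cstddef>
#include <string>

// Wrap exactly `size` bytes in a std::string that serves purely as a byte
// container. Zero bytes inside the buffer are preserved, and nothing is
// appended to the payload.
std::string toByteString(const unsigned char* data, std::size_t size)
{
  return std::string(reinterpret_cast<const char*>(data), size);
}
```

A common pitfall is taking sizeof of a char array initialized from a string literal: for `const char lit[] = "\x01\x02"` the size is 3, because the compiler adds a terminator, so passing sizeof(lit) as the length would smuggle in one extra byte.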

◆ encodeNP() [1/2]

void encodeNP ( const std::vector< double > &  in,
String &  result,
bool  zlib_compression,
const NumpressConfig &  config 
)

Encodes a vector of floating point numbers into a Base64 string using numpress.

This code is obtained from the ProteoWizard implementation ./pwiz/pwiz/data/msdata/BinaryDataEncoder.cpp (adapted by Hannes Roest).

This function first applies the numpress encoding to the data and then encodes the result in Base64 (with optional zlib compression before Base64 encoding).

Note
  In case of error, the result string is empty.
Parameters
  in                The vector of floating point numbers to be encoded
  result            The resulting string
  zlib_compression  Whether to apply zlib compression after numpress compression
  config            The numpress configuration defining the compression strategy

◆ encodeNP() [2/2]

void encodeNP ( const std::vector< float > &  in,
String &  result,
bool  zlib_compression,
const NumpressConfig &  config 
)

Overload of encodeNP for float input; the values are converted to double first.
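The float-to-double conversion that this overload performs can be pictured as a plain element-wise widening copy. The helper name below is hypothetical, and the real overload may do the conversion differently.

```cpp
#include <vector>

// Widen each float to double; every float value is exactly representable
// as a double, so no precision is lost in this step.
std::vector<double> toDoubleVector(const std::vector<float>& in)
{
  return std::vector<double>(in.begin(), in.end());
}
```

The widened vector could then be passed to the double overload of encodeNP.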

◆ encodeNPRaw()

void encodeNPRaw ( const std::vector< double > &  in,
String &  result,
const NumpressConfig &  config 
)

Encode the data vector "in" to a raw byte array.

Note
  In case of error, "result" is given back unmodified.
  The result is not a string but a raw byte array and may contain zero bytes.

This performs the raw numpress encoding on a set of data and does no Base64 encoding on the result. Therefore the result string is likely *unsafe* to handle and is a raw byte array.

Please use the safe versions above unless you need access to the raw byte arrays.

Parameters
  in      The vector of floating point numbers to be encoded
  result  The resulting string (a raw byte array)
  config  The numpress configuration defining the compression strategy
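Since the raw result may contain zero bytes, printing it or passing it to C-string functions can silently truncate it. One safe way to inspect such a byte array is a hex dump that walks the container by its size; toHex below is a hypothetical helper for illustration, not an OpenMS function.

```cpp
#include <string>

// Render every byte of a raw byte string as two lowercase hex digits.
// Iterating over the container visits embedded zero bytes as well.
std::string toHex(const std::string& bytes)
{
  static const char digits[] = "0123456789abcdef";
  std::string out;
  out.reserve(bytes.size() * 2);
  for (unsigned char b : bytes)
  {
    out.push_back(digits[b >> 4]);   // high nibble
    out.push_back(digits[b & 0x0F]); // low nibble
  }
  return out;
}
```

The output always has twice as many characters as the input has bytes, so a zero byte in the middle shows up as "00" instead of ending the text early.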

Member Data Documentation

◆ NamesOfNumpressCompression

const std::string NamesOfNumpressCompression[SIZE_OF_NUMPRESSCOMPRESSION]
static
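An enum that ends in a size sentinel, paired with a name array of that size, is a common pattern for mapping between enumerators and strings. The sketch below mirrors the pattern with local stand-ins; the identifiers and the string values are assumptions and may differ from what OpenMS actually stores in NamesOfNumpressCompression.

```cpp
#include <string>

// Local stand-ins that mirror the pattern; not the OpenMS definitions.
enum SketchCompression { SK_NONE, SK_LINEAR, SK_PIC, SK_SLOF, SK_SIZE };

// One name per enumerator, sized by the sentinel. The strings are assumed.
static const std::string sketch_names[SK_SIZE] = {"none", "linear", "pic", "slof"};

// Look up an enumerator by name; returns SK_SIZE if the name is unknown.
SketchCompression compressionFromName(const std::string& name)
{
  for (int i = 0; i < SK_SIZE; ++i)
  {
    if (sketch_names[i] == name)
    {
      return static_cast<SketchCompression>(i);
    }
  }
  return SK_SIZE;
}
```

Because the sentinel sizes the array, adding a scheme before it grows the array with it, although the initializer list still has to be extended by hand.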