OpenMS
IdXMLFile Class Reference

Used to load and store idXML files.

#include <OpenMS/FORMAT/IdXMLFile.h>

Public Member Functions

 IdXMLFile ()
 Constructor.

void load (const String &filename, std::vector< ProteinIdentification > &protein_ids, std::vector< PeptideIdentification > &peptide_ids)
 Loads the identifications of an idXML file without returning the document identifier.

void load (const String &filename, std::vector< ProteinIdentification > &protein_ids, std::vector< PeptideIdentification > &peptide_ids, String &document_id)
 Loads the identifications of an idXML file.

void store (const String &filename, const std::vector< ProteinIdentification > &protein_ids, const std::vector< PeptideIdentification > &peptide_ids, const String &document_id="")
 Stores the data in an idXML file.
 
- Public Member Functions inherited from XMLFile
 XMLFile ()
 Default constructor.

 XMLFile (const String &schema_location, const String &version)
 Constructor that sets the schema location.

virtual ~XMLFile ()
 Destructor.

bool isValid (const String &filename, std::ostream &os)
 Checks if a file validates against the XML schema.

const String & getVersion () const
 Returns the version of the schema.
 
- Public Member Functions inherited from ProgressLogger
 ProgressLogger ()
 Constructor.

virtual ~ProgressLogger ()
 Destructor.

 ProgressLogger (const ProgressLogger &other)
 Copy constructor.

ProgressLogger & operator= (const ProgressLogger &other)
 Assignment operator.

void setLogType (LogType type) const
 Sets the progress log that should be used. The default type is NONE!

LogType getLogType () const
 Returns the type of progress log being used.

void setLogger (ProgressLoggerImpl *logger)
 Sets the logger to be used for progress logging.

void startProgress (SignedSize begin, SignedSize end, const String &label) const
 Initializes the progress display.

void setProgress (SignedSize value) const
 Sets the current progress.

void endProgress (UInt64 bytes_processed=0) const

void nextProgress () const
 Increments the progress by 1 (according to the range begin-end).
 

Protected Member Functions

void endElement (const XMLCh *const, const XMLCh *const, const XMLCh *const qname) override
 
void startElement (const XMLCh *const, const XMLCh *const, const XMLCh *const qname, const xercesc::Attributes &attributes) override
 
void addProteinGroups_ (MetaInfoInterface &meta, const std::vector< ProteinIdentification::ProteinGroup > &groups, const String &group_name, const std::unordered_map< std::string, UInt > &accession_to_id, XMLHandler::ActionMode mode)
 
void getProteinGroups_ (std::vector< ProteinIdentification::ProteinGroup > &groups, const String &group_name)
 Read and store ProteinGroup data.
 
- Protected Member Functions inherited from XMLHandler
void writeUserParam_ (const String &tag_name, std::ostream &os, const MetaInfoInterface &meta, UInt indent) const
 Writes the content of MetaInfoInterface to the file.

Int asInt_ (const String &in) const
 Conversion of a String to an integer value.

Int asInt_ (const XMLCh *in) const
 Conversion of a Xerces string to an integer value.

UInt asUInt_ (const String &in) const
 Conversion of a String to an unsigned integer value.

double asDouble_ (const String &in) const
 Conversion of a String to a double value.

float asFloat_ (const String &in) const
 Conversion of a String to a float value.

bool asBool_ (const String &in) const
 Conversion of a String to a boolean value.

DateTime asDateTime_ (String date_string) const
 Conversion of an xs:datetime string to a DateTime value.

bool equal_ (const XMLCh *a, const XMLCh *b) const
 Returns whether two Xerces strings are equal.

SignedSize cvStringToEnum_ (const Size section, const String &term, const char *message, const SignedSize result_on_error=0)

String attributeAsString_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to a String.

Int attributeAsInt_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to an Int.

double attributeAsDouble_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to a double.

DoubleList attributeAsDoubleList_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to a DoubleList.

IntList attributeAsIntList_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to an IntList.

StringList attributeAsStringList_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to a StringList.

bool optionalAttributeAsString_ (String &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the String value if the attribute is present.

bool optionalAttributeAsInt_ (Int &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the Int value if the attribute is present.

bool optionalAttributeAsUInt_ (UInt &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the UInt value if the attribute is present.

bool optionalAttributeAsDouble_ (double &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the double value if the attribute is present.

bool optionalAttributeAsDoubleList_ (DoubleList &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the DoubleList value if the attribute is present.

bool optionalAttributeAsStringList_ (StringList &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the StringList value if the attribute is present.

bool optionalAttributeAsIntList_ (IntList &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the IntList value if the attribute is present.
 
String attributeAsString_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to a String.

Int attributeAsInt_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to an Int.

double attributeAsDouble_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to a double.

DoubleList attributeAsDoubleList_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to a DoubleList.

IntList attributeAsIntList_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to an IntList.

StringList attributeAsStringList_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to a StringList.

bool optionalAttributeAsString_ (String &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the String value if the attribute is present.

bool optionalAttributeAsInt_ (Int &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the Int value if the attribute is present.

bool optionalAttributeAsUInt_ (UInt &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the UInt value if the attribute is present.

bool optionalAttributeAsDouble_ (double &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the double value if the attribute is present.

bool optionalAttributeAsDoubleList_ (DoubleList &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the DoubleList value if the attribute is present.

bool optionalAttributeAsIntList_ (IntList &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the IntList value if the attribute is present.

bool optionalAttributeAsStringList_ (StringList &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the StringList value if the attribute is present.
 
 XMLHandler (const String &filename, const String &version)
 Default constructor.

 ~XMLHandler () override
 Destructor.

void reset ()
 Release internal memory used for parsing.

void fatalError (const xercesc::SAXParseException &exception) override

void error (const xercesc::SAXParseException &exception) override

void warning (const xercesc::SAXParseException &exception) override

void fatalError (ActionMode mode, const String &msg, UInt line=0, UInt column=0) const
 Fatal error handler. Throws a ParseError exception.

void error (ActionMode mode, const String &msg, UInt line=0, UInt column=0) const
 Error handler for recoverable errors.

void warning (ActionMode mode, const String &msg, UInt line=0, UInt column=0) const
 Warning handler.

void characters (const XMLCh *const chars, const XMLSize_t length) override
 Parsing method for character data.

void startElement (const XMLCh *const uri, const XMLCh *const localname, const XMLCh *const qname, const xercesc::Attributes &attrs) override
 Parsing method for opening tags.

void endElement (const XMLCh *const uri, const XMLCh *const localname, const XMLCh *const qname) override
 Parsing method for closing tags.

virtual void writeTo (std::ostream &)
 Writes the contents to a stream.

virtual LOADDETAIL getLoadDetail () const
 Handlers that support partial loading implement this method.

virtual void setLoadDetail (const LOADDETAIL d)
 Handlers that support partial loading implement this method.

DataValue cvParamToValue (const ControlledVocabulary &cv, const String &parent_tag, const String &accession, const String &name, const String &value, const String &unit_accession) const
 Convert the value of a <cvParam value=.> (as commonly found in PSI schemata) to the DataValue with the correct type (e.g. int) according to the type stored in the CV (usually PSI-MS CV), as well as set its unit.

DataValue cvParamToValue (const ControlledVocabulary &cv, const CVTerm &raw_term) const
 Convert the value of a <cvParam value=.> (as commonly found in PSI schemata) to the DataValue with the correct type (e.g. int) according to the type stored in the CV (usually PSI-MS CV), as well as set its unit.
 
void checkUniqueIdentifiers_ (const std::vector< ProteinIdentification > &prot_ids) const
 
- Protected Member Functions inherited from XMLFile
void parse_ (const String &filename, XMLHandler *handler)
 Parses the XML file given by filename using the handler given by handler.

void parseBuffer_ (const std::string &buffer, XMLHandler *handler)
 Parses the in-memory buffer given by buffer using the handler given by handler.

void save_ (const String &filename, XMLHandler *handler) const
 Stores the contents of the XML handler given by handler in the file given by filename.
 
void enforceEncoding_ (const String &encoding)
 

Static Protected Member Functions

static std::ostream & createFlankingAAXMLString_ (const std::vector< PeptideEvidence > &pes, std::ostream &os)
 
static std::ostream & createPositionXMLString_ (const std::vector< PeptideEvidence > &pes, std::ostream &os)
 
static void writeFragmentAnnotations_ (const String &tag_name, std::ostream &os, const std::vector< PeptideHit::PeakAnnotation > &annotations, UInt indent)
 
static void parseFragmentAnnotation_ (const String &s, std::vector< PeptideHit::PeakAnnotation > &annotations)
 
- Static Protected Member Functions inherited from XMLHandler
static String writeXMLEscape (const String &to_escape)
 Escapes a string and returns the escaped string.

static DataValue fromXSDString (const String &type, const String &value)
 Convert an XSD type (e.g. 'xsd:double') to a DataValue.
 

Protected Attributes

members for loading data
std::vector< ProteinIdentification > * prot_ids_
 Pointer to fill in protein identifications.

std::vector< PeptideIdentification > * pep_ids_
 Pointer to fill in peptide identifications.

MetaInfoInterface * last_meta_
 Pointer to last read object with MetaInfoInterface.

std::map< String, ProteinIdentification::SearchParameters > parameters_
 Search parameters map (key is the "id")

ProteinIdentification::SearchParameters param_
 Temporary search parameters variable.

String id_
 Temporary id.

ProteinIdentification prot_id_
 Temporary protein identification.

PeptideIdentification pep_id_
 Temporary peptide identification.

ProteinHit prot_hit_
 Temporary protein hit.

PeptideHit pep_hit_
 Temporary peptide hit.

PeptideHit::PepXMLAnalysisResult current_analysis_result_
 Temporary analysis result instance.

std::vector< PeptideEvidence > peptide_evidences_
 Temporary peptide evidences.

std::unordered_map< std::string, String > proteinid_to_accession_
 Map from protein id to accession.

String * document_id_
 Document identifier.

bool prot_id_in_run_
 true if a prot id is contained in the current run

- Protected Attributes inherited from XMLHandler
String file_
 File name.

String version_
 Schema version.

StringManager sm_
 Helper class for string conversion.

std::vector< String > open_tags_
 Stack of open XML tags.

LOADDETAIL load_detail_
 parse only until total number of scans and chroms have been determined from attributes

std::vector< std::vector< String > > cv_terms_
 Array of CV term lists (one sublist denotes one term and its children)

- Protected Attributes inherited from XMLFile
String schema_location_
 XML schema file location.

String schema_version_
 Version string.

String enforced_encoding_
 Encoding string that replaces the encoding (system dependent or specified in the XML). Disabled if empty. Used as a workaround for XTandem output xml.

- Protected Attributes inherited from ProgressLogger
LogType type_

time_t last_invoke_

ProgressLoggerImpl * current_logger_
 

Friends

class Internal::ConsensusXMLHandler
 
class Internal::FeatureXMLHandler
 

Additional Inherited Members

- Public Types inherited from ProgressLogger
enum  LogType { CMD , GUI , NONE }
 Possible log types.
 
- Protected Types inherited from XMLHandler
enum  ActionMode { LOAD , STORE }
 Action to set the current mode (for error messages)
 
enum  LOADDETAIL { LD_ALLDATA , LD_RAWCOUNTS , LD_COUNTS_WITHOPTIONS }
 
- Static Protected Attributes inherited from ProgressLogger
static int recursion_depth_
 

Detailed Description

Used to load and store idXML files.

This class is used to load and store documents that implement the schema of idXML files.

A documented schema for this format can be found at https://github.com/OpenMS/OpenMS/tree/develop/share/OpenMS/SCHEMAS

One file can contain several ProteinIdentification runs. Each run consists of peptide hits stored in PeptideIdentification and optional protein hits stored in ProteinIdentification. Peptide and protein hits are connected via a string identifier. We use the search engine and the date as the identifier.
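
As a rough illustration of how a shared string identifier can tie peptide identifications to their run, the sketch below groups peptide records by identifier. ProteinRun, PeptideId and groupByRun are hypothetical stand-ins invented for this example, not OpenMS types or functions, and the sketch makes no claim about how IdXMLFile does this internally.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for ProteinIdentification / PeptideIdentification.
// Only the identifier field matters for this illustration.
struct ProteinRun
{
  std::string identifier;     // e.g. built from search engine name and date
  std::string search_engine;
};

struct PeptideId
{
  std::string identifier;     // must match the identifier of the owning run
  std::string sequence;
};

// Groups peptide records by run identifier.
// Every run gets an entry (possibly empty); peptides whose identifier
// matches no run are skipped.
std::map<std::string, std::vector<PeptideId>> groupByRun(
    const std::vector<ProteinRun>& runs,
    const std::vector<PeptideId>& peptides)
{
  std::map<std::string, std::vector<PeptideId>> grouped;
  for (const auto& run : runs)
  {
    assert(!run.identifier.empty()); // assumption: runs carry a non-empty identifier
    grouped[run.identifier];         // creates an empty entry for the run
  }
  for (const auto& pep : peptides)
  {
    auto it = grouped.find(pep.identifier);
    if (it != grouped.end())
    {
      it->second.push_back(pep);
    }
  }
  return grouped;
}
```

Keying the map on the identifier string means that two runs from the same search engine stay distinct only as long as their identifiers differ, which is presumably why the date is part of the identifier.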

Note
This format will eventually be replaced by the HUPO-PSI AnalysisXML formats (mzIdentML and mzQuantML).

Constructor & Destructor Documentation

◆ IdXMLFile()

IdXMLFile ( )

Constructor.

Member Function Documentation

◆ addProteinGroups_()

void addProteinGroups_ ( MetaInfoInterface &  meta,
const std::vector< ProteinIdentification::ProteinGroup > &  groups,
const String &  group_name,
const std::unordered_map< std::string, UInt > &  accession_to_id,
XMLHandler::ActionMode  mode 
)
protected

Adds data from ProteinGroups to a MetaInfoInterface. Since it can be used during both loading and storing, it takes a parameter for the current mode (LOAD/STORE) so that appropriate warnings or errors can be issued.

◆ createFlankingAAXMLString_()

static std::ostream& createFlankingAAXMLString_ ( const std::vector< PeptideEvidence > &  pes,
std::ostream &  os 
)
static protected

Helper function to create the XML string for the amino acids before and after the peptide position in a protein. Can be reused by e.g. ConsensusXML, FeatureXML to write PeptideHit elements.

◆ createPositionXMLString_()

static std::ostream& createPositionXMLString_ ( const std::vector< PeptideEvidence > &  pes,
std::ostream &  os 
)
static protected

Helper function to create the XML string for the position of the peptide in a protein. Can be reused by e.g. ConsensusXML, FeatureXML to write PeptideHit elements.

◆ endElement()

void endElement ( const XMLCh * const  ,
const XMLCh * const  ,
const XMLCh *const  qname 
)
override protected

◆ getProteinGroups_()

void getProteinGroups_ ( std::vector< ProteinIdentification::ProteinGroup > &  groups,
const String &  group_name 
)
protected

Read and store ProteinGroup data.

◆ load() [1/2]

void load ( const String &  filename,
std::vector< ProteinIdentification > &  protein_ids,
std::vector< PeptideIdentification > &  peptide_ids 
)

Loads the identifications of an idXML file without returning the document identifier.

The file is read and its contents are stored in protein_ids and peptide_ids.

Exceptions
Exception::FileNotFound is thrown if the file could not be opened
Exception::ParseError is thrown if an error occurs during parsing

◆ load() [2/2]

void load ( const String &  filename,
std::vector< ProteinIdentification > &  protein_ids,
std::vector< PeptideIdentification > &  peptide_ids,
String &  document_id 
)

Loads the identifications of an idXML file.

The file is read and its contents are stored in protein_ids, peptide_ids and document_id.

Exceptions
Exception::FileNotFound is thrown if the file could not be opened
Exception::ParseError is thrown if an error occurs during parsing
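
Callers of either load() overload may want to tell the two documented failure modes apart. The following is only a sketch of that pattern: FileNotFound, ParseError, loadStandIn and tryLoad are hypothetical stand-ins defined here so the example is self-contained, and the "parsing" is reduced to a check that the first non-whitespace character is '<'. It does not use or reproduce the OpenMS implementation.

```cpp
#include <cassert>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for Exception::FileNotFound and Exception::ParseError.
struct FileNotFound : std::runtime_error
{
  using std::runtime_error::runtime_error;
};
struct ParseError : std::runtime_error
{
  using std::runtime_error::runtime_error;
};

enum class LoadStatus { OK, MISSING_FILE, MALFORMED };

// Hypothetical loader with the same failure contract as documented for load():
// throws FileNotFound if the file cannot be opened, ParseError if the
// content is rejected. The real parser does far more than this check.
void loadStandIn(const std::string& filename)
{
  assert(!filename.empty()); // assumption: a file name is always given
  std::ifstream in(filename);
  if (!in)
  {
    throw FileNotFound("cannot open " + filename);
  }
  char first = '\0';
  in >> first; // skips leading whitespace; leaves '\0' for an empty file
  if (first != '<')
  {
    throw ParseError("not an XML document: " + filename);
  }
}

// Maps the two exception types to distinct status codes.
LoadStatus tryLoad(const std::string& filename)
{
  try
  {
    loadStandIn(filename);
    return LoadStatus::OK;
  }
  catch (const FileNotFound&)
  {
    return LoadStatus::MISSING_FILE;
  }
  catch (const ParseError&)
  {
    return LoadStatus::MALFORMED;
  }
}
```

Separating the two cases lets a caller report a wrong path differently from a corrupt or non-idXML file.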

◆ parseFragmentAnnotation_()

static void parseFragmentAnnotation_ ( const String &  s,
std::vector< PeptideHit::PeakAnnotation > &  annotations 
)
static protected

Helper function to parse fragment annotations from a string.

◆ startElement()

void startElement ( const XMLCh * const  ,
const XMLCh * const  ,
const XMLCh *const  qname,
const xercesc::Attributes &  attributes 
)
override protected

◆ store()

void store ( const String &  filename,
const std::vector< ProteinIdentification > &  protein_ids,
const std::vector< PeptideIdentification > &  peptide_ids,
const String &  document_id = "" 
)

Stores the data in an idXML file.

The data is written to the file 'filename'. PeptideHits are sorted by score. Note that ranks are not stored and need to be reassigned after loading.

Exceptions
Exception::UnableToCreateFile is thrown if the file could not be created
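
Because ranks are not stored, they have to be recomputed from the scores after loading. A minimal sketch of that step is given below, assuming a hypothetical Hit struct in place of PeptideHit and a caller-supplied flag for the score orientation; reassignRanks is invented for this example and is not an OpenMS function.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for PeptideHit: only score and rank are modelled.
struct Hit
{
  double score;
  unsigned rank;
};

// Sorts the hits best-first and numbers them 1..n.
// Tied scores keep their relative order and still receive distinct ranks.
void reassignRanks(std::vector<Hit>& hits, bool higher_score_better)
{
  std::stable_sort(hits.begin(), hits.end(),
                   [higher_score_better](const Hit& a, const Hit& b)
                   {
                     return higher_score_better ? a.score > b.score
                                                : a.score < b.score;
                   });
  unsigned rank = 0;
  for (auto& hit : hits)
  {
    hit.rank = ++rank;
  }
  // Postcondition: the best hit, if any, has rank 1.
  assert(hits.empty() || hits.front().rank == 1);
}
```

The orientation flag matters because some scores are better when larger and others (for example, E-value-like scores) when smaller; the sketch leaves that decision to the caller.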

◆ writeFragmentAnnotations_()

static void writeFragmentAnnotations_ ( const String &  tag_name,
std::ostream &  os,
const std::vector< PeptideHit::PeakAnnotation > &  annotations,
UInt  indent 
)
static protected

Helper function to write out fragment annotations as the user param fragment_annotation.

Friends And Related Function Documentation

◆ Internal::ConsensusXMLHandler

friend class Internal::ConsensusXMLHandler
friend

◆ Internal::FeatureXMLHandler

friend class Internal::FeatureXMLHandler
friend

Member Data Documentation

◆ current_analysis_result_

PeptideHit::PepXMLAnalysisResult current_analysis_result_
protected

Temporary analysis result instance.

◆ document_id_

String* document_id_
protected

Document identifier.

◆ id_

String id_
protected

Temporary id.

◆ last_meta_

MetaInfoInterface* last_meta_
protected

Pointer to last read object with MetaInfoInterface.

◆ param_

ProteinIdentification::SearchParameters param_
protected

Temporary search parameters variable.

◆ parameters_

std::map<String, ProteinIdentification::SearchParameters> parameters_
protected

Search parameters map (key is the "id")

◆ pep_hit_

PeptideHit pep_hit_
protected

Temporary peptide hit.

◆ pep_id_

PeptideIdentification pep_id_
protected

Temporary peptide identification.

◆ pep_ids_

std::vector<PeptideIdentification>* pep_ids_
protected

Pointer to fill in peptide identifications.

◆ peptide_evidences_

std::vector<PeptideEvidence> peptide_evidences_
protected

Temporary peptide evidences.

◆ prot_hit_

ProteinHit prot_hit_
protected

Temporary protein hit.

◆ prot_id_

ProteinIdentification prot_id_
protected

Temporary protein identification.

◆ prot_id_in_run_

bool prot_id_in_run_
protected

true if a prot id is contained in the current run

◆ prot_ids_

std::vector<ProteinIdentification>* prot_ids_
protected

Pointer to fill in protein identifications.

◆ proteinid_to_accession_

std::unordered_map<std::string, String> proteinid_to_accession_
protected

Map from protein id to accession.