OpenMS
TransitionTSVFile Class Reference

This class supports reading and writing of OpenSWATH transition lists.

#include <OpenMS/ANALYSIS/OPENSWATH/TransitionTSVFile.h>


Classes

struct  TSVTransition
 Internal structure to represent a transition.
 

Private Member Functions

Reader helper functions
void getTSVHeader_ (const std::string &line, char &delimiter, std::map< std::string, int > &header_dict) const
 Determine separator in a CSV file and check for correct headers.

void readUnstructuredTSVInput_ (const char *filename, FileTypes::Type filetype, std::vector< TSVTransition > &transition_list)
 Read tab or comma separated input with columns defined by their column headers only.

void spectrastRTExtract (const String &str_inp, double &value, bool &spectrast_legacy)
 Extract retention time from a SpectraST comment string.

bool spectrastAnnotationExtract (const String &str_inp, TSVTransition &mytransition)
 Extract annotation from a SpectraST comment string.

void cleanupTransitions_ (TSVTransition &mytransition)
 Cleanup of the read fields (removing quotes etc.)
 

Conversion functions from TSVTransition objects to OpenMS data structures

These functions convert the relevant data from a TSVTransition to the data structures used by the TraML handler or by the internal LightTargetedExperiment.

typedef std::vector< OpenMS::TargetedExperiment::Protein > ProteinVectorType

typedef std::vector< OpenMS::TargetedExperiment::Peptide > PeptideVectorType

typedef std::vector< OpenMS::ReactionMonitoringTransition > TransitionVectorType
 
String retentionTimeInterpretation_
 
bool override_group_label_check_
 
bool force_invalid_mods_
 
static const char * strarray_ []
 
static const std::vector< std::string > header_names_
 
void TSVToTargetedExperiment_ (std::vector< TSVTransition > &transition_list, OpenMS::TargetedExperiment &exp)
 Convert a list of TSVTransition to a TargetedExperiment.

void TSVToTargetedExperiment_ (std::vector< TSVTransition > &transition_list, OpenSwath::LightTargetedExperiment &exp)
 Convert a list of TSVTransition to a LightTargetedExperiment.

TransitionTSVFile::TSVTransition convertTransition_ (const ReactionMonitoringTransition *it, OpenMS::TargetedExperiment &targeted_exp)
 Convert an OpenMS transition to a TSVTransition for output writing.

void updateMembers_ () override
 Synchronize members with param class.
 

Conversion helper functions

void resolveMixedSequenceGroups_ (std::vector< TSVTransition > &transition_list) const
 Resolve cases where the same peptide label group has different sequences.

void createTransition_ (std::vector< TSVTransition >::iterator &tr_it, OpenMS::ReactionMonitoringTransition &rm_trans)
 Populate a new ReactionMonitoringTransition object from a row in the csv.

void createProtein_ (String protein_name, const String &uniprot_id, OpenMS::TargetedExperiment::Protein &protein)
 Populate a new TargetedExperiment::Protein object from a row in the csv.

void interpretRetentionTime_ (std::vector< TargetedExperiment::RetentionTime > &retention_times, const OpenMS::DataValue &rt_value)
 Helper function to assign retention times to compounds and peptides.

void createPeptide_ (std::vector< TSVTransition >::const_iterator tr_it, OpenMS::TargetedExperiment::Peptide &peptide)
 Populate a new TargetedExperiment::Peptide object from a row in the csv.

void createCompound_ (std::vector< TSVTransition >::const_iterator tr_it, OpenMS::TargetedExperiment::Compound &compound)
 Populate a new TargetedExperiment::Compound object (a metabolite) from a row in the csv.

void addModification_ (std::vector< TargetedExperiment::Peptide::Modification > &mods, int location, const ResidueModification &rmod)
 Add a modification at the specified location.

void writeTSVOutput_ (const char *filename, OpenMS::TargetedExperiment &targeted_exp)
 Write a TargetedExperiment to a file.

Public Member Functions
 
 TransitionTSVFile ()
 Constructor.

 ~TransitionTSVFile () override
 Destructor.

void convertTargetedExperimentToTSV (const char *filename, OpenMS::TargetedExperiment &targeted_exp)
 Write out a targeted experiment (TraML structure) into a tsv file.

void convertTSVToTargetedExperiment (const char *filename, FileTypes::Type filetype, OpenMS::TargetedExperiment &targeted_exp)
 Read in a tsv/mrm file and construct a targeted experiment (TraML structure)

void convertTSVToTargetedExperiment (const char *filename, FileTypes::Type filetype, OpenSwath::LightTargetedExperiment &targeted_exp)
 Read in a tsv file and construct a targeted experiment (Light transition structure)

void validateTargetedExperiment (const OpenMS::TargetedExperiment &targeted_exp)
 Validate a TargetedExperiment (check that all ids are unique)
 

Additional Inherited Members

- Public Types inherited from ProgressLogger
enum  LogType { CMD , GUI , NONE }
 Possible log types.

- Public Member Functions inherited from ProgressLogger
 ProgressLogger ()
 Constructor.

virtual ~ProgressLogger ()
 Destructor.

 ProgressLogger (const ProgressLogger &other)
 Copy constructor.

ProgressLogger & operator= (const ProgressLogger &other)
 Assignment Operator.

void setLogType (LogType type) const
 Sets the progress log that should be used. The default type is NONE!

LogType getLogType () const
 Returns the type of progress log being used.

void startProgress (SignedSize begin, SignedSize end, const String &label) const
 Initializes the progress display.

void setProgress (SignedSize value) const
 Sets the current progress.

void endProgress () const
 Ends the progress display.

void nextProgress () const
 Increment progress by 1 (according to range begin-end)
 
- Public Member Functions inherited from DefaultParamHandler
 DefaultParamHandler (const String &name)
 Constructor with name that is displayed in error messages.

 DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor.

virtual ~DefaultParamHandler ()
 Destructor.

DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator.

virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator.

void setParameters (const Param &param)
 Sets the parameters.

const Param & getParameters () const
 Non-mutable access to the parameters.

const Param & getDefaults () const
 Non-mutable access to the default parameters.

const String & getName () const
 Non-mutable access to the name.

void setName (const String &name)
 Mutable access to the name.

const std::vector< String > & getSubsections () const
 Non-mutable access to the registered subsections.
 
- Static Public Member Functions inherited from DefaultParamHandler
static void writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="")
 Writes all parameters to meta values.
 
- Protected Member Functions inherited from DefaultParamHandler
void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor.
 
- Static Protected Member Functions inherited from ProgressLogger
static String logTypeToFactoryName_ (LogType type)
 Return the name of the factory product used for this log type.
 
- Protected Attributes inherited from ProgressLogger
LogType type_
 
time_t last_invoke_
 
ProgressLoggerImpl * current_logger_
 
- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters.

Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes!

std::vector< String > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes!

String error_name_
 Name that is displayed in error messages during the parameter checking.

bool check_defaults_
 If this member is set to false, no checking of the parameters is done.

bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when defaults are empty.
 
- Static Protected Attributes inherited from ProgressLogger
static int recursion_depth_
 

Detailed Description

This class supports reading and writing of OpenSWATH transition lists.

The transition lists can be either comma- or tab-separated plain text files (CSV or TSV format). Modifications should be provided in UniMod format (see remark 1), but can also be provided in TPP format. For another file format that stores transitions, see also TransitionPQPFile.

The following columns are required:

PrecursorMz* | float | This describes the precursor ion m/z
ProductMz* | float; synonyms: FragmentMz | This specifies the product ion m/z
LibraryIntensity* | float; synonyms: RelativeFragmentIntensity | This specifies the relative intensity of the fragment ion
NormalizedRetentionTime* | float; synonyms: RetentionTime, Tr_recalibrated, iRT, RetentionTimeCalculatorScore | This specifies the expected retention time (normalized retention time)

For targeted proteomics files, the following additional columns should be provided:

GeneName** | free text | Gene name (unique gene identifier)
ProteinId** | free text; synonyms: ProteinName | Protein identifier
PeptideSequence** | free text; synonyms: Sequence, StrippedSequence | sequence only (no modifications)
ModifiedPeptideSequence** | free text; synonyms: FullUniModPeptideName, FullPeptideName, ModifiedSequence | should contain modifications (see remark 1)
PrecursorCharge** | integer; synonyms: Charge | contains the charge of the precursor ion
ProductCharge** | integer; synonyms: FragmentCharge | contains the fragment charge
FragmentType | free text | contains the type of the fragment, e.g. "b" or "y"
FragmentSeriesNumber | integer; synonyms: FragmentNumber | e.g. for y7 use "7" here

OpenSWATH uses grouped transitions to detect candidate analyte signals. These groups are by default generated based on the input, but can also be manually specified:

TransitionGroupId** | free text; synonyms: TransitionGroupName, transition_group_id | designates the transition group [e.g. peptide] to which this transition belongs
TransitionId** | free text; synonyms: TransitionName, transition_name | needs to be unique for each transition [in this file]
Decoy | 1: decoy, 0: target; synonyms: decoy, IsDecoy | determines whether the transition is a decoy transition or not
PeptideGroupLabel | free text | designates the peptide label group (as defined in MS:1000893) to which the peptide belongs (see remark 2)
DetectingTransition | 0 or 1; synonyms: detecting_transition | 1: use transition to detect peak group, 0: don't use transition for detection
IdentifyingTransition | 0 or 1; synonyms: identifying_transition | 1: use transition for peptidoform inference in the IPF Workflow, 0: don't use transition for identification
QuantifyingTransition | 0 or 1; synonyms: quantifying_transition | 1: use transition to quantify peak group, 0: don't use transition for quantification

Optionally, the following columns can be specified but they are not actively used by OpenSWATH:

CollisionEnergy | float; synonyms: CE | Collision energy
Annotation | free text | Transition-level annotation, e.g. y7
UniprotId | free text; synonyms: UniprotID | A Uniprot identifier
LabelType | free text | optional description of which label was used, e.g. heavy or light

For targeted metabolomics files, the following fields are also supported:

CompoundName** | free text; synonyms: CompoundId | Should be unique for the analyte; if present, the file will be interpreted as a metabolomics file
SMILES | free text | SMILES identifier of the compound
SumFormula | free text | molecular formula of the compound (e.g. H2O)

Fields indicated with * are strictly required to create a TraML file. Fields indicated with ** are recommended, but only required for a specific application (such as using the transition list for an analysis tool such as OpenSwathWorkflow) or in a specific context (proteomics or metabolomics).
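
The synonym handling and the check for the strictly required (*) columns described above can be illustrated with a small standalone sketch. The function names canonicalColumn and missingRequiredColumns are hypothetical and are not part of OpenMS; only the synonyms of the four required columns are covered, and the way OpenMS resolves synonyms internally may differ.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Map a header name to its canonical column name (hypothetical helper).
// Only the synonyms of the four strictly required columns are listed here.
std::string canonicalColumn(const std::string& name)
{
  static const std::map<std::string, std::string> synonyms = {
    {"FragmentMz", "ProductMz"},
    {"RelativeFragmentIntensity", "LibraryIntensity"},
    {"RetentionTime", "NormalizedRetentionTime"},
    {"Tr_recalibrated", "NormalizedRetentionTime"},
    {"iRT", "NormalizedRetentionTime"},
    {"RetentionTimeCalculatorScore", "NormalizedRetentionTime"}
  };
  const auto it = synonyms.find(name);
  return it == synonyms.end() ? name : it->second;
}

// Return the strictly required (*) columns that are absent from the header.
std::vector<std::string> missingRequiredColumns(const std::vector<std::string>& header)
{
  std::set<std::string> present;
  for (const std::string& name : header) present.insert(canonicalColumn(name));

  std::vector<std::string> missing;
  for (const char* required : {"PrecursorMz", "ProductMz", "LibraryIntensity", "NormalizedRetentionTime"})
  {
    if (present.count(required) == 0) missing.emplace_back(required);
  }
  return missing;
}
```

With this sketch, a header consisting of PrecursorMz, FragmentMz, LibraryIntensity and iRT has no missing required columns, because FragmentMz and iRT resolve to ProductMz and NormalizedRetentionTime.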

Remarks:

  • 1. Modifications should be supplied inside the sequence using UniMod identifiers or free-text identifiers that are understood by OpenMS. See also OpenMS::AASequence for more information. For example:
    • PEPT(Phosphorylation)IDE(UniMod:27)A
  • 2. Peptide label groups designate groups of peptides that are isotopically modified forms of the same peptide species. For example, the heavy and light forms of the same peptide will both be assigned the same peptide group label. For example:
    • PEPTIDEAK -> gets label "PEPTIDEAK_gr1"
    • PEPTIDEAK[+8] -> gets label "PEPTIDEAK_gr1"
    • PEPT(Phosphorylation)IDEAK -> gets label "PEPTIDEAK_gr2"
    • PEPT(Phosphorylation)IDEAK[+8] -> gets label "PEPTIDEAK_gr2"
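
To illustrate how the PeptideSequence column (sequence only) relates to the ModifiedPeptideSequence column in the examples above, the following sketch removes the "(...)" and "[...]" annotations from a modified sequence. The helper stripModifications is hypothetical; it assumes annotations are not nested and does not handle terminal modification syntax. In OpenMS itself, sequence parsing is handled by OpenMS::AASequence.

```cpp
#include <string>

// Hypothetical helper: drop every "(...)" or "[...]" annotation and keep only
// the residue letters. Assumes that annotations are not nested.
std::string stripModifications(const std::string& modified)
{
  std::string stripped;
  char closing = '\0'; // closing bracket being waited for, '\0' outside an annotation
  for (const char c : modified)
  {
    if (closing != '\0')
    {
      if (c == closing) closing = '\0'; // annotation ends here
    }
    else if (c == '(') closing = ')';
    else if (c == '[') closing = ']';
    else stripped += c;
  }
  return stripped;
}
```

Applied to the examples above, both PEPTIDEAK[+8] and PEPT(Phosphorylation)IDEAK[+8] reduce to PEPTIDEAK.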
Parameters of this class are:

Name | Type | Default | Restrictions | Description
retentionTimeInterpretation | string | iRT | iRT, seconds, minutes | How to interpret the provided retention time (the retention time column can be interpreted as iRT, minutes or seconds)
override_group_label_check | string | false | true, false | Override an internal check that ensures that all members of the same PeptideGroupLabel have the same PeptideSequence (this ensures that only different isotopic forms of the same peptide can be grouped together in the same label group). Only override this check if you know what you are doing.
force_invalid_mods | string | false | true, false | Force reading even if invalid modifications are encountered (OpenMS may not recognize the modification)


Member Typedef Documentation

◆ PeptideVectorType

◆ ProteinVectorType

◆ TransitionVectorType

Constructor & Destructor Documentation

◆ TransitionTSVFile()

Constructor.

◆ ~TransitionTSVFile()

~TransitionTSVFile ( )
override

Destructor.

Member Function Documentation

◆ addModification_()

void addModification_ ( std::vector< TargetedExperiment::Peptide::Modification > &  mods,
int  location,
const ResidueModification &  rmod
)
private

Add a modification at the specified location.

◆ cleanupTransitions_()

void cleanupTransitions_ ( TSVTransition &  mytransition)
private

Cleanup of the read fields (removing quotes etc.)
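
As an illustration of one such cleanup step, the following sketch removes a pair of enclosing double quotes from a field. The helper stripQuotes is hypothetical and only covers the quote case; the actual set of cleanup steps performed by cleanupTransitions_ is not documented here.

```cpp
#include <string>

// Hypothetical helper: remove one pair of enclosing double quotes, if present.
std::string stripQuotes(const std::string& field)
{
  if (field.size() >= 2 && field.front() == '"' && field.back() == '"')
  {
    return field.substr(1, field.size() - 2);
  }
  return field; // unquoted (or single-character) fields are returned unchanged
}
```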

◆ convertTargetedExperimentToTSV()

void convertTargetedExperimentToTSV ( const char *  filename,
OpenMS::TargetedExperiment &  targeted_exp
)

Write out a targeted experiment (TraML structure) into a tsv file.

Parameters
filename | The output file
targeted_exp | The targeted experiment

◆ convertTransition_()

TransitionTSVFile::TSVTransition convertTransition_ ( const ReactionMonitoringTransition *  it,
OpenMS::TargetedExperiment &  targeted_exp
)
protected

Convert an OpenMS transition to a TSVTransition for output writing.

◆ convertTSVToTargetedExperiment() [1/2]

void convertTSVToTargetedExperiment ( const char *  filename,
FileTypes::Type  filetype,
OpenMS::TargetedExperiment &  targeted_exp
)

Read in a tsv/mrm file and construct a targeted experiment (TraML structure)

Parameters
filename | The input file
filetype | The type of file ("mrm" or "tsv")
targeted_exp | The output targeted experiment

Referenced by TOPPOpenSwathBase::loadTransitionList().

◆ convertTSVToTargetedExperiment() [2/2]

void convertTSVToTargetedExperiment ( const char *  filename,
FileTypes::Type  filetype,
OpenSwath::LightTargetedExperiment &  targeted_exp
)

Read in a tsv file and construct a targeted experiment (Light transition structure)

Parameters
filename | The input file
filetype | The type of file ("mrm" or "tsv")
targeted_exp | The output targeted experiment

◆ createCompound_()

void createCompound_ ( std::vector< TSVTransition >::const_iterator  tr_it,
OpenMS::TargetedExperiment::Compound &  compound
)
private

Populate a new TargetedExperiment::Compound object (a metabolite) from a row in the csv.

◆ createPeptide_()

void createPeptide_ ( std::vector< TSVTransition >::const_iterator  tr_it,
OpenMS::TargetedExperiment::Peptide &  peptide
)
private

Populate a new TargetedExperiment::Peptide object from a row in the csv.

◆ createProtein_()

void createProtein_ ( String  protein_name,
const String &  uniprot_id,
OpenMS::TargetedExperiment::Protein &  protein
)
private

Populate a new TargetedExperiment::Protein object from a row in the csv.

◆ createTransition_()

void createTransition_ ( std::vector< TSVTransition >::iterator &  tr_it,
OpenMS::ReactionMonitoringTransition &  rm_trans
)
private

Populate a new ReactionMonitoringTransition object from a row in the csv.

◆ getTSVHeader_()

void getTSVHeader_ ( const std::string &  line,
char &  delimiter,
std::map< std::string, int > &  header_dict 
) const
private

Determine separator in a CSV file and check for correct headers.

Parameters
line | The header to be parsed
delimiter | The delimiter which will be determined from the input
header_dict | The map which maps the fields in the header to their position
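
The idea of determining the delimiter and recording the column positions can be sketched as follows. This is a standalone illustration, not the OpenMS implementation: the function name parseHeader is hypothetical, the rule "a header containing a tab is tab-separated, otherwise comma-separated" is an assumption, and the check for correct headers is omitted.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: guess the delimiter of a header line and record the
// position of each column name. Returns the delimiter that was chosen.
char parseHeader(const std::string& line, std::map<std::string, int>& header_dict)
{
  // Assumption: a header containing at least one tab is tab-separated.
  const char delimiter = (line.find('\t') != std::string::npos) ? '\t' : ',';

  header_dict.clear();
  std::istringstream stream(line);
  std::string field;
  int position = 0;
  while (std::getline(stream, field, delimiter))
  {
    // Strip a trailing carriage return left over from Windows line endings.
    if (!field.empty() && field.back() == '\r') field.pop_back();
    header_dict[field] = position++;
  }
  return delimiter;
}
```

Looking up columns by name through such a map is what allows the input columns to appear in any order.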

◆ interpretRetentionTime_()

void interpretRetentionTime_ ( std::vector< TargetedExperiment::RetentionTime > &  retention_times,
const OpenMS::DataValue &  rt_value
)
private

Helper function to assign retention times to compounds and peptides.
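
The retentionTimeInterpretation parameter accepts the values iRT, seconds and minutes. The sketch below shows one way to turn that string into a typed value before retention times are assigned. The enum RTKind and the function parseRTInterpretation are hypothetical names introduced for illustration; how interpretRetentionTime_ represents the interpretation internally is not specified here.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical enumeration of the three documented interpretations.
enum class RTKind { NormalizedIRT, Seconds, Minutes };

// Hypothetical helper: map the parameter string to an RTKind and reject
// anything outside the documented restrictions.
RTKind parseRTInterpretation(const std::string& value)
{
  if (value == "iRT") return RTKind::NormalizedIRT;
  if (value == "seconds") return RTKind::Seconds;
  if (value == "minutes") return RTKind::Minutes;
  throw std::invalid_argument("unsupported retentionTimeInterpretation: " + value);
}
```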

◆ readUnstructuredTSVInput_()

void readUnstructuredTSVInput_ ( const char *  filename,
FileTypes::Type  filetype,
std::vector< TSVTransition > &  transition_list 
)
private

Read tab or comma separated input with columns defined by their column headers only.

Parameters
filename | The input file
filetype | The type of file ("mrm" or "tsv")
transition_list | The output list of transitions

◆ resolveMixedSequenceGroups_()

void resolveMixedSequenceGroups_ ( std::vector< TSVTransition > &  transition_list) const
private

Resolve cases where the same peptide label group has different sequences.

Since members in a peptide label group (MS:1000893) should only be isotopically modified forms of the same peptide, having different peptide sequences (different AA sequences) within the same group most likely constitutes an error. This function fixes the error by erasing the provided "peptide group label" for a peptide and replacing it with the peptide identifier (transition group id).

Parameters
transition_list | The list of transitions to be fixed.
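
The fix described above can be sketched in two passes over the rows. The struct Row is a hypothetical, reduced stand-in for TSVTransition with only the three relevant fields, and resolveMixedGroups is an illustrative name; the real function may differ in detail.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, reduced row type; the real TSVTransition has more fields.
struct Row
{
  std::string group_id;    // TransitionGroupId
  std::string sequence;    // PeptideSequence
  std::string group_label; // PeptideGroupLabel
};

void resolveMixedGroups(std::vector<Row>& rows)
{
  // First pass: collect the distinct sequences seen for each label.
  std::map<std::string, std::set<std::string>> sequences_per_label;
  for (const Row& row : rows) sequences_per_label[row.group_label].insert(row.sequence);

  // Second pass: labels that mix several sequences are replaced by the
  // transition group id of the respective row.
  for (Row& row : rows)
  {
    if (sequences_per_label.at(row.group_label).size() > 1) row.group_label = row.group_id;
  }
}
```

Rows whose label covers a single sequence, such as the light and heavy forms of one peptide, keep their label unchanged.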

◆ spectrastAnnotationExtract()

bool spectrastAnnotationExtract ( const String &  str_inp,
TSVTransition &  mytransition
)
private

Extract annotation from a SpectraST comment string.

◆ spectrastRTExtract()

void spectrastRTExtract ( const String &  str_inp,
double &  value,
bool &  spectrast_legacy 
)
private

Extract retention time from a SpectraST comment string.

◆ TSVToTargetedExperiment_() [1/2]

void TSVToTargetedExperiment_ ( std::vector< TSVTransition > &  transition_list,
OpenMS::TargetedExperiment &  exp
)
protected

Convert a list of TSVTransition to a TargetedExperiment.

Converts the list (read from a csv/mrm file) into an object model using the TargetedExperiment with proper hierarchical structure from Transition to Peptide to Protein.

◆ TSVToTargetedExperiment_() [2/2]

void TSVToTargetedExperiment_ ( std::vector< TSVTransition > &  transition_list,
OpenSwath::LightTargetedExperiment &  exp
)
protected

Convert a list of TSVTransition to a LightTargetedExperiment.

Converts the list (read from a csv/mrm file) into an object model using the LightTargetedExperiment with proper hierarchical structure from Transition to Peptide to Protein.

◆ updateMembers_()

void updateMembers_ ( )
override protected virtual

Synchronize members with param class.

Reimplemented from DefaultParamHandler.

◆ validateTargetedExperiment()

void validateTargetedExperiment ( const OpenMS::TargetedExperiment &  targeted_exp)

Validate a TargetedExperiment (check that all ids are unique)
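
A uniqueness check of this kind can be sketched with a set of already seen identifiers. The function duplicatedIds is a hypothetical standalone helper operating on plain strings; which ids the real method inspects and how it reports a violation are not covered by this sketch.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: return every id that occurs more than once, each
// reported a single time, in order of its first repeated occurrence.
std::vector<std::string> duplicatedIds(const std::vector<std::string>& ids)
{
  std::set<std::string> seen;
  std::set<std::string> reported;
  std::vector<std::string> duplicates;
  for (const std::string& id : ids)
  {
    // insert().second is false when the id was already present in the set.
    if (!seen.insert(id).second && reported.insert(id).second) duplicates.push_back(id);
  }
  return duplicates;
}
```

An empty result means that all ids are unique.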

◆ writeTSVOutput_()

void writeTSVOutput_ ( const char *  filename,
OpenMS::TargetedExperiment &  targeted_exp
)
private

Write a TargetedExperiment to a file.

Parameters
filename | Name of the output file
targeted_exp | The data structure to be written to the file

Member Data Documentation

◆ force_invalid_mods_

bool force_invalid_mods_
private

◆ header_names_

const std::vector<std::string> header_names_
static private

◆ override_group_label_check_

bool override_group_label_check_
private

◆ retentionTimeInterpretation_

String retentionTimeInterpretation_
private

◆ strarray_

const char* strarray_[]
static private