OpenMS
OpenPepXLAlgorithm Class Reference

Search for peptide pairs linked with a labeled cross-linker.

#include <OpenMS/ANALYSIS/XLMS/OpenPepXLAlgorithm.h>


Public Types

enum class  ExitCodes { EXECUTION_OK , ILLEGAL_PARAMETERS , UNEXPECTED_RESULT , INCOMPATIBLE_INPUT_DATA }
 Outcome of run, distinguishing successful execution from configuration / input problems detected before the search is attempted.
 
- Public Types inherited from ProgressLogger
enum  LogType { CMD , GUI , NONE }
 Possible log types.
 

Public Member Functions

 OpenPepXLAlgorithm ()
 Default constructor.
 
 ~OpenPepXLAlgorithm () override
 Default destructor.
 
ExitCodes run (PeakMap &unprocessed_spectra, ConsensusMap &cfeatures, std::vector< FASTAFile::FASTAEntry > &fasta_db, std::vector< ProteinIdentification > &protein_ids, PeptideIdentificationList &peptide_ids, OPXLDataStructs::PreprocessedPairSpectra &preprocessed_pair_spectra, std::vector< std::pair< Size, Size > > &spectrum_pairs, std::vector< std::vector< OPXLDataStructs::CrossLinkSpectrumMatch > > &all_top_csms, PeakMap &spectra)
 Performs the main function of this class, the search for cross-linked peptides.
 
- Public Member Functions inherited from DefaultParamHandler
 DefaultParamHandler (const std::string &name)
 Constructor with name that is displayed in error messages.
 
 DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor.
 
virtual ~DefaultParamHandler ()
 Destructor.
 
DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator.
 
virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator.
 
void setParameters (const Param &param)
 Sets the parameters.
 
const Param & getParameters () const
 Non-mutable access to the parameters.
 
const Param & getDefaults () const
 Non-mutable access to the default parameters.
 
const std::string & getName () const
 Non-mutable access to the name.
 
void setName (const std::string &name)
 Mutable access to the name.
 
const std::vector< std::string > & getSubsections () const
 Non-mutable access to the registered subsections.
 
- Public Member Functions inherited from ProgressLogger
 ProgressLogger ()
 Constructor.
 
virtual ~ProgressLogger ()
 Destructor.
 
 ProgressLogger (const ProgressLogger &other)
 Copy constructor.
 
ProgressLogger & operator= (const ProgressLogger &other)
 Assignment Operator.
 
void setLogType (LogType type) const
 Sets the progress log that should be used. The default type is NONE!
 
LogType getLogType () const
 Returns the type of progress log being used.
 
void setLogger (ProgressLoggerImpl *logger)
 Sets the logger to be used for progress logging.
 
void startProgress (SignedSize begin, SignedSize end, const std::string &label) const
 Initializes the progress display.
 
void setProgress (SignedSize value) const
 Sets the current progress.
 
void endProgress (UInt64 bytes_processed=0) const
 
void nextProgress () const
 Increments the progress by 1 (according to the range begin-end).
 

Private Member Functions

void updateMembers_ () override
 This method is used to update extra member variables at the end of the setParameters() method.
 

Static Private Member Functions

static OPXLDataStructs::PreprocessedPairSpectra preprocessPairs_ (const PeakMap &spectra, const std::vector< std::pair< Size, Size > > &spectrum_pairs, const double cross_link_mass_iso_shift, double fragment_mass_tolerance, double fragment_mass_tolerance_xlinks, bool fragment_mass_tolerance_unit_ppm, bool deisotope)
 Split labelled spectrum pairs into linear- and cross-link-bearing peak lists.
 

Private Attributes

std::string decoy_string_
 Cached value of parameter "decoy_string"; substring marking decoy entries in the FASTA accessions.
 
bool decoy_prefix_
 Cached value of parameter "decoy_prefix"; if true the decoy string is matched as a prefix, otherwise as a suffix.
 
Int min_precursor_charge_
 Cached value of parameter "precursor:min_charge".
 
Int max_precursor_charge_
 Cached value of parameter "precursor:max_charge".
 
double precursor_mass_tolerance_
 Cached value of parameter "precursor:mass_tolerance" (unit given by precursor_mass_tolerance_unit_ppm_).
 
bool precursor_mass_tolerance_unit_ppm_
 Cached value of parameter "precursor:mass_tolerance_unit" == "ppm".
 
IntList precursor_correction_steps_
 Cached value of parameter "precursor:corrections" — monoisotopic-peak-misassignment offsets to try.
 
double fragment_mass_tolerance_
 Cached value of parameter "fragment:mass_tolerance" (linear fragment ions)
 
double fragment_mass_tolerance_xlinks_
 Cached value of parameter "fragment:mass_tolerance_xlinks" (cross-link-bearing fragment ions)
 
bool fragment_mass_tolerance_unit_ppm_
 Cached value of parameter "fragment:mass_tolerance_unit" == "ppm".
 
StringList cross_link_residue1_
 Cached value of parameter "cross_linker:residue1" — residues the first end of the linker reacts with.
 
StringList cross_link_residue2_
 Cached value of parameter "cross_linker:residue2" — residues the second end of the linker reacts with.
 
double cross_link_mass_light_
 Cached value of parameter "cross_linker:mass_light" — mass added by the light cross-linker.
 
double cross_link_mass_iso_shift_
 Cached value of parameter "cross_linker:mass_iso_shift" — mass difference heavy minus light.
 
DoubleList cross_link_mass_mono_link_
 Cached value of parameter "cross_linker:mass_mono_link" — possible mono-link masses.
 
std::string cross_link_name_
 Cached value of parameter "cross_linker:name" — used to disambiguate mass-equivalent linkers.
 
StringList fixedModNames_
 Cached value of parameter "modifications:fixed" (UniMod names); duplicates trigger ExitCodes::ILLEGAL_PARAMETERS.
 
StringList varModNames_
 Cached value of parameter "modifications:variable" (UniMod names); duplicates trigger ExitCodes::ILLEGAL_PARAMETERS.
 
Size max_variable_mods_per_peptide_
 Cached value of parameter "modifications:variable_max_per_peptide".
 
Size peptide_min_size_
 Cached value of parameter "peptide:min_size".
 
Size missed_cleavages_
 Cached value of parameter "peptide:missed_cleavages".
 
std::string enzyme_name_
 Cached value of parameter "peptide:enzyme".
 
Int number_top_hits_
 Cached value of parameter "algorithm:number_top_hits".
 
std::string deisotope_mode_
 Cached value of parameter "algorithm:deisotope" ("true" / "false" / "auto")
 
std::string add_y_ions_
 Cached value of parameter "ions:y_ions".
 
std::string add_b_ions_
 Cached value of parameter "ions:b_ions".
 
std::string add_x_ions_
 Cached value of parameter "ions:x_ions".
 
std::string add_a_ions_
 Cached value of parameter "ions:a_ions".
 
std::string add_c_ions_
 Cached value of parameter "ions:c_ions".
 
std::string add_z_ions_
 Cached value of parameter "ions:z_ions".
 
std::string add_losses_
 Cached value of parameter "ions:neutral_losses".
 

Additional Inherited Members

- Static Public Member Functions inherited from DefaultParamHandler
static void writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const std::string &key_prefix="")
 Writes all parameters to meta values.
 
- Protected Member Functions inherited from DefaultParamHandler
void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor.
 
- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters.
 
Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes!
 
std::vector< std::string > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes!
 
std::string error_name_
 Name that is displayed in error messages during the parameter checking.
 
bool check_defaults_
 If this member is set to false, no parameter checking is done.
 
bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when defaults are empty.
 
- Protected Attributes inherited from ProgressLogger
LogType type_
 
time_t last_invoke_
 
ProgressLoggerImpl * current_logger_
 
- Static Protected Attributes inherited from ProgressLogger
static int recursion_depth_
 

Detailed Description

Search for peptide pairs linked with a labeled cross-linker.

This tool performs a search for cross-links in the given mass spectra. It uses linked MS1 features to pair up MS2 spectra and uses these pairs to find the fragment peaks that contain the linker and those that do not.

It executes the following steps in order:

  • Processing of spectra: deisotoping and filtering
  • Digesting and preprocessing the protein database, building a peptide pair index dependent on the precursor masses of the MS2 spectra
  • Generating theoretical spectra of cross-linked peptides and aligning the experimental spectra against those
  • Scoring of cross-link spectrum matches
  • Using PeptideIndexer to map the peptides to all possible source proteins

See below for available parameters and more functionality.

Input: MS2 spectra, linked features from FeatureFinderMultiplex, and a FASTA database of proteins expected to be cross-linked in the sample

The spectra should be provided as one PeakMap. If you have multiple files, e.g. for multiple fractions, you should run this tool on each file separately. The database should be provided as a vector of FASTAEntries containing the target and decoy proteins. A ConsensusMap that links the MS1 feature pairs from heavy and light cross-linkers is also required. It can be generated by the tool FeatureFinderMultiplex.

Setting up FeatureFinderMultiplex: In the advanced options of the FeatureFinderMultiplex parameters, change the mass of one of the labels to the mass difference between the light and heavy cross-linker (e.g. change the mass of Arg6 to 12.075321 for labeled DSS). The parameter -labels should contain one empty label ( [] ) and the label you adapted (e.g. [][Arg6]). For the other settings, refer to the documentation of FeatureFinderMultiplex.

Parameters

The parameters for fixed and variable modifications refer to additional modifications besides the cross-linker. The linker used in the experiment has to be described using the cross-linker-specific parameters. Only one mass is allowed for a cross-linker that links two peptides (cross_linker:mass_light), while multiple masses are possible for mono-links of the same cross-linking reagent. Mono-links are cross-linkers that are attached to one peptide by only one of their two reactive groups. The masses refer to the light version of the linker. The parameter cross_linker:mass_iso_shift defines the mass difference between the light and heavy versions of the cross-linker and the mono-links. The parameters cross_linker:residue1 and cross_linker:residue2 enumerate the amino acids that each end of the linker can react with. This way any heterobifunctional cross-linker can be defined. To define a homobifunctional cross-linker, these two parameters should have the same value. The parameter cross_linker:name is used to resolve ambiguities arising from different cross-linkers having the same mass after the linking reaction (see section on output for clarification).
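The mass bookkeeping described in this paragraph can be illustrated with a small, self-contained sketch. It does not use the OpenMS API: the helper names crossLinkedMass, monoLinkedMass and heavyMass are hypothetical, and any numeric values passed to them are up to the caller (12.075321 is the labeled-DSS shift quoted in the input section).

```cpp
// Hypothetical helpers, not part of OpenMS. All masses are neutral masses in Da.

// Two peptides joined by the light cross-linker (cross_linker:mass_light).
double crossLinkedMass(double alpha_mass, double beta_mass, double mass_light)
{
  return alpha_mass + beta_mass + mass_light;
}

// One peptide carrying a mono-link (one entry of cross_linker:mass_mono_link).
double monoLinkedMass(double peptide_mass, double mono_link_mass)
{
  return peptide_mass + mono_link_mass;
}

// The heavy variant of either species is offset by cross_linker:mass_iso_shift.
double heavyMass(double light_mass, double mass_iso_shift)
{
  return light_mass + mass_iso_shift;
}
```

Because only one cross-link mass is allowed but several mono-link masses are possible, a caller would evaluate crossLinkedMass once per peptide pair and monoLinkedMass once per listed mono-link mass.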

Output: XL-MS Identifications with scores and linked positions in the proteins

The arguments protein_ids and peptide_ids are filled with the XL-MS search parameters and identifications.

Potential predecessor tools: none listed
Potential successor tools: none listed

Member Enumeration Documentation

◆ ExitCodes

enum class ExitCodes
strong

Outcome of run, distinguishing successful execution from configuration / input problems detected before the search is attempted.

Enumerator
EXECUTION_OK 

Search ran to completion; the output arguments contain the results.

ILLEGAL_PARAMETERS 

Parameter set is inconsistent — e.g. duplicate entries in modifications:fixed or modifications:variable (see .cpp lines 182, 189).

UNEXPECTED_RESULT 

Reserved sentinel; not returned by the current implementation.

INCOMPATIBLE_INPUT_DATA 

Input is unusable — the spectrum container is empty (only chromatograms), or one of its spectra is not sorted by m/z (see .cpp lines 200, 209).
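The conditions listed for ILLEGAL_PARAMETERS and INCOMPATIBLE_INPUT_DATA can be sketched as a stand-alone pre-check. This is an illustration under stated assumptions, not the OpenMS implementation: the enum is a local stand-in for OpenPepXLAlgorithm::ExitCodes, the helpers hasDuplicates and precheck are hypothetical, each spectrum is reduced to a plain list of m/z values, and the order of the checks is only inferred from the .cpp line numbers cited above.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Local stand-in mirroring the documented enumerators.
enum class ExitCodes { EXECUTION_OK, ILLEGAL_PARAMETERS, UNEXPECTED_RESULT, INCOMPATIBLE_INPUT_DATA };

// Hypothetical helper: true if any name occurs more than once.
bool hasDuplicates(const std::vector<std::string>& names)
{
  std::set<std::string> unique_names(names.begin(), names.end());
  return unique_names.size() != names.size();
}

// Hypothetical pre-check over modification names and per-spectrum m/z lists.
ExitCodes precheck(const std::vector<std::string>& fixed_mods,
                   const std::vector<std::string>& variable_mods,
                   const std::vector<std::vector<double>>& spectra_mz)
{
  // Duplicate modification names make the parameter set inconsistent.
  if (hasDuplicates(fixed_mods) || hasDuplicates(variable_mods))
  {
    return ExitCodes::ILLEGAL_PARAMETERS;
  }
  // An empty spectrum container cannot be searched.
  if (spectra_mz.empty())
  {
    return ExitCodes::INCOMPATIBLE_INPUT_DATA;
  }
  // Every spectrum must be sorted by m/z.
  for (const auto& mz : spectra_mz)
  {
    if (!std::is_sorted(mz.begin(), mz.end()))
    {
      return ExitCodes::INCOMPATIBLE_INPUT_DATA;
    }
  }
  return ExitCodes::EXECUTION_OK;
}
```

A caller of run() would typically compare the returned value against ExitCodes::EXECUTION_OK before reading any of the output arguments.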

Constructor & Destructor Documentation

◆ OpenPepXLAlgorithm()

Default constructor.

◆ ~OpenPepXLAlgorithm()

~OpenPepXLAlgorithm ( )
override

Default destructor.

Member Function Documentation

◆ preprocessPairs_()

static OPXLDataStructs::PreprocessedPairSpectra preprocessPairs_ ( const PeakMap & spectra,
const std::vector< std::pair< Size, Size > > &  spectrum_pairs,
const double  cross_link_mass_iso_shift,
double  fragment_mass_tolerance,
double  fragment_mass_tolerance_xlinks,
bool  fragment_mass_tolerance_unit_ppm,
bool  deisotope 
)
static private

Split labelled spectrum pairs into linear- and cross-link-bearing peak lists.

For every (spectrum_pairs[i].first, spectrum_pairs[i].second) pair (light/heavy spectrum) the function aligns the two spectra and uses cross_link_mass_iso_shift to tell which peaks are common to both (linear ions, no cross-linker attached) and which are present at the expected shifted m/z (peaks that carry the cross-linker). The result has three parallel PeakMap members (linear / xlink / all) sized to spectrum_pairs.size(). The outer loop is OpenMP-parallel.

Parameters
[in] spectra: Source spectrum container (light and heavy MS2 stored together).
[in] spectrum_pairs: Indices into spectra giving (light, heavy) pairs.
[in] cross_link_mass_iso_shift: Mass difference (Da) between heavy and light cross-linker; controls the shift used for matching.
[in] fragment_mass_tolerance: Tolerance for matching linear (un-shifted) fragment ions.
[in] fragment_mass_tolerance_xlinks: Tolerance for matching cross-link-bearing fragment ions.
[in] fragment_mass_tolerance_unit_ppm: If true, both tolerances are in ppm; otherwise Th.
[in] deisotope: If true, the cross-link peak list keeps a parallel "iso_peak_count" IntegerDataArray.
Returns
An OPXLDataStructs::PreprocessedPairSpectra with one entry per input pair in spectra_linear_peaks / spectra_xlink_peaks / spectra_all_peaks.
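The idea of separating common peaks from shifted peaks can be sketched on plain m/z lists. This is a deliberately simplified illustration and not the OpenMS implementation, which works on PeakMap spectra and uses spectrum alignment. The types and helpers below (SplitPeaks, absTolerance, containsNear, splitPair) are hypothetical, and the max_charge argument is an assumption introduced here so that the shift can be tested as cross_link_mass_iso_shift / z for each fragment charge z.

```cpp
#include <cmath>
#include <vector>

// Hypothetical result type holding light-spectrum m/z values only.
struct SplitPeaks
{
  std::vector<double> linear; // also present, un-shifted, in the heavy spectrum
  std::vector<double> xlink;  // found shifted by iso_shift / z in the heavy spectrum
};

// Absolute tolerance window at a given m/z (ppm or absolute units).
double absTolerance(double mz, double tol, bool unit_ppm)
{
  return unit_ppm ? mz * tol * 1e-6 : tol;
}

// True if any value in mzs lies within +/- window of target.
bool containsNear(const std::vector<double>& mzs, double target, double window)
{
  for (double mz : mzs)
  {
    if (std::fabs(mz - target) <= window) return true;
  }
  return false;
}

// Classify every light peak as linear, cross-link-bearing, or neither.
SplitPeaks splitPair(const std::vector<double>& light, const std::vector<double>& heavy,
                     double iso_shift, double tol, double tol_xlinks, bool unit_ppm,
                     int max_charge)
{
  SplitPeaks out;
  for (double mz : light)
  {
    if (containsNear(heavy, mz, absTolerance(mz, tol, unit_ppm)))
    {
      out.linear.push_back(mz);
      continue;
    }
    for (int z = 1; z <= max_charge; ++z)
    {
      const double shifted = mz + iso_shift / z;
      if (containsNear(heavy, shifted, absTolerance(shifted, tol_xlinks, unit_ppm)))
      {
        out.xlink.push_back(mz);
        break;
      }
    }
  }
  return out;
}
```

Light peaks that match neither the un-shifted nor any shifted position are dropped from both lists in this sketch.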

◆ run()

ExitCodes run ( PeakMap & unprocessed_spectra,
ConsensusMap & cfeatures,
std::vector< FASTAFile::FASTAEntry > &  fasta_db,
std::vector< ProteinIdentification > &  protein_ids,
PeptideIdentificationList & peptide_ids,
OPXLDataStructs::PreprocessedPairSpectra & preprocessed_pair_spectra,
std::vector< std::pair< Size, Size > > &  spectrum_pairs,
std::vector< std::vector< OPXLDataStructs::CrossLinkSpectrumMatch > > &  all_top_csms,
PeakMap & spectra
)

Performs the main function of this class, the search for cross-linked peptides.

Parameters
[in,out] unprocessed_spectra: The input PeakMap of experimental spectra.
[in] cfeatures: Consensus features linking light and heavy mass pairs, e.g. created by FeatureFinderMultiplex.
[in] fasta_db: The protein database containing targets and decoys.
[in,out] protein_ids: A result vector containing search settings. Should contain one ProteinIdentification.
[out] peptide_ids: A result vector containing cross-link spectrum matches as PeptideIdentifications and PeptideHits. Should be empty.
[out] preprocessed_pair_spectra: A result structure containing linear and cross-linked ion spectra. Will be overwritten. This is only necessary for writing out xQuest type spectrum files.
[out] spectrum_pairs: A result vector containing paired spectra indices. Should be empty. This is only necessary for writing out xQuest type spectrum files.
[out] all_top_csms: A result vector containing cross-link spectrum matches as CrossLinkSpectrumMatches. Should be empty. This is only necessary for writing out xQuest type spectrum files.
[out] spectra: A result vector containing the input spectra after preprocessing and filtering. Should be empty. This is only necessary for writing out xQuest type spectrum files.

◆ updateMembers_()

void updateMembers_ ( )
override private virtual

This method is used to update extra member variables at the end of the setParameters() method.

Also call it at the end of the derived classes' copy constructor and assignment operator.

The default implementation is empty.

Reimplemented from DefaultParamHandler.
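The caching pattern behind updateMembers_() can be shown with a minimal stand-in. The classes ParamHandlerSketch and SearchSketch are hypothetical and use a std::map in place of Param; they only illustrate how a base class calls the hook at the end of setParameters() and how a derived class copies parameter values into members such as fragment_mass_tolerance_.

```cpp
#include <map>
#include <string>

// Minimal stand-in for the handler pattern (not the OpenMS class).
class ParamHandlerSketch
{
public:
  virtual ~ParamHandlerSketch() = default;

  void setParameters(const std::map<std::string, double>& param)
  {
    param_ = param;
    updateMembers_(); // hook is invoked at the end of setParameters()
  }

protected:
  virtual void updateMembers_() {} // default implementation is empty
  std::map<std::string, double> param_;
};

// Derived class that caches one parameter in a member variable.
class SearchSketch : public ParamHandlerSketch
{
public:
  double fragmentTolerance() const { return fragment_mass_tolerance_; }

private:
  void updateMembers_() override
  {
    const auto it = param_.find("fragment:mass_tolerance");
    if (it != param_.end())
    {
      fragment_mass_tolerance_ = it->second;
    }
  }

  double fragment_mass_tolerance_ = 0.0;
};
```

Caching the values once per setParameters() call avoids repeated string lookups inside the search loops; that motivation is a general observation about this pattern rather than a statement from the OpenMS documentation.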

Member Data Documentation

◆ add_a_ions_

std::string add_a_ions_
private

Cached value of parameter "ions:a_ions".

◆ add_b_ions_

std::string add_b_ions_
private

Cached value of parameter "ions:b_ions".

◆ add_c_ions_

std::string add_c_ions_
private

Cached value of parameter "ions:c_ions".

◆ add_losses_

std::string add_losses_
private

Cached value of parameter "ions:neutral_losses".

◆ add_x_ions_

std::string add_x_ions_
private

Cached value of parameter "ions:x_ions".

◆ add_y_ions_

std::string add_y_ions_
private

Cached value of parameter "ions:y_ions".

◆ add_z_ions_

std::string add_z_ions_
private

Cached value of parameter "ions:z_ions".

◆ cross_link_mass_iso_shift_

double cross_link_mass_iso_shift_
private

Cached value of parameter "cross_linker:mass_iso_shift" — mass difference heavy minus light.

◆ cross_link_mass_light_

double cross_link_mass_light_
private

Cached value of parameter "cross_linker:mass_light" — mass added by the light cross-linker.

◆ cross_link_mass_mono_link_

DoubleList cross_link_mass_mono_link_
private

Cached value of parameter "cross_linker:mass_mono_link" — possible mono-link masses.

◆ cross_link_name_

std::string cross_link_name_
private

Cached value of parameter "cross_linker:name" — used to disambiguate mass-equivalent linkers.

◆ cross_link_residue1_

StringList cross_link_residue1_
private

Cached value of parameter "cross_linker:residue1" — residues the first end of the linker reacts with.

◆ cross_link_residue2_

StringList cross_link_residue2_
private

Cached value of parameter "cross_linker:residue2" — residues the second end of the linker reacts with.

◆ decoy_prefix_

bool decoy_prefix_
private

Cached value of parameter "decoy_prefix"; if true the decoy string is matched as a prefix, otherwise as a suffix.

◆ decoy_string_

std::string decoy_string_
private

Cached value of parameter "decoy_string"; substring marking decoy entries in the FASTA accessions.
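How decoy_string_ and decoy_prefix_ interact can be sketched with a small stand-alone function. The helper isDecoy is hypothetical and only mirrors the prefix-or-suffix rule stated above; the accessions and decoy strings used with it are examples, not OpenMS defaults.

```cpp
#include <string>

// Hypothetical helper: decide whether a FASTA accession marks a decoy entry.
bool isDecoy(const std::string& accession, const std::string& decoy_string, bool decoy_prefix)
{
  if (decoy_string.empty() || accession.size() < decoy_string.size())
  {
    return false;
  }
  if (decoy_prefix)
  {
    // Match at the start of the accession.
    return accession.compare(0, decoy_string.size(), decoy_string) == 0;
  }
  // Otherwise match at the end of the accession.
  return accession.compare(accession.size() - decoy_string.size(),
                           decoy_string.size(), decoy_string) == 0;
}
```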

◆ deisotope_mode_

std::string deisotope_mode_
private

Cached value of parameter "algorithm:deisotope" ("true" / "false" / "auto")

◆ enzyme_name_

std::string enzyme_name_
private

Cached value of parameter "peptide:enzyme".

◆ fixedModNames_

StringList fixedModNames_
private

Cached value of parameter "modifications:fixed" (UniMod names); duplicates trigger ExitCodes::ILLEGAL_PARAMETERS.

◆ fragment_mass_tolerance_

double fragment_mass_tolerance_
private

Cached value of parameter "fragment:mass_tolerance" (linear fragment ions)

◆ fragment_mass_tolerance_unit_ppm_

bool fragment_mass_tolerance_unit_ppm_
private

Cached value of parameter "fragment:mass_tolerance_unit" == "ppm".

◆ fragment_mass_tolerance_xlinks_

double fragment_mass_tolerance_xlinks_
private

Cached value of parameter "fragment:mass_tolerance_xlinks" (cross-link-bearing fragment ions)

◆ max_precursor_charge_

Int max_precursor_charge_
private

Cached value of parameter "precursor:max_charge".

◆ max_variable_mods_per_peptide_

Size max_variable_mods_per_peptide_
private

Cached value of parameter "modifications:variable_max_per_peptide".

◆ min_precursor_charge_

Int min_precursor_charge_
private

Cached value of parameter "precursor:min_charge".

◆ missed_cleavages_

Size missed_cleavages_
private

Cached value of parameter "peptide:missed_cleavages".

◆ number_top_hits_

Int number_top_hits_
private

Cached value of parameter "algorithm:number_top_hits".
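A minimal sketch of what a top-hits limit does, assuming the parameter caps how many best-scoring matches are kept per spectrum (this page does not spell that out). The helper keepTopHits is hypothetical and works on bare scores instead of CrossLinkSpectrumMatch objects.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: keep the best-scoring entries, highest score first.
std::vector<double> keepTopHits(std::vector<double> scores, std::size_t number_top_hits)
{
  const std::size_t n = std::min(number_top_hits, scores.size());
  // Only the first n positions need to be in sorted order.
  std::partial_sort(scores.begin(),
                    scores.begin() + static_cast<std::ptrdiff_t>(n),
                    scores.end(),
                    std::greater<double>());
  scores.resize(n);
  return scores;
}
```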

◆ peptide_min_size_

Size peptide_min_size_
private

Cached value of parameter "peptide:min_size".

◆ precursor_correction_steps_

IntList precursor_correction_steps_
private

Cached value of parameter "precursor:corrections" — monoisotopic-peak-misassignment offsets to try.
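The effect of the correction offsets can be sketched as follows, assuming each step subtracts a multiple of the 13C-12C mass difference from the measured precursor mass. Both the helper correctedPrecursorMasses and the constant kC13C12MassDiff are names introduced for this illustration; the sign and size of the step are assumptions, not taken from this page.

```cpp
#include <vector>

// Assumed constant: mass difference between 13C and 12C in Da.
constexpr double kC13C12MassDiff = 1.0033548378;

// Hypothetical helper: candidate monoisotopic masses for one precursor,
// one per entry in the list of correction steps.
std::vector<double> correctedPrecursorMasses(double precursor_mass,
                                             const std::vector<int>& correction_steps)
{
  std::vector<double> masses;
  masses.reserve(correction_steps.size());
  for (int step : correction_steps)
  {
    masses.push_back(precursor_mass - step * kC13C12MassDiff);
  }
  return masses;
}
```

With steps 0, 1 and 2, a search would test the measured mass itself plus two lower candidates, which covers the case where the instrument picked the first or second isotopic peak instead of the monoisotopic one.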

◆ precursor_mass_tolerance_

double precursor_mass_tolerance_
private

Cached value of parameter "precursor:mass_tolerance" (unit given by precursor_mass_tolerance_unit_ppm_).

◆ precursor_mass_tolerance_unit_ppm_

bool precursor_mass_tolerance_unit_ppm_
private

Cached value of parameter "precursor:mass_tolerance_unit" == "ppm".

◆ varModNames_

StringList varModNames_
private

Cached value of parameter "modifications:variable" (UniMod names); duplicates trigger ExitCodes::ILLEGAL_PARAMETERS.