OpenMS
OPXLHelper Class Reference

The OPXLHelper class contains functions needed by OpenPepXL and OpenPepXLLF to reduce duplicated code. More...

#include <OpenMS/ANALYSIS/XLMS/OPXLHelper.h>

Classes

struct  PeptideIDScoreComparator
 A comparator for PeptideIdentifications that compares the scores in the first PeptideHit. More...
 

Static Public Member Functions

static std::vector< OPXLDataStructs::XLPrecursorenumerateCrossLinksAndMasses (const std::vector< OPXLDataStructs::AASeqWithMass > &peptides, double cross_link_mass_light, const DoubleList &cross_link_mass_mono_link, const StringList &cross_link_residue1, const StringList &cross_link_residue2, const std::vector< double > &spectrum_precursors, std::vector< int > &precursor_correction_positions, double precursor_mass_tolerance, bool precursor_mass_tolerance_unit_ppm)
 Enumerates precursor masses for all candidates in an XL-MS search. More...
 
static std::vector< OPXLDataStructs::AASeqWithMassdigestDatabase (std::vector< FASTAFile::FASTAEntry > fasta_db, const EnzymaticDigestion &digestor, Size min_peptide_length, const StringList &cross_link_residue1, const StringList &cross_link_residue2, const ModifiedPeptideGenerator::MapToResidueType &fixed_modifications, const ModifiedPeptideGenerator::MapToResidueType &variable_modifications, Size max_variable_mods_per_peptide)
 Digests a database with the given EnzymaticDigestion settings and precomputes masses for all peptides. More...
 
static std::vector< OPXLDataStructs::ProteinProteinCrossLinkbuildCandidates (const std::vector< OPXLDataStructs::XLPrecursor > &candidates, const std::vector< int > &precursor_corrections, const std::vector< int > &precursor_correction_positions, const std::vector< OPXLDataStructs::AASeqWithMass > &peptide_masses, const StringList &cross_link_residue1, const StringList &cross_link_residue2, double cross_link_mass, const DoubleList &cross_link_mass_mono_link, const std::vector< double > &spectrum_precursor_vector, const std::vector< double > &allowed_error_vector, const String &cross_link_name)
 Builds specific cross-link candidates with all possible combinations of linked positions from peptide pairs. Used to build candidates for the precursor mass window of a single MS2 spectrum. More...
 
static void buildFragmentAnnotations (std::vector< PeptideHit::PeakAnnotation > &frag_annotations, const std::vector< std::pair< Size, Size > > &matching, const PeakSpectrum &theoretical_spectrum, const PeakSpectrum &experiment_spectrum)
 Fills up the given FragmentAnnotation vector with annotations from a theoretical spectrum. More...
 
static void buildPeptideIDs (std::vector< PeptideIdentification > &peptide_ids, const std::vector< OPXLDataStructs::CrossLinkSpectrumMatch > &top_csms_spectrum, std::vector< std::vector< OPXLDataStructs::CrossLinkSpectrumMatch > > &all_top_csms, Size all_top_csms_current_index, const PeakMap &spectra, Size scan_index, Size scan_index_heavy)
 Builds PeptideIdentifications and PeptideHits. More...
 
static void addProteinPositionMetaValues (std::vector< PeptideIdentification > &peptide_ids)
 adds MetaValues for cross-link positions to PeptideHits More...
 
static void addXLTargetDecoyMV (std::vector< PeptideIdentification > &peptide_ids)
 adds xl_target_decoy MetaValue that combines alpha and beta target_decoy info More...
 
static void addBetaAccessions (std::vector< PeptideIdentification > &peptide_ids)
 adds accessions_beta MetaValue to alpha peptides for TOPPView visualization and CSV table output More...
 
static void removeBetaPeptideHits (std::vector< PeptideIdentification > &peptide_ids)
 removes beta peptides from cross-link IDs, since all info is already contained in the alpha peptide hits More...
 
static void addPercolatorFeatureList (ProteinIdentification &prot_id)
 adds the list of features that percolator should use for OpenPepXL More...
 
static void computeDeltaScores (std::vector< PeptideIdentification > &peptide_ids)
 sorts PeptideHits for each PeptideIdentification by score and adds the delta score as a MetaValue More...
 
static std::vector< PeptideIdentificationcombineTopRanksFromPairs (std::vector< PeptideIdentification > &peptide_ids, Size number_top_hits)
 combines all hits to spectrum pairs with the same light spectrum into one ranked list More...
 
static std::vector< OPXLDataStructs::ProteinProteinCrossLinkcollectPrecursorCandidates (const IntList &precursor_correction_steps, double precursor_mass, double precursor_mass_tolerance, bool precursor_mass_tolerance_unit_ppm, const std::vector< OPXLDataStructs::AASeqWithMass > &filtered_peptide_masses, double cross_link_mass, const DoubleList &cross_link_mass_mono_link, const StringList &cross_link_residue1, const StringList &cross_link_residue2, String cross_link_name, bool use_sequence_tags=false, const std::vector< std::string > &tags=std::vector< std::string >())
 Searches for cross-link candidates for a MS/MS spectrum. More...
 
static double computePrecursorError (const OPXLDataStructs::CrossLinkSpectrumMatch &csm, double precursor_mz, int precursor_charge)
 Computes the mass error of a precursor mass to a hit. More...
 
static void isoPeakMeans (OPXLDataStructs::CrossLinkSpectrumMatch &csm, DataArrays::IntegerDataArray &num_iso_peaks_array, std::vector< std::pair< Size, Size > > &matched_spec_linear_alpha, std::vector< std::pair< Size, Size > > &matched_spec_linear_beta, std::vector< std::pair< Size, Size > > &matched_spec_xlinks_alpha, std::vector< std::pair< Size, Size > > &matched_spec_xlinks_beta)
 Computes the mass error of a precursor mass to a hit. More...
 
static void filterPrecursorsByTags (std::vector< OPXLDataStructs::XLPrecursor > &candidates, std::vector< int > &precursor_correction_positions, const std::vector< std::string > &tags)
 Filters the list of candidates for cases that include at least one of the tags in at least one of the two sequences. More...
 

Detailed Description

The OPXLHelper class contains functions needed by OpenPepXL and OpenPepXLLF to reduce duplicated code.

Member Function Documentation

◆ addBetaAccessions()

static void addBetaAccessions ( std::vector< PeptideIdentification > &  peptide_ids)
static

adds accessions_beta MetaValue to alpha peptides for TOPPView visualization and CSV table output

Parameters
peptide_idsThe vector of peptide_ids containing XL-MS search results with alpha and beta PeptideHits, after mapping of peptides to proteins

◆ addPercolatorFeatureList()

static void addPercolatorFeatureList ( ProteinIdentification prot_id)
static

adds the list of features that percolator should use for OpenPepXL

Parameters
search_paramsThe search parameters of OpenPepXL

◆ addProteinPositionMetaValues()

static void addProteinPositionMetaValues ( std::vector< PeptideIdentification > &  peptide_ids)
static

adds MetaValues for cross-link positions to PeptideHits

Parameters
peptide_idsThe vector of peptide_ids containing XL-MS search results with alpha and beta PeptideHits, after mapping of peptides to proteins

◆ addXLTargetDecoyMV()

static void addXLTargetDecoyMV ( std::vector< PeptideIdentification > &  peptide_ids)
static

adds xl_target_decoy MetaValue that combines alpha and beta target_decoy info

Parameters
peptide_idsThe vector of peptide_ids containing XL-MS search results with alpha and beta PeptideHits, after mapping of peptides to proteins

◆ buildCandidates()

static std::vector<OPXLDataStructs::ProteinProteinCrossLink> buildCandidates ( const std::vector< OPXLDataStructs::XLPrecursor > &  candidates,
const std::vector< int > &  precursor_corrections,
const std::vector< int > &  precursor_correction_positions,
const std::vector< OPXLDataStructs::AASeqWithMass > &  peptide_masses,
const StringList cross_link_residue1,
const StringList cross_link_residue2,
double  cross_link_mass,
const DoubleList cross_link_mass_mono_link,
const std::vector< double > &  spectrum_precursor_vector,
const std::vector< double > &  allowed_error_vector,
const String cross_link_name 
)
static

Builds specific cross-link candidates with all possible combinations of linked positions from peptide pairs. Used to build candidates for the precursor mass window of a single MS2 spectrum.

Parameters
candidatesThe XLPrecursors containing indices of two peptides
peptide_massesThe digested peptide database, that the indices in the XLPrecursors refer to
cross_link_residue1A list of residues, to which the first side of the linker can react
cross_link_residue2A list of residues, to which the second side of the linker can react
cross_link_massmass of the cross-linker, only the light one if a labeled linker is used
cross_link_mass_mono_linkA list of possible masses for the cross-link, if it is attached to a peptide on one side
precursor_massThe precursor mass of the experimental spectrum (used to filter out certain candidates, e.g. mono- and loop-links have a different mass)
allowed_errorThe maximal precursor mass error in Da
cross_link_nameThe name of the cross-linker
n_term_linkerTrue, if the cross-linker can react with the N-terminal of a protein
c_term_linkerTrue, if the cross-linker can react with the C-terminal of a protein
Returns
A vector of ProteinProteinCrossLink candidates containing all necessary information to generate theoretical spectra

◆ buildFragmentAnnotations()

static void buildFragmentAnnotations ( std::vector< PeptideHit::PeakAnnotation > &  frag_annotations,
const std::vector< std::pair< Size, Size > > &  matching,
const PeakSpectrum theoretical_spectrum,
const PeakSpectrum experiment_spectrum 
)
static

Fills up the given FragmentAnnotation vector with annotations from a theoretical spectrum.

This function takes an alignment of a theoretical spectrum with meta information and an experimental spectrum and builds annotations taking the MZ and intensity values from the experimental spectrum and the ion names and charges from the theoretical spectrum to annotate matched experimental peaks.

Parameters
frag_annotationsThe vector to fill. Does not have to be empty, as annotations from several alignments can just be added on to the same vector.
matchingThe alignment between the two spectra
theoretical_spectrumThe theoretical spectrum with meta information
experiment_spectrumThe experimental spectrum

◆ buildPeptideIDs()

static void buildPeptideIDs ( std::vector< PeptideIdentification > &  peptide_ids,
const std::vector< OPXLDataStructs::CrossLinkSpectrumMatch > &  top_csms_spectrum,
std::vector< std::vector< OPXLDataStructs::CrossLinkSpectrumMatch > > &  all_top_csms,
Size  all_top_csms_current_index,
const PeakMap spectra,
Size  scan_index,
Size  scan_index_heavy 
)
static

Builds PeptideIdentifications and PeptideHits.

Parameters
peptide_idsThe vector of PeptideIdentifications for the whole experiment. The created PepIds will be pushed on this one.
top_csms_spectrumAll CrossLinkSpectrumMatches from the current spectrum to be written out
all_top_csmsA vector of all CrossLinkSpectrumMatches of the experiment, that is also extended in this function
all_top_csms_current_indexThe index of the current spectrum in all_top_csms (some spectra have no matches, so this is not equal to the spectrum index)
spectraThe searched spectra as a PeakMap
scan_indexThe index of the current spectrum
scan_index_heavyThe index of the heavy spectrum in a spectrum pair with labeled linkers

◆ collectPrecursorCandidates()

static std::vector<OPXLDataStructs::ProteinProteinCrossLink> collectPrecursorCandidates ( const IntList precursor_correction_steps,
double  precursor_mass,
double  precursor_mass_tolerance,
bool  precursor_mass_tolerance_unit_ppm,
const std::vector< OPXLDataStructs::AASeqWithMass > &  filtered_peptide_masses,
double  cross_link_mass,
const DoubleList cross_link_mass_mono_link,
const StringList cross_link_residue1,
const StringList cross_link_residue2,
String  cross_link_name,
bool  use_sequence_tags = false,
const std::vector< std::string > &  tags = std::vector< std::string >() 
)
static

Searches for cross-link candidates for a MS/MS spectrum.

This function uses enumerateCrossLinksAndMasses and buildCandidates to search for peptide pairs fitting to the given precursor mass_light and all considered precursor corrections.

Parameters
precursor_correction_stepsAn IntList of integers as indices of isotopic peaks around the experimental precursor
precursor_massThe decharged precursor mass
precursor_mass_toleranceThe precursor tolerance
precursor_mass_tolerance_unit_ppmThe unit of the precursor tolerance. "ppm" if true, "Da" if false
filtered_peptide_massesA vector of AASeqWithMass containing the sorted (ascending) peptide database with precomputed peptide masses
cross_link_massThe mass of the cross-linker (light mass, if labeled)
cross_link_mass_mono_linkA list of possible mono-link masses
cross_link_residue1A list of one-letter-code residues, that the first side of the cross-linker can attach to
cross_link_residue2A list of one-letter-code residues, that the second side of the cross-linker can attach to
cross_link_nameThe name of the cross-linker, e.g. "DSS" or "BS3"
use_sequence_tagsWhether to use sequence tags to filter out candidates
tagsThe list of sequence tags that are used to filter candidate sequences. Only applied if use_sequence_tags = true

◆ combineTopRanksFromPairs()

static std::vector< PeptideIdentification > combineTopRanksFromPairs ( std::vector< PeptideIdentification > &  peptide_ids,
Size  number_top_hits 
)
static

combines all hits to spectrum pairs with the same light spectrum into one ranked list

This function is a post-processing step for OpenPepXL with labeled linkers. This function collects PeptideIdentifications from all spectrum pairs with the same light spectrum, then resorts them by the score, makes them unique in case of equal candidates and reduces their number down to the chosen number of reported top hits.

Parameters
peptide_idsPeptideIdentifications from a Cross-Linking MS search with labeled linkers
number_top_hitsThe chosen number of reported top hits

◆ computeDeltaScores()

static void computeDeltaScores ( std::vector< PeptideIdentification > &  peptide_ids)
static

sorts PeptideHits for each PeptideIdentification by score and adds the delta score as a MetaValue

Parameters
peptide_idsThe vector of peptide_ids containing XL-MS search results without beta PeptideHits

◆ computePrecursorError()

static double computePrecursorError ( const OPXLDataStructs::CrossLinkSpectrumMatch csm,
double  precursor_mz,
int  precursor_charge 
)
static

Computes the mass error of a precursor mass to a hit.

Parameters
csmThe cross-link spectrum match containing the hit
precursor_mzThe precursor mz of the MS/MS spectrum
precursor_chargeThe charge of the precursor

◆ digestDatabase()

static std::vector<OPXLDataStructs::AASeqWithMass> digestDatabase ( std::vector< FASTAFile::FASTAEntry fasta_db,
const EnzymaticDigestion digestor,
Size  min_peptide_length,
const StringList cross_link_residue1,
const StringList cross_link_residue2,
const ModifiedPeptideGenerator::MapToResidueType fixed_modifications,
const ModifiedPeptideGenerator::MapToResidueType variable_modifications,
Size  max_variable_mods_per_peptide 
)
static

Digests a database with the given EnzymaticDigestion settings and precomputes masses for all peptides.

Also keeps track of the peptides at protein terminals and builds peptide candidates with all possible modification patterns according to the parameters.

Parameters
fasta_dbThe protein database
digestorThe object containing the digestion settings, e.g. the enzyme
min_peptide_lengthThe minimal peptide length for the digestion
cross_link_residue1A list of residues, to which the first side of the linker can react
cross_link_residue2A list of residues, to which the second side of the linker can react
fixed_modificationsThe list of fixed modifications
variable_modificationsThe list of variable modifications
max_variable_mods_per_peptideThe maximal number of variable modifications per peptide
count_proteinsA variable to keep track of the number of proteins in the database. Should be an externally declared variable and = 0 when calling this function.
count_peptidesA variable to keep track of the number of peptides after digestion. Should be an externally declared variable and = 0 when calling this function.
n_term_linkerTrue, if the cross-linker can react with the N-terminal of a protein
c_term_linkerTrue, if the cross-linker can react with the C-terminal of a protein
Returns
A vector of AASeqWithMass containing the peptides, their masses and information about terminal peptides

◆ enumerateCrossLinksAndMasses()

static std::vector<OPXLDataStructs::XLPrecursor> enumerateCrossLinksAndMasses ( const std::vector< OPXLDataStructs::AASeqWithMass > &  peptides,
double  cross_link_mass_light,
const DoubleList cross_link_mass_mono_link,
const StringList cross_link_residue1,
const StringList cross_link_residue2,
const std::vector< double > &  spectrum_precursors,
std::vector< int > &  precursor_correction_positions,
double  precursor_mass_tolerance,
bool  precursor_mass_tolerance_unit_ppm 
)
static

Enumerates precursor masses for all candidates in an XL-MS search.

Assumes the list of peptides and the list of spectrum precursor masses are sorted by mass in ascending order, and the list of mono-link masses is sorted in descending order.

Parameters
peptidesThe peptides with precomputed masses from the digestDatabase function
cross_link_mass_lightMass of the cross-linker, only the light one if a labeled linker is used
cross_link_mass_mono_linkA list of possible masses for the cross-link, if it is attached to a peptide on one side
cross_link_residue1A list of residues, to which the first side of the linker can react
cross_link_residue2A list of residues, to which the second side of the linker can react
spectrum_precursorsA vector of all MS2 precursor masses of the searched spectra. Used to filter out candidates.
precursor_correction_positionsA vector of the position of the used precursor correction
precursor_mass_toleranceThe precursor mass tolerance
precursor_mass_tolerance_unit_ppmThe unit of the precursor mass tolerance ("Da" or "ppm")
Returns
A vector of XLPrecursors containing all possible candidate cross-links

◆ filterPrecursorsByTags()

static void filterPrecursorsByTags ( std::vector< OPXLDataStructs::XLPrecursor > &  candidates,
std::vector< int > &  precursor_correction_positions,
const std::vector< std::string > &  tags 
)
static

Filters the list of candidates for cases that include at least one of the tags in at least one of the two sequences.

Parameters
candidatesThe list of XLPrecursors as enumerated by e.g. enumerateCrossLinksAndMasses
tagsThe list of tags for the current spectrum produced by the Tagger

◆ isoPeakMeans()

static void isoPeakMeans ( OPXLDataStructs::CrossLinkSpectrumMatch csm,
DataArrays::IntegerDataArray num_iso_peaks_array,
std::vector< std::pair< Size, Size > > &  matched_spec_linear_alpha,
std::vector< std::pair< Size, Size > > &  matched_spec_linear_beta,
std::vector< std::pair< Size, Size > > &  matched_spec_xlinks_alpha,
std::vector< std::pair< Size, Size > > &  matched_spec_xlinks_beta 
)
static

Computes the mass error of a precursor mass to a hit.

Parameters
csmThe cross-link spectrum match containing the hit
precursor_mzThe precursor mz of the MS/MS spectrum
precursor_chargeThe charge of the precursor

◆ removeBetaPeptideHits()

static void removeBetaPeptideHits ( std::vector< PeptideIdentification > &  peptide_ids)
static

removes beta peptides from cross-link IDs, since all info is already contained in the alpha peptide hits

Parameters
peptide_idsThe vector of peptide_ids containing XL-MS search results with alpha and beta PeptideHits