OpenMS
Chemistry

Classes

class  AAIndex
 Representation of selected AAIndex properties. More...
 
class  AASequence
 Representation of a peptide/protein sequence. More...
 
class  DigestionEnzyme
 Base class for digestion enzymes. More...
 
class  DigestionEnzymeDB< DigestionEnzymeType, InstanceType >
 Digestion enzyme database (base class) More...
 
class  DigestionEnzymeProtein
 Representation of a digestion enzyme for proteins (protease) More...
 
class  DigestionEnzymeRNA
 Representation of a digestion enzyme for RNA (RNase) More...
 
class  Element
 Representation of an element. More...
 
class  ElementDB
 Singleton that stores elements and isotopes. More...
 
class  EmpiricalFormula
 Representation of an empirical formula. More...
 
class  EnzymaticDigestion
 Class for the enzymatic digestion of sequences. More...
 
class  CoarseIsotopePatternGenerator
 Isotope pattern generator for coarse isotope distributions. More...
 
class  FineIsotopePatternGenerator
 Isotope pattern generator for fine isotope distributions. More...
 
class  ModificationDefinition
 
class  ModificationDefinitionsSet
 
class  ModificationsDB
 Database that holds all residue modifications from UniMod More...
 
class  MonosaccharideDB
 Singleton database of monosaccharides for glycan notation. More...
 
class  NASequence
 Representation of a nucleic acid sequence. More...
 
class  NucleicAcidSpectrumGenerator
 Generates theoretical spectra for nucleic acid sequences. More...
 
struct  ConversionIssue
 Description of a conversion issue from Peptidoform to AASequence. More...
 
struct  CvAccession
 Controlled vocabulary accession for a modification. More...
 
struct  NamedMod
 Named modification with optional CV prefix hint. More...
 
struct  MassDelta
 Mass delta modification with optional source hint. More...
 
struct  FormulaTag
 Chemical formula with optional charge. More...
 
struct  GlycanComposition
 Glycan composition specification. More...
 
struct  InfoTag
 Info tag for arbitrary text annotations. More...
 
struct  PositionConstraint
 Position constraint specifying allowed residues for a modification. More...
 
struct  Label
 Label for cross-links, branches, or ambiguous grouping. More...
 
struct  Modification
 A modification with one or more alternative tags. More...
 
struct  SequenceElement
 A single amino acid with its modifications. More...
 
struct  AmbiguousRegion
 Ambiguous amino acid region. More...
 
struct  ModifiedRange
 Modified sequence range with shared modifications. More...
 
struct  UnlocalisedMod
 Unlocalised modification with optional occurrence count. More...
 
struct  LabileModification
 Labile modification that may be lost during fragmentation. More...
 
struct  GlobalModification
 Global modification applied to specific locations. More...
 
struct  IsotopeReplacement
 Isotope replacement for stable isotope labeling. More...
 
struct  AdductIon
 Adduct ion specification for charge state. More...
 
struct  Peptidoform
 A single peptidoform (one peptide chain) More...
 
struct  PeptidoformIon
 A peptidoform ion (one or more chains with optional charge) More...
 
struct  CrossLinkGroup
 Cross-link group connecting sites across chains. More...
 
class  ProFormaParseError
 Structured parse error with context for ProForma parsing. More...
 
class  ProFormaParser
 Recursive descent parser for ProForma v2 peptidoform notation. More...
 
class  ProFormaTokenizer
 Tokenizer for ProForma v2 peptidoform notation. More...
 
class  ProFormaWriter
 Writer for ProForma v2 peptidoform notation. More...
 
class  ProteaseDB
 Database for enzymes that digest proteins (proteases) More...
 
class  ProteaseDigestion
 Class for the enzymatic digestion of proteins represented as AASequence or String. More...
 
class  Residue
 Representation of an amino acid residue. More...
 
class  ResidueDB
 OpenMS stores a central database of all residues in the ResidueDB. All (unmodified) residues are added to the database on construction. Modified residues get created and added if getModifiedResidue is called. More...
 
class  Ribonucleotide
 Representation of a ribonucleotide (modified or unmodified) More...
 
class  RibonucleotideDB
 Database of ribonucleotides (modified and unmodified) More...
 
class  RNaseDB
 Database for enzymes that digest RNA (RNases) More...
 
class  RNaseDigestion
 Class for the enzymatic digestion of RNAs. More...
 
class  SimpleTSGXLMS
 Generates theoretical spectra for cross-linked peptides. More...
 
class  SpectrumAnnotator
 Annotates spectra using identifications and theoretical spectra, or annotates identifications using spectra and theoretical spectrum matching, with various options. More...
 
class  TheoreticalSpectrumGenerator
 Generates theoretical spectra for peptides with various options. More...
 

Typedefs

using ModificationTag = std::variant< CvAccession, NamedMod, MassDelta, FormulaTag, GlycanComposition, InfoTag, PositionConstraint >
 Variant type representing any modification tag content.
 
using SequenceSection = std::variant< SequenceElement, AmbiguousRegion, ModifiedRange >
 Variant type representing a section of the sequence.
 
using GlobalModEntry = std::variant< IsotopeReplacement, GlobalModification >
 Variant type for global modification entries.
 
using ChargeState = std::variant< int, std::vector< AdductIon > >
 Charge state specification.
 

Enumerations

enum class  AASequenceConversionPolicy { FAIL_ON_LOSS , DROP_UNLOCALISED , BEST_EFFORT }
 Conversion policy for transforming Peptidoform to AASequence. More...
 
enum class  ConversionIssueType {
  UNRESOLVED_MOD , UNLOCALISED_MOD , LABILE_MOD , GLOBAL_MOD ,
  AMBIGUOUS_MOD , AMBIGUOUS_REGION , MODIFIED_RANGE , CROSS_LINK ,
  MULTIPLE_CHAINS , ALTERNATIVE_MODS , UNSUPPORTED_FEATURE
}
 Issue type for AASequence conversion problems. More...
 
enum class  ProFormaWriteMode { LOSSLESS , CANONICAL }
 Write mode for ProForma string serialization. More...
 
enum class  CvDatabase {
  UNIMOD , MOD , RESID , XLMOD ,
  GNO
}
 Controlled vocabulary database prefix for modification accessions. More...
 
enum class  ProFormaErrorCode {
  UNEXPECTED_CHARACTER , UNCLOSED_BRACKET , UNMATCHED_BRACKET , INVALID_CV_PREFIX ,
  INVALID_CV_ACCESSION , INVALID_AMINO_ACID , INVALID_MASS_VALUE , INVALID_FORMULA ,
  UNKNOWN_MONOSACCHARIDE , DANGLING_CROSSLINK_LABEL , EMPTY_SEQUENCE , INVALID_CHARGE ,
  INVALID_OCCURRENCE_SPECIFIER , UNEXPECTED_END_OF_INPUT , INTERNAL_ERROR
}
 Error codes for programmatic handling of ProForma parse errors. More...
 

Detailed Description


Class Documentation

◆ OpenMS::ConversionIssue

struct OpenMS::ConversionIssue

Description of a conversion issue from Peptidoform to AASequence.

Records problems encountered when attempting to convert a ProForma Peptidoform to an OpenMS AASequence representation.

Class Members
String description Human-readable description.
size_t position Position in sequence (SIZE_MAX if not position-specific)
ConversionIssueType type The type of issue.
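The SIZE_MAX sentinel for position can be handled as in the following minimal sketch. The types here are simplified stand-ins that mirror the documented members, not the OpenMS headers, and formatIssue is a hypothetical helper introduced only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

namespace sketch
{
  // Reduced stand-in for the documented enum (only three values shown).
  enum class ConversionIssueType { UNRESOLVED_MOD, UNLOCALISED_MOD, LABILE_MOD };

  // Stand-in mirroring the documented members of ConversionIssue.
  struct ConversionIssue
  {
    ConversionIssueType type;
    std::size_t position;     // SIZE_MAX if not position-specific
    std::string description;  // human-readable description
  };

  // Hypothetical helper: render an issue as a one-line message,
  // treating SIZE_MAX as "no specific position".
  inline std::string formatIssue(const ConversionIssue& issue)
  {
    if (issue.position == SIZE_MAX)
    {
      return issue.description + " (no specific position)";
    }
    return issue.description + " (position " + std::to_string(issue.position) + ")";
  }
}
```

Checking for the sentinel before printing avoids reporting a meaningless, very large index for issues such as labile or global modifications.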

◆ OpenMS::CvAccession

struct OpenMS::CvAccession

Controlled vocabulary accession for a modification.

Represents a modification specified by a CV accession number, e.g., UNIMOD:35 for Oxidation. The accession string contains only the identifier portion (e.g., "35" for UNIMOD:35).

Class Members
String accession The accession identifier (e.g., "35" for UNIMOD:35, full string for GNO)
CvDatabase database The source database (UNIMOD, MOD, RESID, XLMOD, or GNO)

◆ OpenMS::NamedMod

struct OpenMS::NamedMod

Named modification with optional CV prefix hint.

Represents a modification specified by name, optionally with a CV prefix hint to disambiguate which database to search (e.g., "U:Oxidation" for UniMod, "M:Oxidation" for PSI-MOD).

Class Members
optional< CvDatabase > cv_hint Optional CV prefix hint (U, M, R, X, G)
String name The modification name (e.g., "Oxidation", "Phospho")
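To illustrate how a one-letter CV prefix hint relates to the name, the sketch below splits a string such as "U:Oxidation" into hint and name. The types are simplified stand-ins, parseNamedMod is a hypothetical helper, and the real parser may treat edge cases differently.

```cpp
#include <optional>
#include <string>

namespace sketch
{
  enum class CvDatabase { UNIMOD, MOD, RESID, XLMOD, GNO };

  // Stand-in mirroring the documented members of NamedMod.
  struct NamedMod
  {
    std::string name;
    std::optional<CvDatabase> cv_hint;
  };

  // Hypothetical helper: split "U:Oxidation" into hint + name.
  // Text without a recognised one-letter prefix is kept as the name.
  inline NamedMod parseNamedMod(const std::string& text)
  {
    NamedMod mod{text, std::nullopt};
    if (text.size() > 2 && text[1] == ':')
    {
      std::optional<CvDatabase> hint;
      switch (text[0])
      {
        case 'U': hint = CvDatabase::UNIMOD; break;
        case 'M': hint = CvDatabase::MOD; break;
        case 'R': hint = CvDatabase::RESID; break;
        case 'X': hint = CvDatabase::XLMOD; break;
        case 'G': hint = CvDatabase::GNO; break;
        default: break;
      }
      if (hint)
      {
        mod.cv_hint = hint;
        mod.name = text.substr(2);
      }
    }
    return mod;
  }
}
```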

◆ OpenMS::FormulaTag

struct OpenMS::FormulaTag

Chemical formula with optional charge.

Represents a modification specified by chemical formula. The optional charge is specified via the :z+N suffix in ProForma (e.g., Formula:C12H20O2:z+2).

Class Members
optional< int > charge Optional charge from :z+N suffix.
String formula_string The chemical formula string (e.g., "C12H20O2")

◆ OpenMS::GlycanComposition

struct OpenMS::GlycanComposition

Glycan composition specification.

Represents a glycan modification as a composition of monosaccharides. Each component can be either a named monosaccharide (e.g., "Hex", "HexNAc") or a custom formula specification.

Example: Glycan:HexNAc1Hex2 -> [(HexNAc, 1), (Hex, 2)]

Class Members
typedef variant< String, FormulaTag > Monosaccharide A monosaccharide component: either a name (String) or a custom formula (FormulaTag)
Class Members
vector< pair< Monosaccharide, int > > components List of (monosaccharide, count) pairs.
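The HexNAc1Hex2 example can be built by hand as in this sketch. The structs are simplified stand-ins for the documented members, and totalUnits is a hypothetical helper that is not part of the documented interface.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <variant>
#include <vector>

namespace sketch
{
  // Stand-in for FormulaTag with the two documented members.
  struct FormulaTag
  {
    std::string formula_string;
    std::optional<int> charge;
  };

  // Stand-in for GlycanComposition.
  struct GlycanComposition
  {
    // Either a monosaccharide name or a custom formula.
    using Monosaccharide = std::variant<std::string, FormulaTag>;
    std::vector<std::pair<Monosaccharide, int>> components;
  };

  // Hypothetical helper: total number of monosaccharide units.
  inline int totalUnits(const GlycanComposition& glycan)
  {
    int total = 0;
    for (const auto& component : glycan.components)
    {
      total += component.second;
    }
    return total;
  }
}
```

For Glycan:HexNAc1Hex2, pushing ("HexNAc", 1) and ("Hex", 2) into components gives a total of three units.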

◆ OpenMS::InfoTag

struct OpenMS::InfoTag

Info tag for arbitrary text annotations.

Represents an INFO: tag in ProForma that carries arbitrary text metadata about a modification or site. Example: INFO:provenance_data

Class Members
String text The info text content.

◆ OpenMS::PositionConstraint

struct OpenMS::PositionConstraint

Position constraint specifying allowed residues for a modification.

Represents a Position: tag in ProForma that constrains where a modification can be localized. This is typically used as an alternative to a modification to indicate its possible sites.

Example: [Oxidation|Position:M] means Oxidation can only occur at M residues

Class Members
bool c_term = false True if modification can be at C-terminus.
bool n_term = false True if modification can be at N-terminus.
vector< char > residues List of allowed amino acid residues.
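A residue check against such a constraint could look like the sketch below. The struct is a stand-in with the documented members and defaults; allowsResidue is a hypothetical helper.

```cpp
#include <algorithm>
#include <vector>

namespace sketch
{
  // Stand-in mirroring the documented members and their defaults.
  struct PositionConstraint
  {
    std::vector<char> residues;
    bool n_term = false;
    bool c_term = false;
  };

  // Hypothetical helper: true if the amino acid is in the allowed list.
  inline bool allowsResidue(const PositionConstraint& constraint, char amino_acid)
  {
    return std::find(constraint.residues.begin(), constraint.residues.end(), amino_acid)
           != constraint.residues.end();
  }
}
```

With residues = {'M'}, as in [Oxidation|Position:M], the check accepts M and rejects every other residue.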

◆ OpenMS::Modification

struct OpenMS::Modification

A modification with one or more alternative tags.

In ProForma, a modification can have multiple alternatives separated by |, representing uncertainty about the exact modification. Each alternative consists of a tag and an optional label.

Example: K[Phospho|+79.97] has two alternatives

The resolved_mod field is populated by resolveModifications() and points to the ResidueModification in ModificationsDB (for the first/primary alternative).

Class Members
vector< pair< ModificationTag, optional< Label > > > alternatives Each alternative is a (tag, optional_label) pair.
const ResidueModification * resolved_mod = nullptr Resolved modification pointer (populated by resolveModifications()); points to the ResidueModification for the first alternative, if found.

◆ OpenMS::SequenceElement

struct OpenMS::SequenceElement

A single amino acid with its modifications.

Represents one position in the peptide sequence: the amino acid residue and zero or more modifications attached to it.

Class Members
char amino_acid Single-letter amino acid code (A-Z)
vector< Modification > modifications Modifications at this position.

◆ OpenMS::AmbiguousRegion

struct OpenMS::AmbiguousRegion

Ambiguous amino acid region.

Represents a region where the amino acid sequence is uncertain. ProForma notation: (?DQ) means either D or Q at this position.

Class Members
vector< SequenceElement > elements The ambiguous amino acid possibilities.

◆ OpenMS::ModifiedRange

struct OpenMS::ModifiedRange

Modified sequence range with shared modifications.

Represents a subsequence where one or more modifications apply to the entire range, but the exact position is uncertain.

ProForma notation: (EOSFORMS)[+19.0523] means +19.0523 applies somewhere in EOSFORMS

Class Members
vector< SequenceElement > elements The amino acids in the range.
vector< Modification > modifications Modifications applying to the entire range.

◆ OpenMS::UnlocalisedMod

struct OpenMS::UnlocalisedMod

Unlocalised modification with optional occurrence count.

Represents a modification that is known to exist on the peptide but whose exact position is unknown. The occurrence specifies how many instances of this modification are present.

ProForma notation: [Phospho]?PEPTIDE or [Phospho]^2?PEPTIDE

Class Members
vector< Modification > modifications The unlocalised modification(s)
optional< int > occurrence Optional occurrence count from ^N suffix.

◆ OpenMS::LabileModification

struct OpenMS::LabileModification

Labile modification that may be lost during fragmentation.

Labile modifications are typically lost during ionization or fragmentation and thus may not be observed in MS/MS spectra.

ProForma notation: {Glycan:Hex}PEPTIDE

Class Members
Modification modification The labile modification.

◆ OpenMS::GlobalModification

struct OpenMS::GlobalModification

Global modification applied to specific locations.

A global modification applies the same modification to all occurrences of specified residues or termini throughout the peptide.

ProForma notation: <[TMT6plex]@K,N-term>

Class Members
vector< String > locations Target locations ("K", "N-term", "C-term:K", etc.)
Modification modification The modification to apply.

◆ OpenMS::IsotopeReplacement

struct OpenMS::IsotopeReplacement

Isotope replacement for stable isotope labeling.

Represents global replacement of an element with a specific isotope, used for stable isotope labeling experiments.

ProForma notation: <13C> or <15N> or <D>

Class Members
String isotope The isotope specification (e.g., "13C", "15N", "D")

◆ OpenMS::AdductIon

struct OpenMS::AdductIon

Adduct ion specification for charge state.

Represents an adduct ion contributing to the charge state of a peptidoform ion. Multiple adducts can combine to give the total charge.

ProForma notation: Na:z+1 in /[Na:z+1,H:z+1]

Class Members
int charge The charge contribution of this adduct.
String formula The adduct formula (e.g., "Na", "H", "K")
optional< int > occurrence Optional occurrence count from ^N suffix.

◆ OpenMS::Peptidoform

struct OpenMS::Peptidoform

A single peptidoform (one peptide chain)

Represents a complete peptide chain including:

  • Optional name identifier (from v2.1 extension)
  • Global modifications (<13C>, <[TMT6plex]@K>)
  • Unlocalised modifications ([Phospho]?)
  • Labile modifications ({Glycan:Hex})
  • N-terminal modifications ([Acetyl]-)
  • The amino acid sequence with modifications
  • C-terminal modifications (-[Amidated])
Class Members
vector< Modification > c_term_mods C-terminal modifications: -[Amidated].
optional< ChargeState > charge Optional per-chain charge (for chimeric spectra)
vector< GlobalModEntry > global_mods Global modifications: <13C>, <[TMT6plex]@K>
vector< LabileModification > labile_mods Labile modifications: {Glycan:Hex}.
vector< Modification > n_term_mods N-terminal modifications: [Acetyl]-.
optional< String > name Optional name from (>name) v2.1 extension.
vector< SequenceSection > sequence The sequence with modifications.
vector< UnlocalisedMod > unlocalised_mods Unlocalised modifications: [Phospho]?

◆ OpenMS::PeptidoformIon

struct OpenMS::PeptidoformIon

A peptidoform ion (one or more chains with optional charge)

Represents one or more peptide chains that form a single ion species. Multiple chains can be present in cross-linked or multi-chain entities.

ProForma notation: cross-linked chains are separated by //; chimeric chains are separated by + (see is_chimeric)

Class Members
vector< Peptidoform > chains One or more peptide chains (separated by // or + in ProForma)
optional< ChargeState > charge Optional charge state specification.
bool is_chimeric = false True if chains are chimeric (+ separator), false if cross-linked (//)
optional< String > name Optional name from (>>name) v2.1 extension.

◆ OpenMS::CrossLinkGroup

struct OpenMS::CrossLinkGroup

Cross-link group connecting sites across chains.

Groups together all sites that share a cross-link label. Each site is identified by its chain index and position within that chain.

Derived during parsing from matching #XL labels.

Class Members
String label The cross-link label (e.g., XL1)
vector< pair< size_t, size_t > > sites (chain_index, site_index) pairs
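Grouping sites by shared label can be sketched as follows. LabelledSite and groupByLabel are hypothetical names introduced for illustration; the actual parser may collect labels in a different way.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace sketch
{
  // Stand-in mirroring the documented members of CrossLinkGroup.
  struct CrossLinkGroup
  {
    std::string label;
    std::vector<std::pair<std::size_t, std::size_t>> sites;  // (chain_index, site_index)
  };

  // Hypothetical record of one labelled site as a parser might see it.
  struct LabelledSite
  {
    std::string label;
    std::size_t chain_index;
    std::size_t site_index;
  };

  // Collect all sites that share a label into one group, ordered by label.
  inline std::vector<CrossLinkGroup> groupByLabel(const std::vector<LabelledSite>& sites)
  {
    std::map<std::string, CrossLinkGroup> by_label;
    for (const auto& site : sites)
    {
      CrossLinkGroup& group = by_label[site.label];
      group.label = site.label;
      group.sites.emplace_back(site.chain_index, site.site_index);
    }
    std::vector<CrossLinkGroup> groups;
    for (auto& entry : by_label)
    {
      groups.push_back(std::move(entry.second));
    }
    return groups;
  }
}
```

A group that ends up with a single site would correspond to a label without a partner, which is the situation the DANGLING_CROSSLINK_LABEL error code describes.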

Typedef Documentation

◆ ChargeState

using ChargeState = std::variant< int, std::vector<AdductIon> >

Charge state specification.

The charge state can be specified as either:

  • A simple integer charge (/2, /+2, /-1)
  • A list of adduct ions (/[Na:z+1,H:z+1])
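Both alternatives can be reduced to a net charge with std::visit, as in this sketch. AdductIon is a stand-in with the documented members, totalCharge is a hypothetical helper, and treating a missing occurrence as 1 is an assumption.

```cpp
#include <optional>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

namespace sketch
{
  // Stand-in mirroring the documented members of AdductIon.
  struct AdductIon
  {
    std::string formula;
    int charge;
    std::optional<int> occurrence;
  };

  using ChargeState = std::variant<int, std::vector<AdductIon>>;

  // Hypothetical helper: net charge of either alternative.
  // Assumption: an adduct without an occurrence count is present once.
  inline int totalCharge(const ChargeState& charge_state)
  {
    return std::visit([](const auto& value) -> int {
      using T = std::decay_t<decltype(value)>;
      if constexpr (std::is_same_v<T, int>)
      {
        return value;  // simple integer charge
      }
      else
      {
        int sum = 0;
        for (const AdductIon& adduct : value)
        {
          sum += adduct.charge * adduct.occurrence.value_or(1);
        }
        return sum;
      }
    }, charge_state);
  }
}
```

Under these assumptions, /2 and /[Na:z+1,H:z+1] both yield a net charge of 2.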

◆ GlobalModEntry

Variant type for global modification entries.

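A GlobalModEntry can be either:

  • IsotopeReplacement: Isotope replacement for stable isotope labeling (e.g., <13C>)
  • GlobalModification: Global modification applied to specific locations (e.g., <[TMT6plex]@K>)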

◆ ModificationTag

Variant type representing any modification tag content.

A ModificationTag can be one of:

  • CvAccession: CV database accession (e.g., UNIMOD:35)
  • NamedMod: Named modification with optional CV hint (e.g., Oxidation, U:Oxidation)
  • MassDelta: Mass difference with optional source (e.g., +15.9949, Obs:+79.978)
  • FormulaTag: Chemical formula (e.g., Formula:C12H20O2)
  • GlycanComposition: Glycan composition (e.g., Glycan:HexNAc1Hex2)
  • InfoTag: Info annotation (e.g., INFO:comment)
  • PositionConstraint: Allowed residue positions (e.g., Position:MKC)
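Dispatching on the active alternative is typically done with std::visit. The sketch below uses reduced stand-in structs (their fields are assumptions, in particular for MassDelta, whose members are not listed on this page) and a hypothetical kindOf helper.

```cpp
#include <string>
#include <variant>
#include <vector>

namespace sketch
{
  // Reduced stand-ins; field choices are assumptions for illustration.
  struct CvAccession { std::string accession; };
  struct NamedMod { std::string name; };
  struct MassDelta { double mass; };
  struct FormulaTag { std::string formula_string; };
  struct GlycanComposition { std::string text; };
  struct InfoTag { std::string text; };
  struct PositionConstraint { std::vector<char> residues; };

  using ModificationTag = std::variant<CvAccession, NamedMod, MassDelta, FormulaTag,
                                       GlycanComposition, InfoTag, PositionConstraint>;

  // Common C++17 idiom: merge several lambdas into one visitor.
  template <class... Ts> struct Overloaded : Ts... { using Ts::operator()...; };
  template <class... Ts> Overloaded(Ts...) -> Overloaded<Ts...>;

  // Hypothetical helper: name the kind of tag held by the variant.
  inline std::string kindOf(const ModificationTag& tag)
  {
    return std::visit(Overloaded{
      [](const CvAccession&) { return std::string("accession"); },
      [](const NamedMod&) { return std::string("name"); },
      [](const MassDelta&) { return std::string("mass delta"); },
      [](const FormulaTag&) { return std::string("formula"); },
      [](const GlycanComposition&) { return std::string("glycan"); },
      [](const InfoTag&) { return std::string("info"); },
      [](const PositionConstraint&) { return std::string("position") ; }
    }, tag);
  }
}
```

Because the visitor must cover every alternative, adding a new tag type to the variant makes the compiler flag any handler that was not updated.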

◆ SequenceSection

Variant type representing a section of the sequence.

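A SequenceSection can be:

  • SequenceElement: A single amino acid with its modifications
  • AmbiguousRegion: Ambiguous amino acid region (e.g., (?DQ))
  • ModifiedRange: Modified sequence range with shared modifications (e.g., (EOSFORMS)[+19.0523])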

Enumeration Type Documentation

◆ AASequenceConversionPolicy

enum class AASequenceConversionPolicy

Conversion policy for transforming Peptidoform to AASequence.

Controls how the conversion handles modifications that cannot be directly represented in AASequence (e.g., unlocalised, labile, or ambiguous modifications).

Enumerator
FAIL_ON_LOSS 

Fail if any modification cannot be fully represented.

DROP_UNLOCALISED 

Drop unlocalised, labile, and global modifications.

BEST_EFFORT 

Try to convert as much as possible, skip unsupported.
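One way to read the three policies is as a rule that decides whether a given issue aborts the conversion. The sketch below is an interpretation of the descriptions above, not the OpenMS implementation. isFatal is a hypothetical helper, and the assumption that DROP_UNLOCALISED still fails on every other issue type is not confirmed by this page.

```cpp
namespace sketch
{
  enum class AASequenceConversionPolicy { FAIL_ON_LOSS, DROP_UNLOCALISED, BEST_EFFORT };

  enum class ConversionIssueType
  {
    UNRESOLVED_MOD, UNLOCALISED_MOD, LABILE_MOD, GLOBAL_MOD,
    AMBIGUOUS_MOD, AMBIGUOUS_REGION, MODIFIED_RANGE, CROSS_LINK,
    MULTIPLE_CHAINS, ALTERNATIVE_MODS, UNSUPPORTED_FEATURE
  };

  // Hypothetical helper: does this issue abort the conversion under the policy?
  inline bool isFatal(AASequenceConversionPolicy policy, ConversionIssueType issue)
  {
    switch (policy)
    {
      case AASequenceConversionPolicy::FAIL_ON_LOSS:
        return true;  // any loss of information is an error
      case AASequenceConversionPolicy::DROP_UNLOCALISED:
        // Unlocalised, labile, and global modifications are dropped silently.
        // Assumption: all other issue types still fail.
        return !(issue == ConversionIssueType::UNLOCALISED_MOD
                 || issue == ConversionIssueType::LABILE_MOD
                 || issue == ConversionIssueType::GLOBAL_MOD);
      case AASequenceConversionPolicy::BEST_EFFORT:
        return false;  // skip whatever cannot be represented
    }
    return true;  // unreachable for valid enum values
  }
}
```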

◆ ConversionIssueType

enum class ConversionIssueType

Issue type for AASequence conversion problems.

Enumerator
UNRESOLVED_MOD 

Modification could not be found in ModificationsDB.

UNLOCALISED_MOD 

Modification has no specific position.

LABILE_MOD 

Labile modification (lost during fragmentation)

GLOBAL_MOD 

Global modification (applies to multiple sites)

AMBIGUOUS_MOD 

Ambiguously localized modification.

AMBIGUOUS_REGION 

Ambiguous amino acid region.

MODIFIED_RANGE 

Modified range (position uncertain)

CROSS_LINK 

Cross-link between chains.

MULTIPLE_CHAINS 

Multiple peptide chains.

ALTERNATIVE_MODS 

Multiple alternative modifications (|)

UNSUPPORTED_FEATURE 

Other unsupported ProForma feature.

◆ CvDatabase

enum class CvDatabase

Controlled vocabulary database prefix for modification accessions.

Identifies the source database for a modification accession in ProForma notation. Examples: UNIMOD:35, MOD:00046, XLMOD:02001, GNO:G59626AS

Enumerator
UNIMOD 

UniMod database (https://www.unimod.org/)

MOD 

PSI-MOD ontology (https://www.ebi.ac.uk/ols/ontologies/mod)

RESID 

RESID database.

XLMOD 

Cross-linking modifications ontology.

GNO 

Glycan naming ontology.

◆ ProFormaErrorCode

enum class ProFormaErrorCode

Error codes for programmatic handling of ProForma parse errors.

These error codes provide machine-readable categorization of parsing failures, enabling downstream code to handle specific error types appropriately.

Enumerator
UNEXPECTED_CHARACTER 

Unexpected character encountered during parsing.

UNCLOSED_BRACKET 

Opening bracket without matching close bracket.

UNMATCHED_BRACKET 

Closing bracket without matching open bracket.

INVALID_CV_PREFIX 

Invalid controlled vocabulary prefix (e.g., not UNIMOD, MOD, etc.)

INVALID_CV_ACCESSION 

Invalid CV accession number format.

INVALID_AMINO_ACID 

Invalid amino acid one-letter code.

INVALID_MASS_VALUE 

Invalid mass value format or value.

INVALID_FORMULA 

Invalid chemical formula.

UNKNOWN_MONOSACCHARIDE 

Unknown monosaccharide abbreviation.

DANGLING_CROSSLINK_LABEL 

Crosslink label without a matching partner.

EMPTY_SEQUENCE 

Empty sequence provided.

INVALID_CHARGE 

Invalid charge state specification.

INVALID_OCCURRENCE_SPECIFIER 

Invalid occurrence specifier (e.g., ^2)

UNEXPECTED_END_OF_INPUT 

Unexpected end of input string.

INTERNAL_ERROR 

Internal parser error (should not occur)
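The difference between UNCLOSED_BRACKET and UNMATCHED_BRACKET can be illustrated with a small stack-based check. This is a standalone sketch, not the ProFormaParser logic: checkBrackets is a hypothetical function, it only looks at [], (), and {}, and reporting a closing bracket of the wrong kind as UNMATCHED_BRACKET is an assumption.

```cpp
#include <optional>
#include <string>
#include <vector>

namespace sketch
{
  // Reduced stand-in with only the two codes used here.
  enum class ProFormaErrorCode { UNCLOSED_BRACKET, UNMATCHED_BRACKET };

  // Hypothetical check: returns an error code, or nullopt if brackets balance.
  inline std::optional<ProFormaErrorCode> checkBrackets(const std::string& input)
  {
    std::vector<char> expected;  // stack of closing characters still owed
    for (char c : input)
    {
      switch (c)
      {
        case '[': expected.push_back(']'); break;
        case '(': expected.push_back(')'); break;
        case '{': expected.push_back('}'); break;
        case ']':
        case ')':
        case '}':
          // A close with nothing open, or of the wrong kind, is unmatched.
          if (expected.empty() || expected.back() != c)
          {
            return ProFormaErrorCode::UNMATCHED_BRACKET;
          }
          expected.pop_back();
          break;
        default:
          break;
      }
    }
    if (!expected.empty())
    {
      return ProFormaErrorCode::UNCLOSED_BRACKET;  // input ended with brackets open
    }
    return std::nullopt;
  }
}
```

Returning a code rather than a message lets calling code branch on the failure category, which is the use the enum description points to.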

◆ ProFormaWriteMode

enum class ProFormaWriteMode

Write mode for ProForma string serialization.

Controls whether the output preserves original formatting (LOSSLESS) or produces a normalized, deterministic output (CANONICAL).

Enumerator
LOSSLESS 

Preserve original spelling/formatting where possible (e.g., mass delta text)

CANONICAL 

Normalized output: uppercase CV prefixes, sorted mods, 4 decimal places for masses.
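Two of the CANONICAL normalizations, uppercase CV prefixes and four decimal places for masses, can be sketched as below. Both functions are hypothetical and do not reflect the ProFormaWriter implementation; writing an explicit sign on mass deltas is an assumption based on how mass deltas appear in the examples on this page.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

namespace sketch
{
  // Hypothetical helper: format a mass delta with an explicit sign
  // and exactly four decimal places.
  inline std::string canonicalMass(double delta)
  {
    char buffer[64];
    std::snprintf(buffer, sizeof(buffer), "%+.4f", delta);
    return std::string(buffer);
  }

  // Hypothetical helper: uppercase a CV prefix such as "unimod".
  inline std::string canonicalPrefix(std::string prefix)
  {
    for (char& c : prefix)
    {
      c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return prefix;
  }
}
```

A fixed precision makes the output deterministic, so two peptidoforms that differ only in how a mass was typed serialize to the same string. LOSSLESS mode would instead keep the original mass text.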