OpenMS
ModifiedNASequenceGenerator Class Reference

Generator of modified nucleic-acid sequences for fixed and variable modification placement.

#include <OpenMS/CHEMISTRY/ModifiedNASequenceGenerator.h>

Public Types

using ConstRibonucleotidePtr = const Ribonucleotide *
 

Static Public Member Functions

static void applyFixedModifications (const std::set< ConstRibonucleotidePtr > &fixed_mods, NASequence &sequence)
 Apply all compatible fixed modifications from fixed_mods to sequence in place.
 
static void applyVariableModifications (const std::set< ConstRibonucleotidePtr > &var_mods, const NASequence &seq, Size max_variable_mods_per_NASequence, std::vector< NASequence > &all_modified_NASequences, bool keep_original=true)
 Enumerate all sequence variants obtained by combinatorially placing up to max_variable_mods_per_NASequence variable modifications.
 

Static Protected Member Functions

static void recurseAndGenerateVariableModifiedSequences_ (const std::vector< int > &subset_indices, const std::map< int, std::vector< ConstRibonucleotidePtr > > &map_compatibility, int depth, const NASequence &current_NASequence, std::vector< NASequence > &modified_NASequences)
 Recursive helper for applyVariableModifications — enumerate every assignment of compatible modifications to a fixed subset of sequence sites.
 
static void applyAtMostOneVariableModification_ (const std::set< ConstRibonucleotidePtr > &var_mods, const NASequence &seq, std::vector< NASequence > &all_modified_NASequences, bool keep_original=true)
 Fast specialisation of variable-modification placement for at most one modification per sequence.
 

Detailed Description

Generator of modified nucleic-acid sequences for fixed and variable modification placement.

Counterpart of ModifiedPeptideGenerator for NASequence. The class is a namespace in disguise: all entry points are static and operate either on a single sequence in place (fixed modifications) or by enumerating every legal combination of placements (variable modifications) into a caller-provided output vector.

Compatibility of a modification with a sequence position is determined by:

  • The residue's one-letter code matching the modification's origin character.
  • The modification's term-specificity (FIVE_PRIME / THREE_PRIME / ANYWHERE) being satisfied: ANYWHERE applies to any internal nucleotide, FIVE_PRIME / THREE_PRIME only at the corresponding sequence terminus.
  • The residue / terminal slot not already carrying a modification — pre-existing modifications are preserved and never overwritten.
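
The three conditions above can be read as a pair of predicates. The following is a minimal sketch over simplified stand-in types: `Mod`, `Seq`, `TermSpec`, `compatibleAtResidue` and `compatibleAtTerminus` are hypothetical names used for illustration and are not part of the OpenMS API. It assumes that ANYWHERE modifications may target every nucleotide position and that terminal modifications are not matched against an origin character; this page does not confirm either assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for Ribonucleotide and NASequence (illustration only).
enum class TermSpec { ANYWHERE, FIVE_PRIME, THREE_PRIME };

struct Mod
{
  std::string code;    // illustrative short name of the modified nucleotide
  char origin;         // one-letter code of the unmodified nucleotide
  TermSpec term_spec;  // where the modification may be placed
};

struct Seq
{
  std::vector<std::string> residues;  // a one-character entry means "unmodified"
  std::string five_prime;             // empty string means "slot is free"
  std::string three_prime;            // empty string means "slot is free"
};

// Internal placement at position pos.
bool compatibleAtResidue(const Mod& mod, const Seq& seq, std::size_t pos)
{
  assert(pos < seq.residues.size());
  const std::string& residue = seq.residues[pos];
  const bool unmodified = (residue.size() == 1);
  return mod.term_spec == TermSpec::ANYWHERE  // term-specificity satisfied
         && unmodified                        // site not already modified
         && residue[0] == mod.origin;         // one-letter code matches origin
}

// Terminal placement (assumption: no origin check at the termini).
bool compatibleAtTerminus(const Mod& mod, const Seq& seq)
{
  switch (mod.term_spec)
  {
    case TermSpec::FIVE_PRIME:  return seq.five_prime.empty();
    case TermSpec::THREE_PRIME: return seq.three_prime.empty();
    default:                    return false;  // ANYWHERE is not a terminal modification
  }
}
```

In this model a pre-existing modification makes the predicate return false, which is how "preserved and never overwritten" is expressed.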

Member Typedef Documentation

◆ ConstRibonucleotidePtr

using ConstRibonucleotidePtr = const Ribonucleotide *

Member Function Documentation

◆ applyAtMostOneVariableModification_()

static void applyAtMostOneVariableModification_ ( const std::set< ConstRibonucleotidePtr > &  var_mods,
const NASequence &  seq,
std::vector< NASequence > &  all_modified_NASequences,
bool  keep_original = true 
)
static protected

Fast specialisation of variable-modification placement for at most one modification per sequence.

Emits one variant per (site, compatible modification) pair: no combinations are enumerated. Already-modified residues are skipped. The unmodified sequence is emitted first when keep_original is true.
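
The behaviour described above can be sketched with a simplified model. `Seq`, `Mod` and `atMostOneVariableMod` are hypothetical names and not OpenMS types; terminal modifications are left out for brevity, and the order of the modified variants is a property of this sketch only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified model (not the OpenMS types): a sequence is a list of
// residue codes; a one-character code means the residue is unmodified.
using Seq = std::vector<std::string>;

struct Mod
{
  std::string code;  // illustrative short name of the modified nucleotide
  char origin;       // one-letter code of the nucleotide it replaces
};

// Append one variant per (site, compatible modification) pair.
void atMostOneVariableMod(const std::vector<Mod>& var_mods, const Seq& seq,
                          std::vector<Seq>& out, bool keep_original = true)
{
  if (keep_original) out.push_back(seq);  // unmodified sequence comes first

  for (std::size_t i = 0; i < seq.size(); ++i)
  {
    if (seq[i].size() != 1) continue;  // already modified: skip this site
    for (const Mod& mod : var_mods)
    {
      assert(mod.code.size() > 1);            // model invariant for modified codes
      if (seq[i][0] != mod.origin) continue;  // origin does not match
      Seq variant = seq;                      // copy, then change a single site
      variant[i] = mod.code;
      out.push_back(std::move(variant));
    }
  }
}
```

Because no combinations are formed, the number of modified variants in this model is the sum over all unmodified sites of the number of modifications compatible with each site.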

Parameters
[in]   var_mods                  Variable modifications to consider.
[in]   seq                       Source sequence; not modified.
[out]  all_modified_NASequences  Generated variants are appended here.
[in]   keep_original             If true, also emit seq unchanged before any modified variants.

◆ applyFixedModifications()

static void applyFixedModifications ( const std::set< ConstRibonucleotidePtr > &  fixed_mods,
NASequence &  sequence 
)
static

Apply all compatible fixed modifications from fixed_mods to sequence in place.

Two passes:

  • Terminal modifications (FIVE_PRIME / THREE_PRIME) are installed on the corresponding terminal slot only when that slot is still empty. Pre-existing terminal modifications are preserved.
  • Each nucleotide is then visited: residues already carrying a modification are skipped, and residues whose one-letter code matches the modification's origin receive that modification provided its term-specificity is ANYWHERE.

When multiple fixed modifications match the same residue, the iteration order of fixed_mods (a pointer-keyed std::set) decides which placement persists.
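
A minimal sketch of the two passes, using hypothetical stand-in types (`Mod`, `Seq`, `TermSpec`, `applyFixed`) rather than the OpenMS classes. A `std::vector` replaces the pointer-keyed `std::set` so that the iteration order is explicit, and the sketch assumes that the first matching modification in that order is the one that persists; the text above only states that the order decides.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for Ribonucleotide and NASequence (illustration only).
enum class TermSpec { ANYWHERE, FIVE_PRIME, THREE_PRIME };

struct Mod
{
  std::string code;    // illustrative short name
  char origin;         // one-letter code of the unmodified nucleotide
  TermSpec term_spec;
};

struct Seq
{
  std::vector<std::string> residues;  // a one-character entry means "unmodified"
  std::string five_prime;             // empty string means "slot is free"
  std::string three_prime;            // empty string means "slot is free"
};

void applyFixed(const std::vector<Mod>& fixed_mods, Seq& seq)
{
  // Pass 1: terminal slots are filled only while they are still empty.
  for (const Mod& mod : fixed_mods)
  {
    assert(mod.code.size() > 1);  // model invariant for modified codes
    if (mod.term_spec == TermSpec::FIVE_PRIME && seq.five_prime.empty())
    {
      seq.five_prime = mod.code;
    }
    else if (mod.term_spec == TermSpec::THREE_PRIME && seq.three_prime.empty())
    {
      seq.three_prime = mod.code;
    }
  }

  // Pass 2: visit every nucleotide and skip those already modified.
  for (std::string& residue : seq.residues)
  {
    if (residue.size() != 1) continue;
    for (const Mod& mod : fixed_mods)
    {
      if (mod.term_spec == TermSpec::ANYWHERE && residue[0] == mod.origin)
      {
        residue = mod.code;
        break;  // assumption of this sketch: first match in iteration order wins
      }
    }
  }
}
```

With a pointer-keyed set the iteration order follows pointer values, so callers who need a deterministic outcome may prefer to pass at most one fixed modification per origin character.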

Parameters
[in]      fixed_mods  Fixed modifications to apply.
[in,out]  sequence    Sequence modified in place.

Referenced by NucleicAcidSearchEngine::main_().

◆ applyVariableModifications()

static void applyVariableModifications ( const std::set< ConstRibonucleotidePtr > &  var_mods,
const NASequence &  seq,
Size  max_variable_mods_per_NASequence,
std::vector< NASequence > &  all_modified_NASequences,
bool  keep_original = true 
)
static

Enumerate all sequence variants obtained by combinatorially placing up to max_variable_mods_per_NASequence variable modifications.

Produces every legal combination of compatible variable modifications drawn from var_mods, with the number of placements per variant ranging from 1 to min(max_variable_mods_per_NASequence, N) where N is the number of compatible sites. 5' / 3' terminal modifications participate in the enumeration as virtual sites and are considered only when the corresponding terminal slot of seq is empty; nucleotides that are already modified are skipped during compatibility analysis.

The output all_modified_NASequences is appended to (existing entries are preserved). The unmodified seq is included in the appended block iff keep_original is true. If var_mods is empty, max_variable_mods_per_NASequence is 0, or no compatible sites are found, only the unmodified sequence is appended (and only when keep_original is true). A fast specialisation (applyAtMostOneVariableModification_) is used when max_variable_mods_per_NASequence equals 1.
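
Since the enumeration is combinatorial, it can be useful to estimate the output size before calling. The sketch below counts the variants implied by the description above, assuming that every selected site receives exactly one modification and that duplicates are not removed. `countVariants` and its parameters are hypothetical and not part of OpenMS; the input lists, for each compatible site (terminal slots included), how many variable modifications fit there.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// compatible_per_site[i]: number of variable modifications compatible with site i.
// Returns the number of sequences appended under the assumptions stated above.
std::uint64_t countVariants(const std::vector<std::uint64_t>& compatible_per_site,
                            std::size_t max_mods, bool keep_original = true)
{
  const std::size_t k_max = std::min(max_mods, compatible_per_site.size());

  // ways[j]: number of ways to modify exactly j of the sites seen so far.
  std::vector<std::uint64_t> ways(k_max + 1, 0);
  ways[0] = 1;
  for (std::uint64_t c : compatible_per_site)
  {
    assert(c > 0);  // only sites with at least one compatible modification count
    // Go downwards so that each site contributes at most once to every term.
    for (std::size_t j = k_max; j >= 1; --j)
    {
      ways[j] += ways[j - 1] * c;  // may overflow for very large inputs
    }
  }

  std::uint64_t total = keep_original ? 1 : 0;
  for (std::size_t j = 1; j <= k_max; ++j) total += ways[j];
  return total;
}
```

For three sites with 2, 1 and 2 compatible modifications and a limit of 2, this gives 5 single placements plus 8 double placements, so 14 sequences including the original.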

Parameters
[in]   var_mods                          Variable modifications to consider.
[in]   seq                               Source sequence; not modified.
[in]   max_variable_mods_per_NASequence  Maximum number of variable modifications per variant; 0 disables enumeration.
[out]  all_modified_NASequences          Generated variants are appended here.
[in]   keep_original                     If true, also emit seq unchanged.

Referenced by NucleicAcidSearchEngine::main_().

◆ recurseAndGenerateVariableModifiedSequences_()

static void recurseAndGenerateVariableModifiedSequences_ ( const std::vector< int > &  subset_indices,
const std::map< int, std::vector< ConstRibonucleotidePtr > > &  map_compatibility,
int  depth,
const NASequence &  current_NASequence,
std::vector< NASequence > &  modified_NASequences 
)
static protected

Recursive helper for applyVariableModifications — enumerate every assignment of compatible modifications to a fixed subset of sequence sites.

For the position at subset_indices[depth], iterate over every modification listed in map_compatibility for that position, apply it to a copy of current_NASequence, and recurse with depth+1. When depth reaches subset_indices.size() the current sequence is appended to modified_NASequences. Sentinel indices -1 / -2 select the 5' / 3' terminal slot instead of an internal residue.
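
The recursion described above can be sketched as follows. The types and the name `enumerateAssignments` are hypothetical stand-ins, modification codes are plain strings, and `depth` is a `std::size_t` here rather than the `int` of the real signature; only the sentinel values -1 and -2 are taken from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for NASequence (illustration only).
struct Seq
{
  std::vector<std::string> residues;
  std::string five_prime;   // empty string means "slot is free"
  std::string three_prime;  // empty string means "slot is free"
};

// Site index (or sentinel -1 / -2) mapped to the codes that may be placed there.
using Compat = std::map<int, std::vector<std::string>>;

void enumerateAssignments(const std::vector<int>& subset_indices, const Compat& compat,
                          std::size_t depth, const Seq& current, std::vector<Seq>& out)
{
  assert(depth <= subset_indices.size());
  if (depth == subset_indices.size())
  {
    out.push_back(current);  // every selected site has been assigned
    return;
  }

  const int site = subset_indices[depth];
  for (const std::string& mod : compat.at(site))  // throws if the site is unknown
  {
    Seq next = current;  // work on a copy so sibling branches stay independent
    if (site == -1)      next.five_prime = mod;   // 5' sentinel
    else if (site == -2) next.three_prime = mod;  // 3' sentinel
    else                 next.residues[static_cast<std::size_t>(site)] = mod;
    enumerateAssignments(subset_indices, compat, depth + 1, next, out);
  }
}
```

Each call adds one variant per element of the Cartesian product of the per-site lists, so a subset of two sites with two compatible modifications each yields four variants.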

Parameters
[in]      subset_indices        Selected positions (and 5'/3' sentinels) at which to place modifications.
[in]      map_compatibility     Per-position list of compatible modifications.
[in]      depth                 Current recursion depth (caller passes 0).
[in]      current_NASequence    Sequence accumulated so far.
[in,out]  modified_NASequences  Output vector to which fully enumerated variants are appended.