OpenMS
Generator of modified nucleic-acid sequences for fixed and variable modification placement.
#include <OpenMS/CHEMISTRY/ModifiedNASequenceGenerator.h>
Public Types | |
| using | ConstRibonucleotidePtr = const Ribonucleotide * |
Static Public Member Functions | |
| static void | applyFixedModifications (const std::set< ConstRibonucleotidePtr > &fixed_mods, NASequence &sequence) |
Apply all compatible fixed modifications from fixed_mods to sequence in place. | |
| static void | applyVariableModifications (const std::set< ConstRibonucleotidePtr > &var_mods, const NASequence &seq, Size max_variable_mods_per_NASequence, std::vector< NASequence > &all_modified_NASequences, bool keep_original=true) |
Enumerate all sequence variants obtained by combinatorially placing up to max_variable_mods_per_NASequence variable modifications. | |
Static Protected Member Functions | |
| static void | recurseAndGenerateVariableModifiedSequences_ (const std::vector< int > &subset_indices, const std::map< int, std::vector< ConstRibonucleotidePtr > > &map_compatibility, int depth, const NASequence &current_NASequence, std::vector< NASequence > &modified_NASequences) |
Recursive helper for applyVariableModifications: enumerates every assignment of compatible modifications to a fixed subset of sequence sites. | |
| static void | applyAtMostOneVariableModification_ (const std::set< ConstRibonucleotidePtr > &var_mods, const NASequence &seq, std::vector< NASequence > &all_modified_NASequences, bool keep_original=true) |
Fast specialisation of variable-modification placement for at most one modification per sequence. | |
Generator of modified nucleic-acid sequences for fixed and variable modification placement.
Counterpart of ModifiedPeptideGenerator for NASequence. The class is a namespace in disguise: all entry points are static and operate either on a single sequence in place (fixed modifications) or by enumerating every legal combination of placements (variable modifications) into a caller-provided output vector.
Compatibility of a modification with a sequence position is determined by:
- the residue at that position matching the modification's origin character.
- the modification's term-specificity (FIVE_PRIME / THREE_PRIME / ANYWHERE) being satisfied: ANYWHERE applies to any internal nucleotide, FIVE_PRIME / THREE_PRIME only at the corresponding sequence terminus.

| using ConstRibonucleotidePtr = const Ribonucleotide* |
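The compatibility rule stated in the class description can be sketched as a small predicate. This is a toy model, not OpenMS code: `ToyMod`, `TermSpec` and `isCompatible` are hypothetical names, the sequence is a plain string, and the site convention (-1 for the 5' slot, -2 for the 3' slot) is borrowed from the recursive helper documented on this page. Skipping the origin check for terminal slots is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy stand-ins for illustration only; these are not the OpenMS types.
enum class TermSpec { ANYWHERE, FIVE_PRIME, THREE_PRIME };

struct ToyMod
{
  std::string code;    // illustrative label, e.g. "m6A"
  char origin;         // unmodified nucleotide the modification derives from
  TermSpec term_spec;  // where the modification may be placed
};

// site == -1 selects the 5' terminal slot, site == -2 the 3' terminal slot,
// site >= 0 selects the residue at that index of seq.
bool isCompatible(const ToyMod& mod, const std::string& seq, int site)
{
  // Assumption of this sketch: terminal slots only check term-specificity.
  if (site == -1) return mod.term_spec == TermSpec::FIVE_PRIME;
  if (site == -2) return mod.term_spec == TermSpec::THREE_PRIME;

  assert(site >= 0 && static_cast<std::size_t>(site) < seq.size());
  // Residue positions need an ANYWHERE modification with a matching origin.
  return mod.term_spec == TermSpec::ANYWHERE
         && seq[static_cast<std::size_t>(site)] == mod.origin;
}
```

With a modification whose origin is `'A'` and whose term-specificity is `ANYWHERE`, `isCompatible(mod, "GAU", 1)` holds while positions 0 and 2 and both terminal slots are rejected.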
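| static void applyAtMostOneVariableModification_ (const std::set< ConstRibonucleotidePtr > &var_mods, const NASequence &seq, std::vector< NASequence > &all_modified_NASequences, bool keep_original=true) | static protected |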
Fast specialisation of variable-modification placement for at most one modification per sequence.
Emits one variant per (site, compatible modification) pair: no combinations are enumerated. Already-modified residues are skipped. The unmodified sequence is emitted first when keep_original is true.
| [in] | var_mods | Variable modifications to consider. |
| [in] | seq | Source sequence; not modified. |
| [out] | all_modified_NASequences | Generated variants are appended here. |
| [in] | keep_original | If true, also emit seq unchanged before any modified variants. |
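The single-modification behaviour described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the OpenMS implementation: `ToySeq`, `ToyMod` and `atMostOne` are hypothetical, a residue counts as unmodified when its code is a single letter, and terminal slots are left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model: one code per residue; a length-1 code means "unmodified".
using ToySeq = std::vector<std::string>;

struct ToyMod
{
  std::string code;  // illustrative label of the modified residue
  char origin;       // unmodified nucleotide it can replace
};

// Appends one variant per (site, compatible modification) pair to out.
void atMostOne(const std::vector<ToyMod>& var_mods, const ToySeq& seq,
               std::vector<ToySeq>& out, bool keep_original = true)
{
  if (keep_original) out.push_back(seq);  // unmodified sequence comes first

  for (std::size_t i = 0; i < seq.size(); ++i)
  {
    assert(!seq[i].empty());
    if (seq[i].size() != 1) continue;  // already modified: skip this site

    for (const ToyMod& mod : var_mods)
    {
      if (seq[i][0] != mod.origin) continue;  // origin mismatch
      ToySeq variant = seq;                   // copy, then modify one site
      variant[i] = mod.code;
      out.push_back(variant);
    }
  }
}
```

For a sequence with two unmodified `A` residues and two modifications of origin `A`, the sketch appends the original plus four variants; no variant carries more than one new modification.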
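| static void applyFixedModifications (const std::set< ConstRibonucleotidePtr > &fixed_mods, NASequence &sequence) | static |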
Apply all compatible fixed modifications from fixed_mods to sequence in place.
Two passes:
1. Terminal modifications (FIVE_PRIME / THREE_PRIME) are installed on the corresponding terminal slot only when that slot is still empty. Pre-existing terminal modifications are preserved.
2. Internal nucleotides that match a modification's origin receive that modification provided its term-specificity is ANYWHERE.

When multiple fixed modifications match the same residue, the iteration order of fixed_mods (a pointer-keyed std::set, so ordered by pointer value) decides which placement persists.
| [in] | fixed_mods | Fixed modifications to apply. |
| [in,out] | sequence | Sequence modified in place. |
Referenced by NucleicAcidSearchEngine::main_().
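A minimal sketch of the two-pass scheme, written against toy types rather than the OpenMS classes. `ToySeq`, `ToyMod` and `applyFixed` are hypothetical names. Two choices here are assumptions of the sketch and are not taken from the documentation: modifications are held in a `std::vector` so that iteration order is explicit, and the first compatible entry wins because a slot is only filled while it is still empty or unmodified.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class TermSpec { ANYWHERE, FIVE_PRIME, THREE_PRIME };

struct ToyMod
{
  std::string code;
  char origin;
  TermSpec term_spec;
};

struct ToySeq
{
  std::string five_prime;             // empty = no 5' modification
  std::vector<std::string> residues;  // length-1 code = unmodified residue
  std::string three_prime;            // empty = no 3' modification
};

void applyFixed(const std::vector<ToyMod>& fixed_mods, ToySeq& seq)
{
  // Pass 1: terminal modifications, only into slots that are still empty.
  for (const ToyMod& mod : fixed_mods)
  {
    if (mod.term_spec == TermSpec::FIVE_PRIME && seq.five_prime.empty())
      seq.five_prime = mod.code;
    if (mod.term_spec == TermSpec::THREE_PRIME && seq.three_prime.empty())
      seq.three_prime = mod.code;
  }

  // Pass 2: residues whose code matches the origin of an ANYWHERE modification.
  for (std::string& residue : seq.residues)
  {
    assert(!residue.empty());
    if (residue.size() != 1) continue;  // toy assumption: keep modified residues
    for (const ToyMod& mod : fixed_mods)
    {
      if (mod.term_spec == TermSpec::ANYWHERE && residue[0] == mod.origin)
      {
        residue = mod.code;
        break;  // toy assumption: first match in iteration order wins
      }
    }
  }
}
```

Calling `applyFixed` on a sequence that already carries a 3' modification leaves that slot untouched while filling the empty 5' slot and every matching residue.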
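| static void applyVariableModifications (const std::set< ConstRibonucleotidePtr > &var_mods, const NASequence &seq, Size max_variable_mods_per_NASequence, std::vector< NASequence > &all_modified_NASequences, bool keep_original=true) | static |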
Enumerate all sequence variants obtained by combinatorially placing up to max_variable_mods_per_NASequence variable modifications.
Produces every legal combination of compatible variable modifications drawn from var_mods, with the number of placements per variant ranging from 1 to min(max_variable_mods_per_NASequence, N) where N is the number of compatible sites. 5' / 3' terminal modifications participate in the enumeration as virtual sites and are considered only when the corresponding terminal slot of seq is empty; nucleotides that are already modified are skipped during compatibility analysis.
The output all_modified_NASequences is appended to (existing entries are preserved). The unmodified seq is included in the appended block iff keep_original is true. If var_mods is empty, max_variable_mods_per_NASequence is 0, or no compatible sites are found, only the unmodified sequence is appended (and only when keep_original is true). A fast specialisation is used when max_variable_mods_per_NASequence equals 1.
| [in] | var_mods | Variable modifications to consider. |
| [in] | seq | Source sequence; not modified. |
| [in] | max_variable_mods_per_NASequence | Maximum number of variable modifications per variant; 0 disables enumeration. |
| [out] | all_modified_NASequences | Generated variants are appended here. |
| [in] | keep_original | If true, also emit seq unchanged. |
Referenced by NucleicAcidSearchEngine::main_().
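The range "1 to min(max_variable_mods_per_NASequence, N)" can be made concrete by enumerating the site subsets that such a scheme would visit. The sketch below is one plausible way to do this and is not taken from the OpenMS source: `siteSubsets` and `collect` are hypothetical helpers, and sites are plain integers with -1 / -2 standing for the 5' / 3' virtual sites. Each subset would then be expanded over the modifications compatible with its sites.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Depth-first collection of all subsets of `sites` holding between 1 and
// min(max_mods, sites.size()) elements, in lexicographic order of position.
void collect(const std::vector<int>& sites, std::size_t max_mods,
             std::size_t start, std::vector<int>& current,
             std::vector<std::vector<int>>& out)
{
  assert(start <= sites.size());
  if (!current.empty()) out.push_back(current);  // every non-empty prefix counts
  if (current.size() == std::min(max_mods, sites.size())) return;  // size cap

  for (std::size_t i = start; i < sites.size(); ++i)
  {
    current.push_back(sites[i]);
    collect(sites, max_mods, i + 1, current, out);
    current.pop_back();
  }
}

std::vector<std::vector<int>> siteSubsets(const std::vector<int>& sites,
                                          std::size_t max_mods)
{
  std::vector<std::vector<int>> out;
  std::vector<int> current;
  collect(sites, max_mods, 0, current, out);
  return out;
}
```

With three compatible sites and a maximum of two modifications, the sketch yields the three single-site subsets and the three pairs; a maximum of 0 yields nothing, matching the "0 disables enumeration" note in the parameter table.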
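| static void recurseAndGenerateVariableModifiedSequences_ (const std::vector< int > &subset_indices, const std::map< int, std::vector< ConstRibonucleotidePtr > > &map_compatibility, int depth, const NASequence &current_NASequence, std::vector< NASequence > &modified_NASequences) | static protected |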
Recursive helper for applyVariableModifications: enumerates every assignment of compatible modifications to a fixed subset of sequence sites.
For the position at subset_indices[depth], iterate over every modification listed in map_compatibility for that position, apply it to a copy of current_NASequence, and recurse with depth+1. When depth reaches subset_indices.size() the current sequence is appended to modified_NASequences. Sentinel indices -1 / -2 select the 5' / 3' terminal slot instead of an internal residue.
| [in] | subset_indices | Selected positions (and 5'/3' sentinels) at which to place modifications. |
| [in] | map_compatibility | Per-position list of compatible modifications. |
| [in] | depth | Current recursion depth (caller passes 0). |
| [in] | current_NASequence | Sequence accumulated so far. |
| [in,out] | modified_NASequences | Output vector to which fully enumerated variants are appended. |
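The recursion described above can be mirrored on toy types. This is an illustrative sketch, not the OpenMS implementation: `ToySeq` and `recurse` are hypothetical, modifications are represented by their string labels, `depth` is a `std::size_t` rather than an `int`, and returning without output when a site has no compatibility entry is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct ToySeq
{
  std::string five_prime;             // 5' terminal slot
  std::vector<std::string> residues;  // one code per residue
  std::string three_prime;            // 3' terminal slot
};

void recurse(const std::vector<int>& subset_indices,
             const std::map<int, std::vector<std::string>>& compat,
             std::size_t depth, const ToySeq& current,
             std::vector<ToySeq>& out)
{
  assert(depth <= subset_indices.size());
  if (depth == subset_indices.size())
  {
    out.push_back(current);  // every selected site has been assigned
    return;
  }

  const int site = subset_indices[depth];
  const auto it = compat.find(site);
  if (it == compat.end()) return;  // toy assumption: nothing to place here

  for (const std::string& code : it->second)
  {
    ToySeq next = current;  // work on a copy so siblings stay independent
    if (site == -1) next.five_prime = code;        // 5' sentinel
    else if (site == -2) next.three_prime = code;  // 3' sentinel
    else
    {
      assert(static_cast<std::size_t>(site) < next.residues.size());
      next.residues[static_cast<std::size_t>(site)] = code;
    }
    recurse(subset_indices, compat, depth + 1, next, out);
  }
}
```

For a subset holding the 5' sentinel (one compatible label) and one residue (two compatible labels), a call with `depth` 0 appends two variants, one per label at the residue, each carrying the 5' label.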