OpenMS
Class for storing Percolator tab-delimited input files.
#include <OpenMS/FORMAT/PercolatorInfile.h>
Static Public Member Functions | |
| static void | store (const std::string &pin_file, const PeptideIdentificationList &peptide_ids, const StringList &feature_set, const std::string &enz, int min_charge, int max_charge) |
| static PeptideIdentificationList | load (const std::string &pin_file, bool higher_score_better, const std::string &score_name, const StringList &extra_scores, StringList &filenames, std::string decoy_prefix="", double threshold=0.01, bool SageAnnotation=false) |
| Loads peptide identifications from a Percolator input file. | |
| static std::string | getScanIdentifier (const PeptideIdentification &pid, size_t index) |
| static StringList | getStandardFeatureSet (int min_charge, int max_charge) |
| Returns the standard Percolator feature columns every .pin file should declare. | |
| static std::set< std::pair< size_t, size_t > > | stampPinFeaturesOnHits (PeptideIdentificationList &peptide_ids, const std::string &enz, int min_charge, int max_charge) |
| Compute and stamp PIN-equivalent meta values on every PeptideHit. | |
Static Protected Member Functions | |
| static TextFile | preparePin_ (const PeptideIdentificationList &peptide_ids, const StringList &feature_set, const std::string &enz, int min_charge, int max_charge) |
| static bool | isEnz_ (const char &n, const char &c, const std::string &enz) |
| static Size | countEnzymatic_ (const std::string &peptide, const std::string &enz) |
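| static Size | countEnzymatic_ (const std::string &peptide, const std::string &enz) | static protected |
| static std::string | getScanIdentifier (const PeptideIdentification &pid, size_t index) | static |
| static StringList | getStandardFeatureSet (int min_charge, int max_charge) | static |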
Returns the standard Percolator feature columns every .pin file should declare.
The list contains the three mandatory header columns (SpecId, Label, ScanNr) followed by the standard per-PSM features that preparePin_ computes and sets on every hit: ExpMass, CalcMass, mass, peplen, charge{min..max}, enzN, enzC, enzInt, dm, absdm. Callers should append their search-engine-specific extra_features (and finally Peptide, Proteins) to this list before calling store. This is the single source of truth used by PercolatorAdapter and any other tool that emits .pin for external percolator consumption.
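The column order described above can be sketched without OpenMS as follows. This is a minimal illustration, not the library's implementation: `sketchStandardFeatureSet` and `sketchFullHeader` are hypothetical helpers, and the `charge<N>` column naming is assumed from the charge1..chargeN wording used elsewhere on this page.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical helper (not part of OpenMS): rebuilds the documented column
// order for a given charge range.
std::vector<std::string> sketchStandardFeatureSet(int min_charge, int max_charge)
{
  // Three mandatory header columns, then the first per-PSM features.
  std::vector<std::string> cols = {"SpecId", "Label", "ScanNr",
                                   "ExpMass", "CalcMass", "mass", "peplen"};
  // One column per charge state in [min_charge, max_charge] (assumed naming).
  for (int z = min_charge; z <= max_charge; ++z)
  {
    cols.push_back("charge" + std::to_string(z));
  }
  // Remaining standard features.
  for (const char* name : {"enzN", "enzC", "enzInt", "dm", "absdm"})
  {
    cols.push_back(name);
  }
  return cols;
}

// Hypothetical helper: appends search-engine-specific columns, then the two
// trailing columns Peptide and Proteins, as the text asks callers to do.
std::vector<std::string> sketchFullHeader(int min_charge, int max_charge,
                                          const std::vector<std::string>& extra_features)
{
  std::vector<std::string> cols = sketchStandardFeatureSet(min_charge, max_charge);
  cols.insert(cols.end(), extra_features.begin(), extra_features.end());
  cols.push_back("Peptide");
  cols.push_back("Proteins");
  return cols;
}
```

With OpenMS itself, the equivalent would presumably be to call getStandardFeatureSet, append the extra feature names plus Peptide and Proteins to the returned StringList, and pass the result as feature_set to store.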
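| static bool | isEnz_ (const char &n, const char &c, const std::string &enz) | static protected |
| static PeptideIdentificationList | load (const std::string &pin_file, bool higher_score_better, const std::string &score_name, const StringList &extra_scores, StringList &filenames, std::string decoy_prefix="", double threshold=0.01, bool SageAnnotation=false) | static |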
Loads peptide identifications from a Percolator input file.
This function reads a Percolator input file (pin_file) and returns a vector of PeptideIdentification objects. It extracts relevant information such as peptide sequences, scores, charges, annotations, and protein accessions, applies the specified thresholds, and handles decoy annotation as needed. Note: If a filename column is encountered, filenames is filled in order of appearance and each PeptideIdentification is annotated with the id_merge_index meta value to link it to its filename (similar to a merged idXML file).
| [in] | pin_file | The path to the Percolator input file with a .pin extension. |
| [in] | higher_score_better | A boolean flag indicating whether higher scores are considered better (true) or lower scores are better (false). |
| [in] | score_name | The name of the primary score to be used for ranking peptide hits. |
| [in] | extra_scores | A list of additional score names that should be extracted and stored in each PeptideHit. |
| [out] | filenames | Will be populated with the unique raw file names extracted from the input data. |
| [in] | decoy_prefix | The prefix used to identify decoy protein accessions. If set, proteins whose accessions start with this prefix are marked as decoys. If empty (the default), the pin file is assumed to already contain the correctly annotated decoy status. |
| [in] | threshold | A double value representing the threshold for the spectrum_q value. Only spectra with spectrum_q below this threshold are processed. Implemented to allow prefiltering of Sage results. |
| [in] | SageAnnotation | A boolean flag indicating whether the pin file was produced by Sage. |
Returns a std::vector of PeptideIdentification objects containing the peptide identifications.
| `Exception::ParseError` | if any line in the input file does not have the expected number of columns. |
TODO: implement something similar to PepXMLFile().setPreferredFixedModifications(getModifications_(fixed_modifications_names));
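Three behaviours documented for load (rejecting rows with the wrong column count, filling filenames in order of appearance, and prefix-based decoy detection) can be illustrated with the standalone sketch below. It is not the OpenMS code: all four function names are hypothetical, `std::runtime_error` stands in for `Exception::ParseError`, and treating an empty prefix as "match nothing" is an assumption based on the decoy_prefix description.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Splits one line on tab characters, keeping empty fields.
std::vector<std::string> splitTabs(const std::string& line)
{
  std::vector<std::string> fields;
  std::size_t start = 0;
  while (true)
  {
    const std::size_t tab = line.find('\t', start);
    if (tab == std::string::npos)
    {
      fields.push_back(line.substr(start));
      return fields;
    }
    fields.push_back(line.substr(start, tab - start));
    start = tab + 1;
  }
}

// Rejects rows whose column count differs from the header's column count.
// std::runtime_error is a stand-in for the documented Exception::ParseError.
std::vector<std::string> parseRow(const std::string& line, std::size_t expected_columns)
{
  std::vector<std::string> fields = splitTabs(line);
  if (fields.size() != expected_columns)
  {
    throw std::runtime_error("unexpected number of columns in .pin row");
  }
  return fields;
}

// Returns the index of raw_file in filenames, appending it on first sight,
// so indices follow order of appearance (the role id_merge_index plays).
std::size_t fileIndex(std::vector<std::string>& filenames, const std::string& raw_file)
{
  for (std::size_t i = 0; i < filenames.size(); ++i)
  {
    if (filenames[i] == raw_file) return i;
  }
  filenames.push_back(raw_file);
  return filenames.size() - 1;
}

// Prefix test for decoy accessions; an empty prefix matches nothing here.
bool hasDecoyPrefix(const std::string& accession, const std::string& decoy_prefix)
{
  return !decoy_prefix.empty()
         && accession.compare(0, decoy_prefix.size(), decoy_prefix) == 0;
}
```

The linear search in `fileIndex` keeps the sketch short; for many raw files, a map from name to index would avoid the repeated scans.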
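| static TextFile | preparePin_ (const PeptideIdentificationList &peptide_ids, const StringList &feature_set, const std::string &enz, int min_charge, int max_charge) | static protected |
| static std::set< std::pair< size_t, size_t > > | stampPinFeaturesOnHits (PeptideIdentificationList &peptide_ids, const std::string &enz, int min_charge, int max_charge) | static |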
Compute and stamp PIN-equivalent meta values on every PeptideHit.
Runs the same per-hit computation that preparePin_ applies when writing a .pin file, but mutates the PeptideIdentifications in place instead of writing to a text file. After this call, each kept hit carries the full set of PIN meta values: SpecId, ScanNr, Label, CalcMass, ExpMass, deltamass, retentiontime, mass, score, peplen, charge1..chargeN, enzN, enzC, enzInt, dm, absdm, Peptide, Proteins.
Useful for in-process Percolator training (see OpenMS::Percolator): callers can then train on the exact same feature vectors the subprocess path would have seen via the .pin round-trip.
Hits with empty PeptideEvidences or UNKNOWN target/decoy status are left untouched; their (pid_index, hit_index) pairs are returned so callers know to skip them.
| peptide_ids | Mutated in place; each kept hit gets new meta values. |
| enz | Enzyme name (same values accepted as for store). |
| min_charge | Lower bound for the charge{N} one-hot features. |
| max_charge | Upper bound for the charge{N} one-hot features. |
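A caller might consume the returned set of (pid_index, hit_index) pairs roughly as sketched below, together with a one-hot charge encoding over [min_charge, max_charge]. Both functions are hypothetical illustrations that avoid OpenMS types: `hits_per_id` merely stands in for the number of hits in each PeptideIdentification, and encoding an out-of-range charge as all zeros is an assumption, not documented behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using HitIndex = std::pair<std::size_t, std::size_t>;  // (pid_index, hit_index)
using SkipSet = std::set<HitIndex>;

// Lists every (pid_index, hit_index) pair that is NOT in the skip set.
// hits_per_id[i] is the number of hits of the i-th identification.
std::vector<HitIndex> usableHits(const std::vector<std::size_t>& hits_per_id,
                                 const SkipSet& skipped)
{
  std::vector<HitIndex> keep;
  for (std::size_t pid = 0; pid < hits_per_id.size(); ++pid)
  {
    for (std::size_t hit = 0; hit < hits_per_id[pid]; ++hit)
    {
      const HitIndex key(pid, hit);
      if (skipped.count(key) == 0)
      {
        keep.push_back(key);
      }
    }
  }
  return keep;
}

// One-hot encoding of a charge over [min_charge, max_charge].
// A charge outside the range yields all zeros (assumption).
std::vector<int> chargeOneHot(int charge, int min_charge, int max_charge)
{
  std::vector<int> features;
  for (int z = min_charge; z <= max_charge; ++z)
  {
    features.push_back(z == charge ? 1 : 0);
  }
  return features;
}
```

Keeping the skip list as an ordered set of index pairs makes each membership test logarithmic, which is cheap enough to run inside the per-hit training loop.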
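| static void | store (const std::string &pin_file, const PeptideIdentificationList &peptide_ids, const StringList &feature_set, const std::string &enz, int min_charge, int max_charge) | static |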