OpenMS
DatabaseFilter

The DatabaseFilter tool filters a protein database in FASTA format according to one or more filtering criteria.

The resulting database is written as output. Depending on the reporting method (method="whitelist" or "blacklist"), only those entries are retained that passed all filters ("whitelist") or failed at least one filter ("blacklist").

Implemented filter criteria:

accession: Filter database according to the set of protein accessions contained in an identification file (idXML, mzIdentML)
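The whitelist/blacklist behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: the helper names (parse_fasta, accession_of, filter_entries) are hypothetical, the accessions are passed in as a plain set instead of being read from an idXML or mzIdentML file, and the sketch assumes that the accession is the first whitespace-delimited token of each FASTA header.

```python
from typing import List, Set, Tuple

FastaEntry = Tuple[str, str]  # (header line without '>', sequence)


def parse_fasta(text: str) -> List[FastaEntry]:
    """Parse FASTA text into (header, sequence) pairs."""
    entries: List[FastaEntry] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if header is not None:
                entries.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        entries.append((header, "".join(chunks)))
    return entries


def accession_of(header: str) -> str:
    """Assumption: the accession is the first whitespace-delimited token."""
    parts = header.split()
    return parts[0] if parts else ""


def filter_entries(entries: List[FastaEntry], accessions: Set[str],
                   method: str = "whitelist") -> List[FastaEntry]:
    """Keep matching entries (whitelist) or non-matching entries (blacklist)."""
    if method not in ("whitelist", "blacklist"):
        raise ValueError(f"unknown method: {method!r}")
    keep_matching = method == "whitelist"
    return [entry for entry in entries
            if (accession_of(entry[0]) in accessions) == keep_matching]


# Hypothetical toy database and accession set, for illustration only.
demo = ">P1 first protein\nMKV\nLAA\n>P2 second protein\nGGG\n>P3 third\nTTT\n"
entries = parse_fasta(demo)
identified = {"P1", "P3"}
whitelisted = filter_entries(entries, identified, "whitelist")
blacklisted = filter_entries(entries, identified, "blacklist")
```

With the single accession criterion, an entry "passes" when its accession is in the identified set, so the two methods split the database into complementary halves.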

The command line parameters of this tool are:

DatabaseFilter -- Filters a protein database (FASTA format) based on identified proteins
Full documentation: http://www.openms.de/doxygen/nightly/html/TOPP_DatabaseFilter.html
Version: 3.6.0-pre-nightly-2026-03-06 Mar  7 2026, 01:46:19, Revision: c92c980
To cite OpenMS:
 + Pfeuffer, J., Bielow, C., Wein, S. et al. OpenMS 3 enables reproducible analysis of large-scale mass
   spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.

Usage:
  DatabaseFilter <options>

Options (mandatory options marked with '*'):
  -in <file>*        Input FASTA file, containing a protein database. (valid formats: 'fasta')
  -id <file>*        Input file containing identified peptides and proteins. (valid formats: 'idXML', 'mzid')

  -method <choice>   Switch between white-/blacklisting of protein IDs (default: 'whitelist')
                     (valid: 'whitelist', 'blacklist')
  -out <file>*       Output FASTA file where the reduced database will be written to.
                     (valid formats: 'fasta')

Common TOPP options:
  -ini <file>        Use the given TOPP INI file
  -threads <n>       Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>  Writes the default configuration file
  --help             Shows options
  --helphelp         Shows all options (including advanced)
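As a usage illustration, the options listed above can be assembled into a command line from a script. The following is a minimal sketch that uses only the documented options (-in, -id, -out, -method, -threads); the helper names build_command and run_if_available and all file names are hypothetical.

```python
import shutil
import subprocess
from typing import List


def build_command(fasta: str, ids: str, out: str,
                  method: str = "whitelist", threads: int = 1) -> List[str]:
    """Assemble a DatabaseFilter argument list from the documented options."""
    if method not in ("whitelist", "blacklist"):
        raise ValueError(f"unknown method: {method!r}")
    return [
        "DatabaseFilter",
        "-in", fasta,       # input FASTA database (mandatory)
        "-id", ids,         # idXML or mzid identification file (mandatory)
        "-out", out,        # reduced FASTA database (mandatory)
        "-method", method,  # 'whitelist' (default) or 'blacklist'
        "-threads", str(threads),
    ]


def run_if_available(command: List[str]) -> bool:
    """Run the command only if the executable is on PATH; report whether it ran."""
    if shutil.which(command[0]) is None:
        return False
    subprocess.run(command, check=True)
    return True


# Hypothetical file names, used only for illustration.
command = build_command("uniprot.fasta", "search_results.idXML", "identified.fasta")
print(" ".join(command))
```

Calling run_if_available(command) would execute the tool when OpenMS is installed and the input files exist; the sketch only prints the command so that it stays side-effect free.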

INI file documentation of this tool:

Legend:
required parameter
advanced parameter

This section lists all parameters supported by the tool. Parameters are organized into hierarchical subsections that group related settings together. Subsections may contain further subsections or individual parameters.

Each parameter entry contains the following information:

  • Name: The identifier used in configuration files and on the command line.
  • Default value: The value used if the parameter is not explicitly specified.
  • Description: A short explanation describing the purpose and behavior of the parameter.
  • Tags: Additional metadata associated with the parameter.
  • Restrictions: Allowed value ranges for numeric parameters or valid options for string parameters.

Parameter tags provide additional information about how a parameter is used. Some tags indicate whether a parameter is required or intended for advanced configuration, while others may be used internally by OpenMS or workflow tools.

Parameters highlighted as required must be specified for the tool to run successfully. Parameters marked as advanced allow fine-tuning of algorithm behavior and are typically not needed for standard workflows.

+ DatabaseFilter: Filters a protein database (FASTA format) based on identified proteins
  version      3.6.0-pre-nightly-2026-03-06   Version of the tool that generated this parameters file.
  ++ 1: Instance '1' section for 'DatabaseFilter'

    Name         Default    Description                                                       Tags         Restrictions
    in                      Input FASTA file, containing a protein database.                  input file   *.fasta
    id                      Input file containing identified peptides and proteins.           input file   *.idXML, *.mzid
    method       whitelist  Switch between white-/blacklisting of protein IDs                              whitelist, blacklist
    out                     Output FASTA file where the reduced database will be written to.  output file  *.fasta
    log                     Name of log file (created only when specified)
    debug        0          Sets the debug level
    threads      1          Sets the number of threads allowed to be used by the TOPP tool
    no_progress  false      Disables progress logging to command line                                      true, false
    force        false      Overrides tool-specific checks                                                 true, false
    test         false      Enables the test mode (needed for internal use only)                           true, false
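The defaults and restrictions listed for the parameters can be checked before a run. The sketch below is a hypothetical pre-flight check, not part of OpenMS: the dictionaries are copied by hand from the parameter listing, the required parameters are the ones marked with '*' in the command line help, and matching file names against patterns such as *.fasta with case-sensitive globbing is an assumption of this example.

```python
import fnmatch
from typing import Dict, List

# Values copied from the parameter listing; the dictionary layout is a
# hypothetical convenience, not an OpenMS data structure.
DEFAULTS: Dict[str, str] = {
    "method": "whitelist",
    "debug": "0",
    "threads": "1",
    "no_progress": "false",
    "force": "false",
    "test": "false",
}
RESTRICTIONS: Dict[str, List[str]] = {
    "in": ["*.fasta"],
    "id": ["*.idXML", "*.mzid"],
    "out": ["*.fasta"],
    "method": ["whitelist", "blacklist"],
    "no_progress": ["true", "false"],
    "force": ["true", "false"],
    "test": ["true", "false"],
}
REQUIRED = ("in", "id", "out")  # marked with '*' in the command line help


def check_parameters(user: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the settings look valid."""
    params = {**DEFAULTS, **user}  # user values override the defaults
    problems: List[str] = []
    for name in REQUIRED:
        if not params.get(name):
            problems.append(f"missing required parameter: {name}")
    for name, allowed in RESTRICTIONS.items():
        value = params.get(name)
        if not value:
            continue  # unset values are handled by the required check above
        # Assumption: case-sensitive glob matching against each allowed pattern.
        if not any(fnmatch.fnmatchcase(value, pattern) for pattern in allowed):
            problems.append(f"{name}={value!r} is not one of {allowed}")
    return problems


# Hypothetical settings, for illustration only.
ok = check_parameters({"in": "db.fasta", "id": "ids.idXML", "out": "small.fasta"})
bad = check_parameters({"in": "db.fasta", "id": "ids.idXML", "out": "small.fasta",
                        "method": "greylist"})
```

Here ok is an empty list, while bad holds a single message about the unsupported method value.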