OpenMS  2.4.0
Digestor

Digests a protein database in-silico.

Potential predecessor tools: none (FASTA input)
Potential successor tools: IDFilter (peptide blacklist)

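This application digests a protein database in silico and reports all peptides that the chosen cleavage enzyme would produce, subject to the allowed number of missed cleavages and the peptide length limits.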
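To make the idea concrete, the following is a minimal Python sketch of an in-silico digest. It is not the OpenMS implementation: the cleavage rule (cut after K or R unless the next residue is P) is a simplified stand-in for trypsin, and the function names `cleavage_fragments` and `digest` are hypothetical. The default values mirror the tool defaults listed in this page (1 missed cleavage, length 6 to 40).

```python
def cleavage_fragments(sequence):
    """Split a protein sequence at every cleavage site (simplified trypsin rule)."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        # Assumption: cleave after K or R, but not when the next residue is P.
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


def digest(sequence, missed_cleavages=1, min_length=6, max_length=40):
    """Return all peptides with up to `missed_cleavages` skipped cleavage sites."""
    fragments = cleavage_fragments(sequence)
    peptides = []
    for start in range(len(fragments)):
        # Joining k+1 neighbouring fragments corresponds to k missed cleavages.
        last = min(start + missed_cleavages + 1, len(fragments))
        for end in range(start + 1, last + 1):
            peptide = "".join(fragments[start:end])
            if min_length <= len(peptide) <= max_length:
                peptides.append(peptide)
    return peptides


# With no missed cleavages, only the fully cleaved fragments remain.
print(digest("AAAAAKGGGGGGRCCCCCC", missed_cleavages=0))
# → ['AAAAAK', 'GGGGGGR', 'CCCCCC']
```

Raising `missed_cleavages` adds the longer peptides that span one or more uncut sites, which is why the peptide count grows quickly with that parameter.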

The output can be used, for example, as blacklist input to IDFilter to remove certain peptides from identification results.
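The blacklist use case amounts to a set-membership filter. The sketch below illustrates only the concept; it does not reproduce how IDFilter matches peptides (for example, how modifications are handled), and the function name `remove_blacklisted` is hypothetical.

```python
def remove_blacklisted(identified_peptides, blacklist):
    """Drop every identified peptide whose sequence appears in the blacklist."""
    blocked = set(blacklist)  # set lookup keeps the filter linear in the input size
    return [peptide for peptide in identified_peptides if peptide not in blocked]


# Hypothetical example: peptides from a digested contaminant database are removed.
contaminant_peptides = ["PEPTIDEK", "LVNELTEFAK"]
hits = ["PEPTIDEK", "SAMPLER", "LVNELTEFAK", "ELVISLIVESK"]
print(remove_blacklisted(hits, contaminant_peptides))
# → ['SAMPLER', 'ELVISLIVESK']
```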

Note
Currently mzIdentML (mzid) is not directly supported as an input/output format of this tool. Convert mzid files to/from idXML using IDFileConverter if necessary.

The command line parameters of this tool are:

Digestor -- Digests a protein database in-silico.
Version: 2.4.0-HEAD-2019-01-18 Jan 18 2019, 21:06:42, Revision: 8ddd6a9
To cite OpenMS:
  Rost HL, Sachsenberg T, Aiche S, Bielow C et al.. OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.

Usage:
  Digestor <options>

Options (mandatory options marked with '*'):
  -in <file>*                  Input file (valid formats: 'fasta')
  -out <file>*                 Output file (peptides) (valid formats: 'idXML', 'fasta')
  -out_type <type>             Set this if you cannot control the filename of 'out', e.g., in TOPPAS. (valid:
                               'idXML', 'fasta')
  -missed_cleavages <number>   The number of allowed missed cleavages (default: '1' min: '0')
  -min_length <number>         Minimum length of peptide (default: '6')
  -max_length <number>         Maximum length of peptide (default: '40')
  -enzyme <string>             The type of digestion enzyme (default: 'Trypsin' valid: 'Trypsin', 'Arg-C/P',
                               'proline-endopeptidase/HKR', 'Glu-C+P', 'PepsinA + P', 'cyanogen-bromide',
                               'Clostripain/P', 'no cleavage', 'V8-E', 'Asp-N/B', 'unspecific cleavage',
                               'Chymotrypsin', 'iodosobenzoate', 'staphylococcal protease/D',
                               'leukocyte elastase', 'Formic_acid', 'CNBr', 'proline endopeptidase', 'Lys-C',
                               'Arg-C', 'Trypsin/P', 'Asp-N', 'Lys-N', '2-iodobenzoate', 'Lys-C/P', 'PepsinA',
                               'V8-DE', 'Alpha-lytic protease', 'glutamyl endopeptidase',
                               'elastase-trypsin-chymotrypsin', 'Asp-N_ambic', 'TrypChymo', 'Chymotrypsin/P')

Options for FASTA output files:
  -FASTA:ID <option>           Identifier to use for each peptide: copy from parent protein (parent); a
                               consecutive number (number); parent ID + consecutive number (both) (default:
                               'parent' valid: 'parent', 'number', 'both')
  -FASTA:description <option>  Keep or remove the (possibly lengthy) FASTA header description. Keeping it 
                               can increase resulting FASTA file significantly. (default: 'remove' valid:
                               'remove', 'keep')

                               
Common UTIL options:
  -ini <file>                  Use the given TOPP INI file
  -threads <n>                 Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>            Writes the default configuration file
  --help                       Shows options
  --helphelp                   Shows all options (including advanced)
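As a usage illustration, the options listed above can be assembled into a command line from Python. This is a sketch under stated assumptions: the helper `digestor_command` and the file names are hypothetical, only flags documented in the help text are used, and the command is built but not executed here.

```python
import shlex


def digestor_command(fasta_in, out_path, enzyme="Trypsin",
                     missed_cleavages=1, min_length=6, max_length=40):
    """Build an argument list for Digestor from the documented options."""
    if missed_cleavages < 0:
        raise ValueError("missed_cleavages must be >= 0 (documented minimum)")
    if min_length > max_length:
        raise ValueError("min_length must not exceed max_length")
    return [
        "Digestor",
        "-in", fasta_in,                       # mandatory, FASTA input
        "-out", out_path,                      # mandatory, idXML or FASTA output
        "-enzyme", enzyme,
        "-missed_cleavages", str(missed_cleavages),
        "-min_length", str(min_length),
        "-max_length", str(max_length),
    ]


# Hypothetical file names; quoting makes the printed line safe to paste into a shell.
cmd = digestor_command("proteins.fasta", "peptides.idXML",
                       enzyme="Lys-C", missed_cleavages=2)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

On a system where OpenMS is installed and `Digestor` is on the PATH, the list could be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string avoids shell quoting problems with enzyme names that contain spaces, such as 'PepsinA + P'.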

INI file documentation of this tool:

Legend:
required parameter
advanced parameter
+ Digestor: Digests a protein database in-silico.
    version (value: 2.4.0-HEAD-2019-01-18): Version of the tool that generated this parameters file.
  ++ 1: Instance '1' section for 'Digestor'
      in [input file]: input file (valid formats: *.fasta)
      out [output file]: Output file (peptides) (valid formats: *.idXML, *.fasta)
      out_type: Set this if you cannot control the filename of 'out', e.g., in TOPPAS. (valid: idXML, fasta)
      missed_cleavages (default: 1): The number of allowed missed cleavages (range: 0:∞)
      min_length (default: 6): Minimum length of peptide
      max_length (default: 40): Maximum length of peptide
      enzyme (default: Trypsin): The type of digestion enzyme (valid: Trypsin, Arg-C/P, proline-endopeptidase/HKR, Glu-C+P, PepsinA + P, cyanogen-bromide, Clostripain/P, no cleavage, V8-E, Asp-N/B, unspecific cleavage, Chymotrypsin, iodosobenzoate, staphylococcal protease/D, leukocyte elastase, Formic_acid, CNBr, proline endopeptidase, Lys-C, Arg-C, Trypsin/P, Asp-N, Lys-N, 2-iodobenzoate, Lys-C/P, PepsinA, V8-DE, Alpha-lytic protease, glutamyl endopeptidase, elastase-trypsin-chymotrypsin, Asp-N_ambic, TrypChymo, Chymotrypsin/P)
      log: Name of log file (created only when specified)
      debug (default: 0): Sets the debug level
      threads (default: 1): Sets the number of threads allowed to be used by the TOPP tool
      no_progress (default: false): Disables progress logging to command line (valid: true, false)
      force (default: false): Overwrite tool specific checks. (valid: true, false)
      test (default: false): Enables the test mode (needed for internal use only) (valid: true, false)
    +++ FASTA: Options for FASTA output files
        ID (default: parent): Identifier to use for each peptide: copy from parent protein (parent); a consecutive number (number); parent ID + consecutive number (both) (valid: parent, number, both)
        description (default: remove): Keep or remove the (possibly lengthy) FASTA header description. Keeping it can increase resulting FASTA file significantly. (valid: remove, keep)
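
The three choices for the FASTA ID option can be pictured with a short Python sketch. The function name `peptide_id` is hypothetical, and the underscore used to join parent ID and number in the 'both' mode is an assumption: this page does not state how the two parts are joined or whether the counter restarts for each protein.

```python
def peptide_id(parent_id, counter, mode="parent"):
    """Illustrate the identifier choices 'parent', 'number' and 'both'."""
    if mode == "parent":
        return parent_id                  # copy the parent protein identifier
    if mode == "number":
        return str(counter)               # consecutive number only
    if mode == "both":
        # Assumption: underscore separator; the real output format may differ.
        return f"{parent_id}_{counter}"
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical parent identifier and counter value.
for mode in ("parent", "number", "both"):
    print(mode, "->", peptide_id("P12345", 7, mode))
```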