OpenMS
FileConverter

Converts between different MS file formats.

potential predecessor tools → FileConverter → potential successor tools
  • potential predecessor tools: any vendor software exporting supported formats (e.g. mzML)
  • potential successor tools: any tool operating on the output format

The main use of this tool is to convert data from external sources to the formats used by OpenMS/TOPP. Most importantly, data from MS experiments in a number of different formats can be converted to mzML, the canonical file format used by OpenMS/TOPP for experimental data. (mzML is the PSI-approved format and supports traceability of analysis steps.)
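As a minimal sketch of this main use case, the Python helper below assembles a FileConverter command line from the two mandatory options, -in and -out, and runs it only if the executable is found on PATH. The helper names and the file names are hypothetical placeholders, not part of OpenMS.

```python
import shutil
import subprocess


def fileconverter_cmd(in_path, out_path, exe="FileConverter"):
    """Return the argv list for a plain conversion; nothing is executed here."""
    return [exe, "-in", in_path, "-out", out_path]


def run_if_available(cmd):
    """Run cmd only when its executable is on PATH.

    Returns the process exit code, or None when the executable is missing.
    """
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=False).returncode


# Hypothetical file names: convert an mzXML file to mzML.
cmd = fileconverter_cmd("sample.mzXML", "sample.mzML")
print(" ".join(cmd))
```

Building the argument list separately from running it keeps the command easy to log or inspect before anything is executed.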

Thermo raw files can be converted to mzML using the ThermoRawFileParser provided in the THIRDPARTY folder. On Windows, a recent .NET framework needs to be installed. On Linux and macOS, the Mono runtime needs to be present and accessible via the -NET_executable parameter. The path to the ThermoRawFileParser can be set via the -ThermoRaw_executable option.
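The platform-dependent setup can be sketched as follows. This is an illustration under assumptions: the parameter table on this page lists both executables under the RawToMzML subsection, so the sketch uses the prefixed spelling (-RawToMzML:NET_executable, -RawToMzML:ThermoRaw_executable); the function name, the default "mono" command, and all paths are hypothetical.

```python
import sys


def thermo_raw_cmd(raw_path, mzml_path, parser_exe,
                   mono_exe="mono", platform=sys.platform):
    """Assemble a FileConverter call for Thermo .raw -> mzML.

    The 'RawToMzML:' prefix is an assumption taken from the parameter
    table; adjust it if your FileConverter version names the options
    differently.
    """
    cmd = [
        "FileConverter",
        "-in", raw_path,
        "-out", mzml_path,
        "-RawToMzML:ThermoRaw_executable", parser_exe,
    ]
    # A separate .NET launcher is only passed on non-Windows platforms.
    if not platform.startswith("win"):
        cmd += ["-RawToMzML:NET_executable", mono_exe]
    return cmd


# Hypothetical paths, shown for a Linux host.
print(" ".join(thermo_raw_cmd("run01.raw", "run01.mzML",
                              "THIRDPARTY/ThermoRawFileParser.exe",
                              platform="linux")))
```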

For MaxQuant-flavoured mzXML the use of the advanced option '-force_MaxQuant_compatibility' is recommended.

Many different format conversions are supported, and some may be more useful than others. Depending on the file formats involved, information can be lost during conversion, e.g. when converting featureXML to mzData. In such cases a warning is shown.

The input and output file types are determined from the file extensions or from the first few lines of the files. If file type determination is not possible, the input or output file type has to be given explicitly.
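The extension-based half of this rule can be sketched in Python as below. It is a simplified illustration, not the tool's implementation: it never inspects file content, and the function name is hypothetical. The type list is copied from the valid -in_type values in the command line help.

```python
from pathlib import Path

# Valid input types as listed for -in_type in the command line help.
INPUT_TYPES = [
    "mzML", "mzXML", "mgf", "msp", "raw", "cachedMzML", "mzData", "dta",
    "dta2d", "featureXML", "consensusXML", "featureparquet",
    "consensusparquet", "ms2", "fid", "d", "tsv", "peplist", "kroenik",
    "edta", "oms", "sqMass",
]


def resolve_input_type(path, explicit_type=None):
    """Pick an input type from an explicit value or the file extension.

    Raises ValueError when neither source yields a known type, which
    mirrors the case where the type has to be given explicitly.
    """
    if explicit_type is not None:
        if explicit_type not in INPUT_TYPES:
            raise ValueError(f"unknown input type: {explicit_type!r}")
        return explicit_type
    by_lower = {t.lower(): t for t in INPUT_TYPES}
    ext = Path(path).suffix.lstrip(".").lower()
    if ext in by_lower:
        return by_lower[ext]
    raise ValueError(f"cannot determine type of {path!r}; pass it explicitly")


print(resolve_input_type("run01.mzML"))
print(resolve_input_type("spectra.txt", explicit_type="mgf"))
```

The resolved value is what would be passed to -in_type when the extension alone is not enough.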

Conversion with the same output as input format is supported. In some cases, this can be helpful to remove errors from files (e.g. the index), to update file formats to new versions, or to check whether information is lost upon reading or writing.

Supported input types:

  • mzML
  • mzXML
  • mzData
  • mgf
  • msp
  • dta2d
  • dta
  • featureXML
  • consensusXML
  • featureparquet (OpenMS internal feature map parquet bundle)
  • consensusparquet (OpenMS internal consensus map parquet bundle)
  • ms2
  • fid/XMASS
  • tsv
  • peplist
  • kroenik
  • edta
  • sqMass
  • oms

Note: See IDFileConverter for similar functionality for protein/peptide identification file formats.

The command line parameters of this tool are:

FileConverter -- Converts between different MS file formats.
Full documentation: http://www.openms.de/doxygen/nightly/html/TOPP_FileConverter.html
Version: 3.6.0-pre-nightly-2026-06-11 Jun 11 2026, 09:57:53, Revision: fc66052
To cite OpenMS:
 + Pfeuffer, J., Bielow, C., Wein, S. et al.. OpenMS 3 enables reproducible analysis of large-scale mass spec
   trometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.

Usage:
  FileConverter <options>

Options (mandatory options marked with '*'):
  -in <file>*        Input file to convert. (valid formats: 'mzML', 'mzXML', 'mgf', 'msp', 'raw', 'cachedMzML
                     ', 'mzData', 'dta', 'dta2d', 'featureXML', 'consensusXML', 'featureparquet', 'consensusp
                     arquet', 'ms2', 'fid', 'd', 'tsv', 'peplist', 'kroenik', 'edta', 'oms', 'sqMass')
  -in_type <type>    Input file type -- default: determined from file extension or content
                      (valid: 'mzML', 'mzXML', 'mgf', 'msp', 'raw', 'cachedMzML', 'mzData', 'dta', 'dta2d', 
                     'featureXML', 'consensusXML', 'featureparquet', 'consensusparquet', 'ms2', 'fid', 'd', 
                     'tsv', 'peplist', 'kroenik', 'edta', 'oms', 'sqMass')
  -out <file>*       Output file (valid formats: 'mzML', 'mzXML', 'cachedMzML', 'mgf', 'msp', 'featureXML', 
                     'consensusXML', 'featureparquet', 'consensusparquet', 'edta', 'mzData', 'dta2d', 'csv', 
                     'sqMass', 'xic', 'oms')
  -out_type <type>   Output file type -- default: determined from file extension or content
                     Note: that not all conversion paths work or make sense. (valid: 'mzML', 'mzXML', 'cached
                     MzML', 'mgf', 'msp', 'featureXML', 'consensusXML', 'featureparquet', 'consensusparquet',
                      'edta', 'mzData', 'dta2d', 'csv', 'sqMass', 'xic', 'oms')
                     
Common TOPP options:
  -ini <file>        Use the given TOPP INI file
  -threads <n>       Sets the number of threads allowed to be used by the TOPP tool (0 = all available cores)
                      (default: '1')
  -write_ini <file>  Writes the default configuration file
  --help             Shows options
  --helphelp         Shows all options (including advanced)
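The common options above suggest a two-step workflow: write the default configuration once with -write_ini, edit it, then reuse it with -ini. A minimal Python sketch of the two command lines follows; the function names and file names are hypothetical.

```python
def write_default_ini_cmd(ini_path):
    """Command that writes FileConverter's default configuration file."""
    return ["FileConverter", "-write_ini", ini_path]


def run_with_ini_cmd(ini_path, in_path, out_path, threads=1):
    """Command that runs a conversion using a previously written INI file.

    threads follows the help text: 0 means all available cores.
    """
    if threads < 0:
        raise ValueError("threads must be >= 0")
    return [
        "FileConverter",
        "-ini", ini_path,
        "-in", in_path,
        "-out", out_path,
        "-threads", str(threads),
    ]


# Step 1: create the default INI. Step 2: convert using it on all cores.
print(" ".join(write_default_ini_cmd("fileconverter.ini")))
print(" ".join(run_with_ini_cmd("fileconverter.ini", "in.mzXML", "out.mzML",
                                threads=0)))
```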

INI file documentation of this tool:

Legend:
required parameter
advanced parameter

This section lists all parameters supported by the tool. Parameters are organized into hierarchical subsections that group related settings together. Subsections may contain further subsections or individual parameters.

Each parameter entry contains the following information:

  • Name The identifier used in configuration files and on the command line.
  • Default value The value used if the parameter is not explicitly specified.
  • Description A short explanation describing the purpose and behavior of the parameter.
  • Tags Additional metadata associated with the parameter.
  • Restrictions Allowed value ranges for numeric parameters or valid options for string parameters.

Parameter tags provide additional information about how a parameter is used. Some tags indicate whether a parameter is required or intended for advanced configuration, while others may be used internally by OpenMS or workflow tools.

Parameters highlighted as required must be specified for the tool to run successfully. Parameters marked as advanced allow fine-tuning of algorithm behavior and are typically not needed for standard workflows.

+FileConverter: Converts between different MS file formats.
  • version (default: 3.6.0-pre-nightly-2026-06-11)
    Version of the tool that generated this parameters file.
++1: Instance '1' section for 'FileConverter'
  • in (default: empty; tags: input file, required)
    Input file to convert.
    Restrictions: *.mzML, *.mzXML, *.mgf, *.msp, *.raw, *.cachedMzML, *.mzData, *.dta, *.dta2d, *.featureXML, *.consensusXML, *.featureparquet, *.consensusparquet, *.ms2, *.fid, *.d, *.tsv, *.peplist, *.kroenik, *.edta, *.oms, *.sqMass
  • in_type (default: empty)
    Input file type -- default: determined from file extension or content.
    Restrictions: mzML, mzXML, mgf, msp, raw, cachedMzML, mzData, dta, dta2d, featureXML, consensusXML, featureparquet, consensusparquet, ms2, fid, d, tsv, peplist, kroenik, edta, oms, sqMass
  • UID_postprocessing (default: ensure)
    Unique ID post-processing for output data. 'none' keeps current IDs even if invalid. 'ensure' keeps current IDs but reassigns invalid ones. 'reassign' assigns new unique IDs.
    Restrictions: none, ensure, reassign
  • out (default: empty; tags: output file, required)
    Output file.
    Restrictions: *.mzML, *.mzXML, *.cachedMzML, *.mgf, *.msp, *.featureXML, *.consensusXML, *.featureparquet, *.consensusparquet, *.edta, *.mzData, *.dta2d, *.csv, *.sqMass, *.xic, *.oms
  • out_type (default: empty)
    Output file type -- default: determined from file extension or content. Note that not all conversion paths work or make sense.
    Restrictions: mzML, mzXML, cachedMzML, mgf, msp, featureXML, consensusXML, featureparquet, consensusparquet, edta, mzData, dta2d, csv, sqMass, xic, oms
  • TIC_DTA2D (default: false)
    Export the TIC instead of the entire experiment in mzML/mzData/mzXML -> DTA2D conversions.
    Restrictions: true, false
  • MGF_compact (default: false)
    Use a more compact format when writing MGF (no zero-intensity peaks, limited number of decimal places).
    Restrictions: true, false
  • force_MaxQuant_compatibility (default: false)
    [mzXML output only] Make sure that MaxQuant can read the mzXML and set the msManufacturer to 'Thermo Scientific'.
    Restrictions: true, false
  • force_TPP_compatibility (default: false)
    [mzML output only] Make sure that TPP parsers can read the mzML and the precursor ion m/z in the file (otherwise it will be set to zero by the TPP).
    Restrictions: true, false
  • convert_to_chromatograms (default: false)
    [mzML output only] Assumes that the provided spectra represent data in SRM mode or targeted MS1 mode and converts them to chromatogram data.
    Restrictions: true, false
  • write_scan_index (default: true)
    Append an index when writing mzML or mzXML files. Some external tools might rely on it.
    Restrictions: true, false
  • lossy_compression (default: false)
    Use numpress compression to achieve optimally small file size using linear compression for m/z domain and slof for intensity and float data arrays (attention: may cause small loss of precision; only for mzML data).
    Restrictions: true, false
  • lossy_mass_accuracy (default: -1.0)
    Desired (absolute) m/z accuracy for lossy compression (e.g. use 0.0001 for a mass accuracy of 0.2 ppm at 500 m/z, default uses -1.0 for maximal accuracy).
  • process_lowmemory (default: false)
    Whether to process the file on the fly without loading the whole file into memory first (only for conversions of mzXML/mzML to mzML). Note: this flag will prevent conversion from spectra to chromatograms.
    Restrictions: true, false
  • log (default: empty)
    Name of log file (created only when specified).
  • debug (default: 0)
    Sets the debug level.
  • threads (default: 1)
    Sets the number of threads allowed to be used by the TOPP tool (0 = all available cores).
  • no_progress (default: false)
    Disables progress logging to command line.
    Restrictions: true, false
  • force (default: false)
    Overrides tool-specific checks.
    Restrictions: true, false
  • test (default: false)
    Enables the test mode (needed for internal use only).
    Restrictions: true, false
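The lossy_mass_accuracy description relates an absolute m/z accuracy to a relative one in ppm. The conversion can be checked with a few lines of Python; the function names are illustrative only, and the numbers are the ones quoted in the parameter description.

```python
def abs_to_ppm(abs_accuracy, mz):
    """Relative accuracy in ppm for an absolute m/z accuracy at a given m/z."""
    return abs_accuracy / mz * 1e6


def ppm_to_abs(ppm, mz):
    """Absolute m/z accuracy that corresponds to a ppm value at a given m/z."""
    return ppm * mz / 1e6


# Example from the parameter description: 0.0001 at m/z 500.
print(round(abs_to_ppm(0.0001, 500.0), 6))   # ppm
print(round(ppm_to_abs(0.2, 500.0), 6))      # absolute m/z accuracy
```

Because the setting is absolute, the same value corresponds to a looser ppm accuracy at lower m/z and a tighter one at higher m/z.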
+++bruker: Options for reading Bruker TimsTOF .d files (requires WITH_OPENTIMS)
  • calibration_tolerance (default: 0.0)
    m/z recalibration tolerance (0 = library default).
    Restrictions: 0.0:∞
  • calibrate (default: false)
    Enable m/z recalibration (may fail on some datasets).
    Restrictions: true, false
  • load_ms1 (default: true)
    Load MS1 spectra. Disable for MS2-only workflows (peptide database search) where MS1 surveys are not needed; this substantially cuts memory and time. Affects all export modes.
    Restrictions: true, false
  • export_mode (default: auto)
    Export mode: 'auto' detects DDA/DIA acquisition type, 'spectrum' forces per-precursor MS2 spectra (DDA-style), 'frame' returns raw 4D frames without signal processing.
    Restrictions: auto, spectrum, frame
  • ms1_centroid_mz_ppm (default: 10.0)
    MS1 m/z linking tolerance in ppm. HillBased default 10 ppm is tuned for detector-centroided TIMS-PASEF MS1: real ions drift up to ~10 ppm in m/z between consecutive IM scans, so 5 ppm under-links (empirically only ~4% of MS1 hills end up multi-scan at 5 ppm vs ~9% at 10 ppm). Greedy2D also uses this and additionally requires ms1_centroid_im_pct > 0.
    Restrictions: 0.0:∞
  • ms1_centroid_im_pct (default: 0.0)
    MS1 frame IM-centroiding ion mobility tolerance in percent. Both this and ms1_centroid_mz_ppm must be > 0 to enable. Suggested value: 3.0.
    Restrictions: 0.0:∞
  • dia_ms2_n_neighbors (default: 0)
    DIA MS2 frame aggregation: number of adjacent frames on each side to sum per SWATH window. 0 = disabled (raw per-frame export), 1 = 3-frame sum, 2 = 5-frame sum. This switches the entire DIA-MS2 export pipeline (sum + denoise) and applies regardless of ms2_centroid_algo. For the DIA-PASEF hill recipe, set to 2 together with ms2_centroid_algo=hillbased + ms2_centroid_min_hill_length=2.
    Restrictions: 0:∞
  • dia_ms2_min_support (default: 1)
    DIA MS2 denoising: minimum occupied neighbor cells in a 3x3 (m/z x IM) grid to keep a point (center cell excluded from count). Applied after frame aggregation. Only effective when dia_ms2_n_neighbors > 0. Set to 0 to disable denoising (useful for pure centroiding without noise filtering).
    Restrictions: 0:∞
  • dia_ms2_centroid (default: false)
    Apply 2D Gaussian smoothing + local maxima peak picking to the denoised DIA MS2 grid. Produces IM_CENTROIDED spectra with sub-bin (m/z, IM) precision. Only effective when dia_ms2_n_neighbors > 0.
    Restrictions: true, false
  • ms1_n_neighbors (default: 0)
    MS1 frame aggregation: number of adjacent MS1 frames on each side to sum. 0 = disabled (raw export), 1 = 3-frame sum, 2 = 5-frame sum. Applies to both DIA and DDA; ignored in FRAME export mode.
    Restrictions: 0:50
  • ms1_min_support (default: 0)
    MS1 denoising: minimum occupied neighbor cells in a 3x3 (m/z x IM) grid to keep a point. Applied after aggregation. 0 = disabled, 8 = all 8 neighbors required (strictest). Only effective when ms1_n_neighbors > 0. Appropriate for dense survey runs; disable for rare-species discovery.
    Restrictions: 0:8
  • ms1_max_rt_distance_sec (default: 0.0)
    Cap the RT distance (seconds) between a neighbor MS1 frame and the center frame during aggregation. 0.0 = no cap. Recommended for DDA (e.g. 5.0) where MS1 frame cadence is irregular. The center frame is always included regardless of this cap.
    Restrictions: 0.0:∞
  • ms1_centroid_max_peaks (default: 100000)
    Cap on the number of centroided peaks retained per MS1 spectrum. Top-intensity peaks are kept; low-intensity tail is dropped if the limit is hit (a warning is logged in that case). Only effective when MS1 centroiding is enabled via ms1_centroid_mz_ppm/pct. Raise for aggregated MS1 (ms1_n_neighbors > 0) on dense surveys; lower to trim long-tail noise.
    Restrictions: 1:∞
  • ms1_centroid_algo (default: off)
    MS1 centroiding algorithm. 'off' = no IM-axis centroiding. 'greedy2d' = legacy 2D (m/z, IM) box clustering using ms1_centroid_mz_ppm/pct. 'hillbased' = IM-axis hill detection using ms1_centroid_mz_ppm + centroid_valley_factor + ms1_centroid_min_hill_length (modeled on Biosaur2). When 'off', the legacy combination ms1_centroid_mz_ppm > 0 + ms1_centroid_im_pct > 0 implies 'greedy2d' (back-compat).
    Restrictions: off, greedy2d, hillbased
  • ms2_centroid_algo (default: off)
    MS2 centroiding algorithm (DIA-PASEF + DDA-PASEF). 'off' = no MS2 centroiding (DIA emits raw IM_PEAK, DDA uses TOF-domain processing). 'greedy2d' = DIA-MS2 Gaussian smoothing + local maxima (requires dia_ms2_n_neighbors > 0; DDA: same as 'off'). 'hillbased' = IM-axis hill detection; works on both DDA-MS2 and DIA-MS2, including DIA at dia_ms2_n_neighbors=0 (per-frame hill linking, no cross-RT summing). Takes precedence over the legacy dia_ms2_centroid boolean.
    Restrictions: off, greedy2d, hillbased
  • ms2_centroid_mz_ppm (default: 20.0)
    HillBased DIA-/DDA-MS2 m/z linking tolerance in ppm. Required (>0) when ms2_centroid_algo=hillbased. Default 20.0 is DIA-PASEF-tuned for fragments.
    Restrictions: 0.0:∞
  • centroid_valley_factor (default: 1.3)
    HillBased: hill valley factor (hvf). A hill is split at a valley only if both (left_max/valley) and (right_max/valley) exceed this value. Smaller = more aggressive splitting. Default 1.3 matches Biosaur2.
    Restrictions: 1.0:∞
  • ms1_centroid_min_hill_length (default: 1)
    HillBased MS1: minimum number of IM scans a hill must span. Default 1 keeps single-IM-scan ions (common on detector-centroided TIMS-PASEF MS1: ~75% of peaks have no same-m/z partner in the previous IM scan within 100 ppm).
    Restrictions: 1:∞
  • ms2_centroid_min_hill_length (default: 2)
    HillBased MS2: minimum number of IM scans a hill must span. Default 2 is DIA-PASEF-tuned: rejects single-scan singletons (~67% of unfiltered hill output) and brings volume close to the legacy Gaussian-smooth + local-maxima path. DDA-PASEF users should override to 1 (narrow precursor IM range → most fragments seen in only one IM scan, min=2 drops ~93% of DDA fragment peaks).
    Restrictions: 1:∞
  • centroid_max_scan_gap (default: 0)
    HillBased (MS1 + MS2): maximum number of consecutive empty IM scans a hill may bridge while linking. 0 = strict consecutive-scan linking (Biosaur2 default). 1 = a single empty scan at the hill's m/z is tolerated; useful on detector-centroided TIMS-PASEF where ions occasionally fail to register in one IM scan. Hill length still counts only the scans where the ion was actually observed, not the bridged gap.
    Restrictions: 0:∞
  • isotopic_prefilter (default: false)
    MS1 + DIA-MS2 isotopic-partner prefilter applied after aggregation (or after raw extraction otherwise), before the centroider dispatch. Drops peaks that lack at least one isotopic partner at m/z ± C13C12_MASSDIFF / q (q in {1..5}) within ± bruker:isotopic_prefilter_tol_ppm AND |Δscan_id| <= 1. Cleans up isolated detector-noise singletons; preserves both the monoisotopic peak and the isotopologue (mutual evidence). Pure existence check, with no intensity/averagine model. Not applied to DDA-MS2 (no per-peak IM array). Off by default.
    Restrictions: true, false
  • isotopic_prefilter_tol_ppm (default: 50.0)
    ppm tolerance for isotopic-partner matching by the prefilter. Mass-relative, so the absolute Da window scales with m/z (50 ppm ≈ 0.01 Da at m/z 200, 0.05 Da at m/z 1000). Broad by design so per-scan calibration jitter doesn't drop real partners. Only effective when bruker:isotopic_prefilter is true.
    Restrictions: 0.0:∞
  • expose_hill_bounds (default: false)
    HillBased (MS1 + DIA-MS2): attach four extra FloatDataArrays per centroided spectrum ('im lower bound', 'im upper bound', 'm/z lower bound', 'm/z upper bound') giving each centroid's source-hill bounding box. Useful for visual QC of centroiding (e.g. with tools/scripts/plot_pasef_frames.py --show-hill-bounds). Bloats centroided mzML by roughly +25%. No effect on DDA-MS2 hill (scalar drift_time schema).
    Restrictions: true, false
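Several bruker parameters quote small pieces of arithmetic: an n-neighbor setting sums 2n + 1 frames, a ppm tolerance maps to an m/z-dependent Da window, and the isotopic prefilter looks for partners at m/z ± C13C12_MASSDIFF / q. The Python sketch below reproduces those numbers. The function names are illustrative, and the mass difference of about 1.0033548 Da is the standard 13C/12C value, assumed here to match the constant the tool uses.

```python
# Assumed value (Da) of the 13C - 12C mass difference; the exact constant
# used inside OpenMS is not stated on this page.
C13C12_MASSDIFF = 1.0033548


def frames_summed(n_neighbors):
    """Frames in the sum: the center frame plus n neighbors on each side."""
    return 2 * n_neighbors + 1


def ppm_window_da(mz, ppm):
    """Absolute tolerance in Da for a ppm tolerance at a given m/z."""
    return mz * ppm * 1e-6


def isotope_partner_mzs(mz, max_charge=5):
    """Candidate partner positions (q, lower m/z, upper m/z) for q = 1..max_charge."""
    return [
        (q, mz - C13C12_MASSDIFF / q, mz + C13C12_MASSDIFF / q)
        for q in range(1, max_charge + 1)
    ]


print(frames_summed(1), frames_summed(2))
print(round(ppm_window_da(200.0, 50.0), 4), round(ppm_window_da(1000.0, 50.0), 4))
for q, lower, upper in isotope_partner_mzs(500.0):
    print(q, round(lower, 4), round(upper, 4))
```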
+++RawToMzML: Options for converting raw files to mzML (uses ThermoRawFileParser)
  • NET_executable (default: empty; tags: input file, is_executable)
    The .NET framework executable. Only required on Linux and macOS.
  • ThermoRaw_executable (default: ThermoRawFileParser.exe; tags: input file, is_executable)
    The ThermoRawFileParser executable.
    Restrictions: *.exe
  • no_peak_picking (default: false)
    Disables vendor peak picking for raw files.
    Restrictions: true, false
  • no_zlib_compression (default: false)
    Disables zlib compression for raw file conversion. Enables compatibility with some tools that do not support compressed input files, e.g. X!Tandem.
    Restrictions: true, false
  • include_noise (default: false)
    Include noise data in mzML output.
    Restrictions: true, false
  • reader (default: external)
    Reader for Thermo .raw files. 'external' uses ThermoRawFileParser (external .NET process, mzML output only); 'inprocess' uses the built-in ThermoRawFile (in-process, supports any output format; requires WITH_THERMO_RAW build).
    Restrictions: external
+++OpenSwathWorkflow: Options for loading OpenSWATH transition libraries used for chromatogram metadata
  • tr (default: empty; tags: input file)
    Transition library (PQP, TSV, or TraML) providing precursor/transition metadata. Required when converting sqMass to CHROMPARQUET (.xic); XICs without associated metadata are not meaningful.
    Restrictions: *.pqp, *.tsv, *.traml, *.osw
  • tr_type (default: empty)
    Type hint for the transition file (pqp, tsv, traml). If not provided, the type is inferred from the file extension.
  • legacy_traml_id (default: false)
    When loading PQP libraries: use legacy TraML IDs (TRAML_ID) instead of numeric IDs.
    Restrictions: true, false
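The requirement that a transition library accompany sqMass to .xic conversions can be enforced before the tool is called. The Python sketch below is an illustration only: the function name and file names are hypothetical, and the option spelling -OpenSwathWorkflow:tr is assumed from the subsection and parameter names in the table above.

```python
from pathlib import Path


def sqmass_to_xic_cmd(sqmass_path, out_path, transition_lib=None):
    """Assemble a FileConverter call for sqMass input.

    Raises ValueError when the output is a .xic file but no transition
    library is given, since the parameter description marks it as required
    for that conversion.
    """
    wants_xic = Path(out_path).suffix.lower() == ".xic"
    if wants_xic and transition_lib is None:
        raise ValueError("a transition library is required for .xic output")
    cmd = ["FileConverter", "-in", sqmass_path, "-out", out_path]
    if transition_lib is not None:
        # Assumed 'subsection:name' spelling for the tr parameter.
        cmd += ["-OpenSwathWorkflow:tr", transition_lib]
    return cmd


# Hypothetical file names.
print(" ".join(sqmass_to_xic_cmd("chroms.sqMass", "chroms.xic",
                                 transition_lib="library.pqp")))
```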