OpenMS
Coding Conventions

Use the following code conventions when contributing to OpenMS.

OpenMS uses coding conventions that are checked automatically with cpplint (/src/tests/coding/cpplint.py) when the ENABLE_STYLE_TESTING flag is set to 'ON' during the CMake configuration.

When developing in an IDE that supports ClangFormat, you can use our style preset from the source tree (OpenMS/.clang-format). In CLion, you can import it by selecting Preferences > Code Style > Manage. VS2017 and later also support ClangFormat natively (press Ctrl-K, Ctrl-D).

Formatting and style

The following section focuses on formatting and style.

Indentation

Use two spaces to indent. Tabulators are not allowed.

Spaces

Use spaces after built-in keywords (e.g. for, if, else) and before and after binary mathematical operators, e.g. 1 + 3, not 1+3.

Line endings

Unix line endings are used on every platform (see <OpenMS>/.gitattributes) to enable using a single source tree on a network drive or WSL with multi-OS clients.

Bracket placements

Matching pairs of opening and closing curly braces should be set to the same column. See the following example:

while (keep_going)
{
  for (int i = 0; i < 10; ++i)
  {
    ...
  }
  if (x < 7)
  {
    ...
  }
}

The main reason for this rule is to avoid constructions like:

if (isValid(a))
  return 0;

that might later be changed to something like the following, which introduces a bug:

if (isValid(a))
  has_error = false;
  return 0; // bug: will always return

Thus, use braces around a block even for a single line.
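
As a minimal sketch, the braced form of the snippet above could look as follows. The body of isValid and the function name process are hypothetical stand-ins chosen for illustration; only the brace placement is the point here.

```cpp
#include <cassert>

// Hypothetical stand-in for the check used in the snippets above.
bool isValid(int a)
{
  return a >= 0;
}

// With braces, a statement added later stays inside the conditional block.
int process(int a, bool& has_error)
{
  if (isValid(a))
  {
    has_error = false;
    return 0;
  }
  has_error = true;
  return 1;
}
```

Because both statements sit inside the braces, the early return only happens when the check succeeds.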

Single line constructs for trivial cases like:

if (test) continue;

are allowed.

Naming conventions

The following section describes the naming conventions followed by OpenMS developers.

File names

Header files and source files should have the same name as the classes they contain. Source files end in .cpp, while header files end in .h. File names should be capitalised exactly as the class they contain (see below). Each header/source file should contain one class only, although exceptions are possible for light-weight classes.

Underscores

The usage of underscores in names has two different meanings: A trailing "_" at the end indicates that something is protected or private to a class (a data member or a member function). Apart from that, different parts of a name are sometimes separated by an underscore, and sometimes separated by capital letters.

Note
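According to the C++ standard, names that begin with an underscore are reserved for the implementation in the global namespace, and names that begin with an underscore followed by a capital letter or that contain a double underscore are reserved everywhere, so you should never use them.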

Classes/Types/Namespaces

Class names and type names always start with a capital letter. Different parts of the name are separated by capital letters at the beginning of the word. No underscores are allowed in type names and class names, except for the names of protected types and classes in classes, which are suffixed by an underscore. The same conventions apply for namespaces.

Here is an example of some classes written using the conventions described above:

class Simple; //ordinary class
class SimpleThing; //ordinary class
class PDBFile; //using an abbreviation
class Buffer_; //protected or private nested class
class ForwardIteratorTraits_; //protected or private nested class

Method names

Function names (including class method names) always start with a lower case letter. Parts of the name are separated using capital letters (as in type and class names). They should be comprehensible, but as short as possible. The same argument names must be used in the declaration and in the definition. Arguments that are part of the interface (e.g. by inheritance) but are not used in the implementation of a function have to be commented out; this avoids compiler warnings about unused variables. For functions without arguments, the void keyword must be omitted from the (empty) argument list in both the declaration and the definition. If function arguments are pointers or references, the pointer or reference qualifier is appended to the variable type. The pointer or reference qualifier should not prefix the variable name.

Here is an example of some method names written using the conventions described above:

void hello(); //ordinary function, no arguments
int countPeaks(PeakArray const& p); //ordinary function
bool ignore(string& name); //ordinary function with an unused argument in the implementation (see below); keep the argument name in the declaration if the parameter is documented using Doxygen
bool ignore(string& /* name */) { return false; } //definition with the name of the unused argument commented out
bool isAdjacentTo(Peak const* const* const& p) const; //an ordinary function
bool doSomething_(int i, string& name); //protected or private member function
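
The rule about unused interface arguments can be sketched with a small inheritance example. The class names Filter and AcceptAll are hypothetical and not part of OpenMS.

```cpp
#include <cassert>
#include <string>

// Hypothetical interface; the class names are illustrative only.
class Filter
{
public:
  virtual ~Filter() = default;
  virtual bool ignore(std::string& name) = 0;
};

class AcceptAll : public Filter
{
public:
  // The argument is required by the interface but unused here,
  // so its name is commented out to avoid an unused-parameter warning.
  bool ignore(std::string& /* name */) override
  {
    return false;
  }
};
```

The declaration in the interface keeps the argument name, which is where a Doxygen description of the parameter would attach.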

Variable names

Variable names are written in lower case letters. Distinguished parts of the name are separated using underscores. If parts of the name are derived from common acronyms (e.g. MS) they should be in upper case. Private or protected member variables of classes are suffixed by an underscore.

Here is an example of some variable names written using the conventions described above:

int simple; //ordinary variable
bool is_found; //ordinary variable
string MS_instrument; //using an abbreviation
int counter_; //protected or private member
int persistent_id_; //protected or private member

Enum and preprocessor constants

Enumerated values and preprocessor constants are all upper case letters. Parts of the name are separated by underscores.

Here is an example of some enumerated values and preprocessor constants written using the conventions described above:

#define MYCLASS_SUPPORTS_MIN_MAX 0 //preprocessor constant
enum class DimensionId { DIM_MZ = 0, DIM_RT = 1 }; //enumerated values
enum class DimensionId_ { MZ = 0, RT = 1 }; //enumerated values

Avoid using the preprocessor. Normally, const and enum class will suffice for most cases. Avoid enum and prefer enum class.
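
A minimal sketch of replacing a preprocessor constant and an unscoped enum follows. The names max_charge, Dimension and toIndex are assumptions made for this example.

```cpp
#include <cassert>

// Instead of a preprocessor constant such as "#define MAX_CHARGE 5",
// a typed constant is used (hypothetical name).
const int max_charge = 5;

// Scoped enumeration: the enumerators do not leak into the enclosing scope.
enum class Dimension
{
  RT = 0,
  MZ = 1
};

// Scoped enumerators do not convert implicitly to int; the cast is explicit.
int toIndex(Dimension d)
{
  return static_cast<int>(d);
}
```

Unlike a macro, the constant has a type and obeys scope rules, and the scoped enumerators cannot be mixed up with plain integers by accident.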

Parameters

Parameters should consist of lower-case letters and underscores only. For numerical parameters, the range of reasonable values is given. Where applicable, units are given in the description. This rule applies to all kinds of parameter strings, both keys and string-values.

File extensions

The correct capitalization of all data file extensions supported by OpenMS is documented in FileHandler::NamesOfTypes[]. The convention is to use only lowercase letters for file extensions. There are three exceptions: "ML" and "XML" are written in uppercase letters and "mzData" keeps its capital "D". Remember to keep this consistent when adding new data files or writing new TOPP tools (use correct capitalization for file type restrictions, here).

Classes

The following section outlines the class requirements with examples.

Example class files

In OpenMS, every .h file must be accompanied by a .cpp file, even if it is just a "dummy". This way a global make will stumble across errors.

Here is an example of a correctly structured .h file:

// Copyright (c) 2002-present, OpenMS Inc. -- EKU Tuebingen, ETH Zurich, and FU Berlin
// SPDX-License-Identifier: BSD-3-Clause
//
// --------------------------------------------------------------------------
// $Maintainer: Heinz Erhardt $
// $Authors: Heinz Erhardt $
// --------------------------------------------------------------------------
#pragma once
#include <functional>
#include <sstream>
namespace OpenMS
{
  ... the actual code goes here ...
} // namespace OpenMS

Here is an example of a correctly structured .cpp file:

// Copyright (c) 2002-present, OpenMS Inc. -- EKU Tuebingen, ETH Zurich, and FU Berlin
// SPDX-License-Identifier: BSD-3-Clause
//
// --------------------------------------------------------------------------
// $Maintainer: Heinz Erhardt $
// $Authors: Heinz Erhardt $
// --------------------------------------------------------------------------
namespace OpenMS
{
  ... the actual code goes here ...
} // namespace OpenMS

Remember that the definition of a class or function template has to be known at its point of instantiation. Therefore, the implementation of a template is normally contained in the .h file. For template classes, declaration and definition are given in the same file. Things get more complicated when certain design patterns (e.g., the factory pattern) are used which lead to "circular dependencies". This is only a dependency of names, but it has to be resolved by separating declarations from definitions, at least for some of the member functions. In this case, a .h file can be written that contains most of the definitions as well as the declarations of the peculiar functions. Their definition is deferred to the _impl.h file ("impl" for "implementation"). The _impl.h file is included only if the peculiar member functions have to be instantiated. Otherwise the .h file should be sufficient. No .h file should include an _impl.h file.
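
The common case, where declaration and definition of a template live in the same header, can be sketched as follows. The class Range and its file name are hypothetical; the _impl.h split is not shown.

```cpp
#include <cassert>

// Contents of a hypothetical header, e.g. Range.h
template <typename T>
class Range
{
public:
  Range(T min, T max) : min_(min), max_(max) {}

  // Declared here, defined further down in the same header.
  bool contains(const T& value) const;

protected:
  T min_;
  T max_;
};

// The definition stays in the header so that it is visible
// at every point of instantiation.
template <typename T>
bool Range<T>::contains(const T& value) const
{
  return min_ <= value && value <= max_;
}
```

Any translation unit that includes the header can then instantiate Range for its own types, for example Range<int> or Range<double>.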

General rules

The following section discusses rules around the use of primitives, namespaces, accessors to members and the STL.

Primitive types

OpenMS uses its own type names for primitive types. Use only the types defined in OpenMS/include/OpenMS/CONCEPT/Types.h.

Namespaces

The main OpenMS classes are implemented in the namespace OpenMS. Auxiliary classes are implemented in OpenMS::Internal. There are some other namespaces e.g. for constants and exceptions.

Importing a whole namespace in a header file is forbidden. For example:

// sample.h
using namespace std; //< Don't do this at home!

Using-declarations for C++ standard library datatypes in header files are forbidden as well. For example:

// sample.h
using std::vector; //< Don't do this shorthand!
void sampleFunction1(vector<int>& v1); //< bad: Shorthand leads to a confusing datatype. Don't do this.
void sampleFunction2(std::vector<int>& v1); //< good: Full namespacing of datatype prevents confusion. Be explicit.

This could lead to name clashes when OpenMS is used together with other libraries. In source files (.cpp) it is however allowed.

Rule-of-0 / Rule-of-6

In general, follow the Rule-of-0 or the Rule-of-6 when implementing any of the default operations (sometimes called special member functions), i.e. default constructor, destructor, copy constructor, copy assignment operator, move constructor and move assignment operator.
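
Both rules can be sketched side by side. The classes PeakList and NamedItem are hypothetical and only illustrate the pattern; they are not OpenMS classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Rule-of-0: no special member function is declared. The compiler-generated
// ones are correct because every member manages its own resources.
class PeakList
{
public:
  void add(double mz) { mz_values_.push_back(mz); }
  std::size_t size() const { return mz_values_.size(); }

protected:
  std::vector<double> mz_values_;
};

// Rule-of-6: once one special member function is declared,
// all six are spelled out (here simply defaulted).
class NamedItem
{
public:
  NamedItem() = default;
  NamedItem(const NamedItem&) = default;
  NamedItem(NamedItem&&) = default;
  NamedItem& operator=(const NamedItem&) = default;
  NamedItem& operator=(NamedItem&&) = default;
  ~NamedItem() = default;

  void setName(const std::string& name) { name_ = name; }
  const std::string& getName() const { return name_; }

protected:
  std::string name_;
};
```

In both cases copies behave as expected; the difference is only whether the six operations are left implicit or listed explicitly.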

Accessors to members

Accessors to protected or private members of a class are implemented as a pair of get-method and set-method. This is necessary as accessors that return mutable references to a member cannot be wrapped with Python.

class Test
{
public:
  // always implement a non-mutable get-method
  UInt getMember() const
  {
    return member_;
  }
  // always implement a set-method
  void setMember(UInt name)
  {
    member_ = name;
  }
protected:
  UInt member_;
};

For members that are too large to be copied out with the get-method, modified, and written back with the set-method, an additional non-const get-method returning a reference can be implemented.

For primitive types, using a get-method which returns a reference is strictly forbidden. For more complex types it should be present only when necessary.

class Test
{
public:
  const vector<String>& getMember() const
  {
    return member_;
  }
  void setMember(const vector<String>& name)
  {
    member_ = name;
  }
  // if absolutely necessary implement a mutable get-method
  vector<String>& getMember()
  {
    return member_;
  }
protected:
  vector<String> member_;
};

Exceptions

The following section describes how to handle exceptions and create exception classes.

Exception handling

No OpenMS program should dump a core if an error occurs. Instead, it should attempt to die as gracefully as possible. Furthermore, as OpenMS is a framework rather than an application, it should give the programmer ways to catch and correct errors. The recommended procedure to handle errors, even fatal ones, is to throw an exception. An uncaught exception results in a call to abort, thereby terminating the program.

Throw exceptions

To simplify debugging, use the following throw directive for exceptions:

throw AnyException(__FILE__, __LINE__, OPENMS_PRETTY_FUNCTION);

__FILE__ and __LINE__ are standard-defined preprocessor macros. The macro OPENMS_PRETTY_FUNCTION wraps Boost's platform-independent version of the __PRETTY_FUNCTION__ macro. It behaves like a char* and, if the GNU compiler is used, contains the type signature of the function as well as its bare name. The content might differ on other platforms. Exception::Base provides methods (getFile, getLine, getFunction) that allow the cause of the exception to be located.

Catch exceptions

The standard way to catch an exception should be by reference (and not by value), as shown below:

try
{
  // some code which might throw
}
catch (Exception& e)
{
  // Handle the exception, then possibly re-throw it:
  throw; // the modified e
}
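
The effect of catching by reference and re-throwing with a bare throw can be sketched with standard library exceptions. Here std::exception and std::out_of_range stand in for the OpenMS exception classes, and the function names forward and outerMessage are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Catches by reference and re-throws with a bare `throw;`, so the original
// exception object (including its dynamic type) keeps propagating.
void forward()
{
  try
  {
    throw std::out_of_range("index 7 is out of range");
  }
  catch (std::exception& /* e */)
  {
    // The handler refers to the original object; nothing is copied or sliced.
    throw;
  }
}

// Returns the message of the exception that reaches the outer handler.
std::string outerMessage()
{
  try
  {
    forward();
  }
  catch (const std::out_of_range& e)
  {
    return e.what();
  }
  return "";
}
```

The outer handler still sees a std::out_of_range although the inner handler only named the base class, which would not be the case if the inner handler had caught by value and thrown its copy.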

Specify exceptions

Potential exceptions must be documented to tell the user which exceptions can be caught.

void myFunction()
{
  throw Foo(__FILE__, __LINE__, OPENMS_PRETTY_FUNCTION);
}
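
One way to document a potential exception is the Doxygen @exception command, as in the following sketch. The function intensityAt is hypothetical, and std::out_of_range stands in for an OpenMS exception class.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

/**
  @brief Returns the intensity stored at the given position.

  @param intensities Intensity values
  @param index Position of the requested value

  @exception std::out_of_range is thrown if @p index is not a valid position
*/
double intensityAt(const std::vector<double>& intensities, std::size_t index)
{
  if (index >= intensities.size())
  {
    throw std::out_of_range("index out of range");
  }
  return intensities[index];
}
```

A caller reading the generated documentation then knows which exception type to catch.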

Exception classes

All exceptions used in OpenMS are derived from Exception::Base defined in CONCEPT/Exception.h. A default constructor should not be implemented for these exceptions. Instead, the constructor of all derived exceptions should have the following signature:

AnyException(const char* file, int line, const char* function[, ...]);

Additional arguments are possible but should provide default values (see IndexOverflow for an example).
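
The constructor convention can be sketched without the OpenMS headers. BaseSketch is a hypothetical stand-in for Exception::Base (whose real definition is not reproduced here), IndexTooLarge and failAt are invented for this example, and __func__ is used in place of OPENMS_PRETTY_FUNCTION.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <string>

// Hypothetical stand-in for Exception::Base; the return types of the
// accessors are assumptions.
class BaseSketch : public std::exception
{
public:
  BaseSketch(const char* file, int line, const char* function) :
    file_(file), line_(line), function_(function)
  {
  }

  const char* getFile() const { return file_.c_str(); }
  int getLine() const { return line_; }
  const char* getFunction() const { return function_.c_str(); }

protected:
  std::string file_;
  int line_;
  std::string function_;
};

// Derived exception: no default constructor; the extra argument has a default value.
class IndexTooLarge : public BaseSketch
{
public:
  IndexTooLarge(const char* file, int line, const char* function, std::size_t index = 0) :
    BaseSketch(file, line, function), index_(index)
  {
  }

  std::size_t getIndex() const { return index_; }

protected:
  std::size_t index_;
};

// __func__ stands in for OPENMS_PRETTY_FUNCTION, which needs the OpenMS headers.
void failAt(std::size_t index)
{
  throw IndexTooLarge(__FILE__, __LINE__, __func__, index);
}
```

Because every exception carries file, line and function, a handler that catches the base class by reference can report where the error originated.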

How to expose classes and methods to Python

C++ classes and their methods can be exposed to Python via pyOpenMS. If you are interested in exposing your algorithms to Python, view the pyOpenMS documentation for the coding conventions and examples.

Documentation

UML diagrams

To generate UML diagrams, use yEd and export the diagrams in PNG format. Do not forget to also save the corresponding .yed file.

Doxygen

Each OpenMS class has to be documented using Doxygen. The documentation is inserted in Doxygen format in the header file where the class is defined. Documentation includes the description of the class, each method, type declaration, enum declaration, each constant, and member variable.

Longer pieces of documentation start with a @brief description, followed by an empty line and a detailed description. The empty line is needed to separate the brief from the detailed description.

Descriptions of classes always have a brief section.

Use the doxygen style of the following example for OpenMS:

class Test
{
public:
  enum class EnumType
  {
    EVAL1,
    EVAL2
  };
  Test();
  // note: use either Rule-of-0 or Rule-of-6!
  int dummy(int dummy_a, const char* dummy_s);
  int isDummy();
  void dummy2();
  void dummy3();
protected:
  int value_;
};
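
The listing above shows no comment blocks, so the following is a sketch of how such a class might look with Doxygen comments attached. It uses standard Doxygen commands (@brief, @param, @return, ///<); the class name DocumentedTest and the wording of the comments are assumptions, not the original OpenMS text.

```cpp
#include <cassert>

/**
  @brief Test class

  This class is used to demonstrate the documentation style.
*/
class DocumentedTest
{
public:
  /// Enumeration of evaluation modes (hypothetical description)
  enum class EnumType
  {
    EVAL1, ///< first mode
    EVAL2  ///< second mode
  };

  /**
    @brief Adds an offset to the stored value.

    @param offset Value that is added to the stored value

    @return The sum of the stored value and @p offset
  */
  int dummy(int offset) const
  {
    return value_ + offset;
  }

protected:
  /// Stored value
  int value_ = 1;
};
```

The empty line after @brief separates the brief from the detailed description, and the sketch follows the Rule-of-0 by declaring no special member functions.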

The defgroup command indicates that a comment block contains documentation for a group of classes, files or namespaces. This can be used to categorize classes, files or namespaces, and document those categories. You can also use groups as members of other groups, thus building a hierarchy of groups. By using the ingroup command, a comment block of a class, file or namespace will be added to the group or groups.

The groups (or modules as doxygen calls them) defined by the ingroup command should contain only the classes of special interest to the OpenMS user. Helper classes and such must be omitted.
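
A minimal sketch of the two grouping commands follows. The group name SketchFilters and the class ThresholdFilter are hypothetical and do not correspond to an existing OpenMS group.

```cpp
#include <cassert>

/**
  @defgroup SketchFilters Filters (hypothetical group)

  @brief Classes that remove unwanted values.
*/

/**
  @brief Keeps only values above a threshold.

  @ingroup SketchFilters
*/
class ThresholdFilter
{
public:
  /// Returns true if @p value is greater than the threshold.
  bool passes(double value) const
  {
    return value > threshold_;
  }

protected:
  /// Minimum value that passes the filter
  double threshold_ = 10.0;
};
```

The @defgroup block documents the group itself, while @ingroup in the class comment adds the class to that group in the generated module list.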

Documentation that does not belong to a specific .cpp or .h file can be written into a separate Doxygen file (with the ending ".doxygen"). This file will also be parsed by Doxygen.

Open tasks are noted in the documentation of a header or a group using the @todo command. The ToDo list is then shown in the doxygen menu under 'Related pages'. Each ToDo should be followed by a name in parentheses to indicate who is going to handle it.

You can also use these commands:

@todo Necessary todo for the next release. Should be done as soon as possible.
Please add the name of the responsible person in parentheses!
@improvement Possible improvement, but not really necessary.
Please add the name of the responsible person in parentheses!
@deprecated Deprecated class that must be removed in the next release.
@experimental Experimental class that will perhaps not make it to the library.
@bug Description of a bug in the class/method.
Please add the name of the finder in parentheses!

Doxygen is not hard to learn; have a look at the manual :-)

Add comments to code

The code for each .cpp file has to be commented. Each piece of code in OpenMS has to contain at least 5% of comments. The use of:

// Comment text

instead of:

/* Comment text */

is recommended to avoid problems arising from nested comments. Comments should be written in plain English and describe the functionality of the next few lines.

Examples

Instructive programming examples are provided in the doc/code_examples directory. See OpenMS Developer Guide.

Testing

View the How To Write Tests guidelines to learn how to write tests.

Revision control

OpenMS uses git to manage different versions of the source files. For easier identification of the responsible person each OpenMS file contains the $Maintainer:$ string in the preamble.

Examples of .h and .cpp files have been given above. In non-C++ files (CMake files, (La)TeX files, etc.) the C++ comments are replaced by the respective comment characters (e.g. '#' for CMake files, '%' for (La)TeX).