JuRSE

Verification & Validation

Developing code is often an iterative process. After a piece of code (e.g. a function) has been written, you might want to check whether it works as you expect. Since we are all human and all make mistakes, it is helpful to check correctness rather than assume the code will work. If the check reveals an issue, you can tackle it; if the check passes, you can move on to the next piece of code and the cycle starts again (develop code, check for correctness, resolve possible issues).

Additionally, in the scientific context we want our code to create reproducible results. If your code produces different results on each run it is difficult to build your scientific analysis on these results. This emphasises the importance of checking your code for correctness.
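One common source of run-to-run differences is unseeded random number generation. As a minimal sketch (the function name `simulate_measurements` and the seed values are hypothetical and only serve as illustration), fixing the seed of a dedicated generator makes a stochastic computation repeatable:

```python
import random


def simulate_measurements(n, seed):
    """Draw n pseudo-random 'measurements' from a generator with a fixed seed."""
    rng = random.Random(seed)  # local generator, independent of global state
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


# Two runs with the same seed yield identical results.
first_run = simulate_measurements(5, seed=42)
second_run = simulate_measurements(5, seed=42)
results_match = first_run == second_run
```

Recording the seed together with the results is one possible way to let others repeat the exact same run.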

Tools and standards for Software development
When creating software, it is recommended to document which tools and guidelines are used. This can include style guides and tools for checking this style. These tools are often called linters (tools that check code for common errors and bad style) and code formatters (tools that reformat your code to comply with style guidelines).

Style guidelines differ between languages. For Python, PEP 8 and other PEP conventions are commonly used. For C++, isocpp might be a good starting point.

Furthermore, linters are available not only for programming languages but also for many other formats, such as Markdown and YAML.

For Python, a commonly used code formatter is Black.
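To illustrate what such tools address, here is a small sketch with a hypothetical function written twice. The first version is valid Python but ignores PEP 8 spacing and naming conventions; the second is what one might end up with after a formatter has normalised the spacing and a linter warning about the name has been resolved by hand. The behaviour is assumed to stay the same, only the style changes:

```python
# Before: runs fine, but spacing and naming do not follow PEP 8.
def MeanValue( values ):
    return sum(values)/len( values )


# After: consistent spacing, snake_case name, and a docstring.
def mean_value(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


# Both versions compute the same result; style tools must not change behaviour.
same_result = MeanValue([1, 2, 3]) == mean_value([1, 2, 3])
```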

Functional test, test strategy
Functionally testing your software means checking that the software does what you expect it to do. Tests are often divided into categories depending on their scope. The smallest are called unit tests. These test only a single unit (often a single function). They are usually easy to write, and if one fails, the amount of code that could have caused the issue is limited. Therefore, it is recommended to rely mainly on unit tests.

Other scopes of tests are integration tests (testing the interaction between multiple units) and end-to-end tests (testing the whole system). These are often harder to create properly, and if they fail, the cause of the error is more difficult to find. Nevertheless, these kinds of tests can help you check whether your software as a whole works as expected.
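The difference between the scopes can be sketched with a small, hypothetical example (all function names here are made up for illustration). Two unit tests each check one function in isolation, while the integration test checks that the functions work together. The tests are written as plain functions with `assert` statements, a style that pytest can collect:

```python
def celsius_to_kelvin(celsius):
    """Convert a temperature from degrees Celsius to kelvin."""
    return celsius + 273.15


def mean(values):
    """Return the arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def mean_temperature_kelvin(celsius_values):
    """Combine both units: average Celsius readings and report kelvin."""
    return mean([celsius_to_kelvin(c) for c in celsius_values])


# Unit tests: each one exercises exactly one function.
def test_celsius_to_kelvin():
    assert celsius_to_kelvin(0.0) == 273.15


def test_mean():
    assert mean([1.0, 2.0, 3.0]) == 2.0


# Integration test: checks that the units work together.
def test_mean_temperature_kelvin():
    result = mean_temperature_kelvin([-10.0, 10.0])
    # Compare with a tolerance because floating-point sums are not exact.
    assert abs(result - 273.15) < 1e-9
```

If a unit test fails, only one small function needs to be inspected; if only the integration test fails, the problem lies in how the functions are combined.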

Documenting when, what, and how to test is part of the Class 2 requirement cited below. This means that the developers and other people involved in the development of the software (e.g. supervisors) should agree on what needs to be tested, in which way, and how often (e.g. on every commit, on every release, weekly, ...).

Carrying out tests systematically does not mean that you need to perform them automatically. Although automating the tests can be helpful in the long run, it is not required here. Carrying them out systematically also includes documenting how exactly to conduct those tests. In-depth documentation is especially important if testing is done manually.

Popular tools for testing are pytest (for Python) and GoogleTest (for C++).

Useful Tip: There is also a series of 6 seminars all about testing on the HiRSE YouTube Channel.

Coverage
Since testing the code is important and required (cf. the text from the guidelines below at Class 3), it is relevant to know which parts of the code are actually run when conducting the tests. This is where code coverage tools come into play. They analyse which lines of your code are executed as your tests run and create a so-called coverage report. This report not only gives you the fraction of lines of code that are executed (covered) when running all tests, but also shows you which lines are missed. This helps you to add more tests specifically targeted at those lines.

It is recommended to test as much of your code as possible, because every untested line of code could hide errors that you don't know about.

Tools for creating coverage reports depend on the programming language that you use. Often they are coupled with the tools for testing.
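To make the idea of "covered" and "missed" lines concrete, the following toy sketch records which lines of a function run for a given input, using Python's `sys.settrace` hook. It is only an illustration of the principle and not a replacement for a real coverage tool; the names `classify` and `executed_line_offsets` are hypothetical:

```python
import sys


def classify(value):
    if value < 0:
        return "negative"
    return "non-negative"


def executed_line_offsets(func, *args):
    """Return the line offsets (relative to the `def` line) executed in func."""
    code = func.__code__
    offsets = set()

    def tracer(frame, event, arg):
        # Only record "line" events that belong to the function under study.
        if event == "line" and frame.f_code is code:
            offsets.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return offsets


body_lines = {1, 2, 3}                        # the three lines inside classify
covered = executed_line_offsets(classify, 5)  # a "test" with a positive input
missed = body_lines - covered                 # lines this "test" never reached
coverage_fraction = len(covered) / len(body_lines)
```

Here the single positive input never reaches the line returning "negative", so that line shows up in `missed`; adding a test with a negative input would close the gap. In practice, a dedicated tool does this bookkeeping for you, for example coverage.py for Python (`coverage run -m pytest` followed by `coverage report -m`).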

Test Automation

Automating tests is not required for application class 2 (although we recommend it), but for application class 3 it is mandatory. Automating the tests is important to ensure that they run systematically (on every commit and periodically). This helps to find errors early: when a test fails, the bug must have been introduced between the last successful run and the failing one. Therefore, fewer commits (ideally only a single one) need to be checked. Furthermore, automating the tests means that they are conducted in the same way, regardless of who developed the new piece of code. Although this may also be achieved with specific documentation of how to conduct tests, it is often easier to achieve by automating them.

Popular tools for automating tests are Continuous Integration (CI) tools, such as GitLab CI, GitHub Actions, or Jenkins.
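One possible step towards automation is a single entry point that runs all checks in the same way for everyone. The sketch below is an assumption about how such a script might look, not a prescribed setup; the names `run_checks` and `CHECKS` are hypothetical, and the example check list assumes that pytest is installed:

```python
import subprocess
import sys


def run_checks(commands):
    """Run each command in order and return the ones that failed."""
    failed = []
    for command in commands:
        # A non-zero exit code is the usual signal for "this check failed".
        result = subprocess.run(command)
        if result.returncode != 0:
            failed.append(command)
    return failed


# Hypothetical check list; assumes pytest is installed in the environment.
CHECKS = [
    [sys.executable, "-m", "pytest"],
]
```

A CI job could call `run_checks(CHECKS)` and exit with a non-zero status if the returned list is not empty, so that developers and the CI service run exactly the same checks.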

What the Software Guidelines say about Verification and Validation

Class 1 and higher
The tools and standards for software development, validation, and verification must be agreed upon with the development team (and superiors) and their use must be documented. Wherever possible, software development is carried out according to recognized standards and using state-of-the-art tools during development, validation, verification, and provision.

Example: Tools like SoftWipe and publications such as "Best Practices for Scientific Computing" (Wilson et al. 2014; PLOS Biology, Vol. 12, Issue 1).

Class 2 and higher
Functional tests of the software are carried out systematically. For this purpose, an adequate test strategy is defined and documented.

Class 3
A high degree of coverage with test automation is given. Example: Continuous integration or even continuous delivery can be used for test automation.

For example, GitLab CI, GitHub Actions, etc. are services that can be used easily and that are well integrated into web interfaces.

Last Modified: 12.03.2025