🕵️ Code Safety Inspection Service
Prior to the American Society for Microbiology Conference on Rapid Applied Microbial Next-Generation Sequencing and Bioinformatic Pipelines (ASM NGS 2020), held in December 2020, we organized a collaborative three-day “Hackathon” that brought together more than twenty researchers in the field of microbial bioinformatics from five continents. The goal of our Hackathon was to explore how to employ software testing in microbial bioinformatics.
The ASM NGS 2020 Hackathon aimed to promote the uptake of testing practices and engage the community in its adoption for public health. This repository is an open-source project that gathers guidance, guidelines and examples for software testing for microbial bioinformatics researchers.
Computational algorithms have become an essential component of research, and the scientific community has made great efforts to raise standards for the development and distribution of code. Despite these efforts, sustainability and reproducibility remain major issues, since continued validation through software testing is still not a widely adopted practice.
Based on the experiences from our Hackathon, we developed a set of seven recommendations for researchers seeking to improve the quality and reproducibility of their analyses through software testing. We propose collaborative software testing as an opportunity to continuously engage software users, developers, and students to unify scientific work across domains.
In the field of microbial bioinformatics, good software engineering practices are not widely adopted (yet). Many microbial bioinformaticians start out as (micro)biologists and subsequently learn how to code. Without abundant formal training, a lot of education about good software engineering practices comes down to an exchange of information within the microbial bioinformatics community. That is also where we aim to position our repository: as a resource that could help microbial bioinformaticians get started with software testing if they have not had formal training.
As automated software testing remains underused in scientific software, our set of recommendations not only helps ensure that appropriate effort is invested in producing high-quality, robust software, but also increases engagement in its long-term sustainability.
Here we propose seven recommendations that should be followed during software development.
Manually testing the functionality of a tool is feasible in early development, but can become laborious as software matures. We recommend:
- Developers should establish software needs and testing goals during the planning and design stages to ensure an efficient testing structure;
- A minimal test set should address the validation of core components or the program as a whole (black-box testing) and gradually progress toward verification of key functions, which can accommodate code changes over time (white-box testing).
The following table provides an overview of testing methodologies and can serve as a guide to developers that aim to implement testing practices.
Name | Description | Example |
---|---|---|
**Installation testing: can the software be invoked on different setups?** | | |
Installation testing | Can the software be installed on different platforms? | Test whether Software X can be installed using apt-get, pip, conda and from source. |
Configuration testing | With which dependencies can the software be used? | Test whether Software X can be used with different versions of BLAST+. |
Implementation testing | Do different implementations work similarly enough? | Test whether Software X works the same between the standalone and webserver versions. |
Compatibility testing | Are newer versions compatible with previous input/output? | Test whether Software X can be used with older versions of the UniProtKB database. |
Static testing | Is the source code syntactically correct? | Check whether all opening braces have corresponding closing braces or whether code is indented correctly in Software X. |
**Standard functionality testing: does the software do what it should in daily use?** | | |
Use case testing | Can the software do what it is supposed to do regularly? | Test whether Software X can annotate a small plasmid. |
Workflow testing | Can the software successfully traverse each path in the analysis? | Test whether Software X works in different modes (using fast mode, using rnammer over barrnap or using rfam mode). |
Sanity testing | Can the software be invoked without errors? | Test whether Software X works correctly without flags, or when checking dependencies or displaying help info. |
**Destructive testing: what makes the software fail?** | | |
Mutation testing | How do the current tests handle harmful alterations to the software? | Test whether changing a single addition to a subtraction within Software X causes the test suite to fail. |
Load testing | At what input size does the software fail? | Test whether Software X can annotate a small plasmid (10 kbp), a medium-size genome (2 Mbp) or an unrealistically large genome for a prokaryote (1 Gbp). |
Fault injection | Does the software fail if faults are introduced and how is this handled? | Test whether Software X fails if nonsense functions are introduced in the gene calling code. |
When testing, it is important to include test files with known expected outcomes for a successful run. However, it is equally important to include files on which the tool is expected to fail. For example, some tools should recognize and report an empty input file or a wrong input format. Examples of valid and invalid file formats are available through the BioJulia project.
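As an illustration, the sketch below uses pytest to check both a successful run and an expected failure for a hypothetical command-line tool; the command name `annotate_tool`, its flags, the test files and the expected output file are all placeholders to be replaced with those of your own software.

```python
# Minimal sketch of "known good" and "expected failure" tests with pytest.
# The tool name, flags, test files and output names below are hypothetical.
import subprocess
from pathlib import Path

TEST_DATA = Path(__file__).parent / "data"  # directory holding curated test files


def test_valid_input_succeeds(tmp_path):
    """A small plasmid with a known expected annotation should run cleanly."""
    result = subprocess.run(
        ["annotate_tool", "--input", str(TEST_DATA / "small_plasmid.fasta"),
         "--outdir", str(tmp_path)],
        capture_output=True, text=True,
    )
    assert result.returncode == 0
    assert (tmp_path / "annotation.gff").exists()  # known expected output


def test_empty_input_is_reported(tmp_path):
    """An empty input file should be recognized and reported as an error."""
    empty = tmp_path / "empty.fasta"
    empty.touch()
    result = subprocess.run(
        ["annotate_tool", "--input", str(empty), "--outdir", str(tmp_path)],
        capture_output=True, text=True,
    )
    assert result.returncode != 0
    assert "empty" in result.stderr.lower()
```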
Understanding the test workflow is necessary not only to ensure continued software development but also to preserve the integrity of the project for developers and users. This can be done through the adoption of a standardized and easy-to-follow format, such as YAML.
Additionally, testing packages or frameworks offer an efficient approach to test creation and design. Frameworks such as unittest or pytest for Python improve test efficiency, aid bug detection and reduce manual intervention. Where possible, such frameworks should be integrated into test workflows.
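For instance, a function-level (white-box) test might look like the sketch below; the `gc_content` helper is a hypothetical example used only to illustrate how pytest parametrization replaces repeated manual checks.

```python
# Sketch of a white-box test with pytest; gc_content is a hypothetical helper.
import pytest


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


@pytest.mark.parametrize(
    "sequence, expected",
    [
        ("ATGC", 0.5),
        ("GGCC", 1.0),
        ("ATAT", 0.0),
    ],
)
def test_gc_content(sequence, expected):
    assert gc_content(sequence) == pytest.approx(expected)


def test_gc_content_rejects_empty_sequence():
    with pytest.raises(ZeroDivisionError):
        gc_content("")
```

Running `pytest` from the project root discovers and executes these tests automatically, so every code change can be validated without manual intervention.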
When designing tests for your software, plan to automate. Whether your tests are small or comprehensive, automatic triggering of tests will help reduce your workload.
Many platforms trigger tests automatically based on a set of user-defined conditions. Platforms such as GitHub Actions, GitLab CI, CircleCI, Travis CI or Jenkins offer straightforward automated testing of code, for example on every push, pull request or release.
The result of an automated test in the context of one computational workspace does not ensure the same result will be obtained in a different setup. Although package managers and containers have reduced variability between workspaces, it is still important to ensure your software can be installed and used across supported platforms. One way to ensure this is to test on different environments, with varying dependency versions (e.g., multiple Python versions, instead of only the most recent one).
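As a sketch, a GitHub Actions workflow along the lines below (in the same .yml format referenced throughout this repository) runs the test suite on every push and pull request across several Python versions; the file name, version list and install steps are assumptions to be adapted to your own project.

```yaml
# Hypothetical workflow, e.g. .github/workflows/tests.yml; adapt to your project.
name: tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.9", "3.10", "3.11"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install package and test dependencies
        run: pip install . pytest
      - name: Run test suite
        run: pytest
```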
For prospective users, it is good to know whether you have tested your software and, if so, which tests you have included. This can be done by displaying a badge in your GitHub README or by linking to your defined testing strategy, e.g. a GitHub Actions YAML file (see recommendation #2).
Documenting the testing goal and process enables end-users to easily check tool functionality and the level of testing.
We recommend contacting the authors of the software you have tested, directly or through issues in the code repository, to share successful outcomes or to report abnormal behavior and component failures. An external perspective can be useful to find bugs that the authors are unaware of.
Software testing can be crowd-sourced, as showcased by the ASM NGS 2020 Hackathon. Software suites such as Pangolin and chewBBACA have implemented automated testing developed during the Hackathon.
For developers, crowd-sourcing offers the benefit of fresh eyes on your software. Feedback and contributions from users can expedite the implementation of software testing practices. It also contributes to software sustainability by creating community buy-in, which ultimately helps the software maintainers keep pace with dependency changes and identify current user needs.
Software | Badge with link to CI | Version badge | Yaml |
---|---|---|---|
This repo | | | CSIS.yml |
Bactopia | | | all-bactopia-tests.yml |
chewBBACA | | | chewbbaca.yml |
Pangolin | | | pangolin.yml |
Software | Badge with link to CI | Version badge | Yaml |
---|---|---|---|
Genotyphi | | | genotyphi.yml |
Kraken | | | kraken.yml |
KrakenUniq | | | krakenuniq.yml |
Kraken2 | | | kraken2.yml |
Centrifuge | | | centrifuge.yml |
Prokka | | | prokka.yml |
Quast | | | quast.yml |
SKESA | | | skesa.yml |
Shovill | | | shovill.yml |
BUSCO | | | busco.yml |
Unicycler | | | unicycler.yml |
Trycycler | | | trycycler.yml |
CheckM | | | checkm.yml |
iVar | | | ivar.yml |
CSIS is a play on the acronym for the United States Food Safety Inspection Service. Additionally, it contains CSI (Crime Scene Investigation), which gives it a sort of detective feel.
The following participants were responsible for compiling the set of recommendations presented in this repository: Boas van der Putten, Inês Mendes, Brook Talbot, Jolinda de Korne-Elenbaas, Rafael Mamede, Pedro Vila-Cerqueira, Luis Pedro Coelho, Christopher A. Gulvik, Lee S. Katz.
The following participants contributed to automating tests for bioinformatics software and to building a community resource for identifying software that can pass unit tests, available in this repository: Áine O'Toole, Justin Payne, Mário Ramirez, Peter van Heusden, Robert A. Petit III, Verity Hill, Yvette Unoarumhi.