🕵️ Code Safety Inspection Service
Prior to the American Society for Microbiology Conference on Rapid Applied Microbial Next-Generation Sequencing and Bioinformatic Pipelines (ASM NGS 2020), held in December 2020, we organized a collaborative three-day “Hackathon” that brought together more than twenty researchers in the field of microbial bioinformatics from five continents. The goal of our Hackathon was to explore how to employ software testing in microbial bioinformatics.
The ASM NGS 2020 Hackathon aimed to promote the uptake of testing practices and engage the community in its adoption for public health. This repository is an open-source project that gathers guidance, guidelines and examples for software testing for microbial bioinformatics researchers.
Computational algorithms have become an essential component of research, and the scientific community has made great efforts to raise standards for the development and distribution of code. Despite these efforts, sustainability and reproducibility remain major issues, since continued validation through software testing is still not a widely adopted practice.
Based on the experiences from our Hackathon, we developed a set of seven recommendations for researchers seeking to improve the quality and reproducibility of their analyses through software testing. We propose collaborative software testing as an opportunity to continuously engage software users, developers, and students to unify scientific work across domains.
In the field of microbial bioinformatics, good software engineering practices are not widely adopted (yet). Many microbial bioinformaticians start out as (micro)biologists and subsequently learn how to code. Without abundant formal training, a lot of education about good software engineering practices comes down to an exchange of information within the microbial bioinformatics community. That is also where we aim to position our repository: as a resource that could help microbial bioinformaticians get started with software testing if they have not had formal training.
As automated software testing remains underused in scientific software, our set of recommendations not only helps ensure that appropriate effort is invested in producing high-quality, robust software, but also increases engagement in its long-term sustainability.
Here we propose seven recommendations that should be followed during software development.
Manually testing the functionality of a tool is feasible in early development, but can become laborious as software matures. We recommend:
- Developers should establish software needs and testing goals during the planning and design stages to ensure an efficient testing structure;
- A minimal test set should address the validation of core components or the program as a whole (black-box testing) and gradually progress toward verification of key functions, which can accommodate code changes over time (white-box testing).
The following table provides an overview of testing methodologies and can serve as a guide to developers that aim to implement testing practices.
Name | Description | Example |
---|---|---|
**Installation testing: can the software be invoked on different setups?** | | |
Installation testing | Can the software be installed on different platforms? | Test whether Software X can be installed using apt-get, pip, conda and from source. |
Configuration testing | With which dependencies can the software be used? | Test whether Software X can be used with different versions of BLAST+. |
Implementation testing | Do different implementations work similarly enough? | Test whether Software X works the same between the standalone and webserver versions. |
Compatibility testing | Are newer versions compatible with previous input/output? | Test whether Software X can be used with older versions of the UniProtKB database. |
Static testing | Is the source code syntactically correct? | Check whether all opening braces have corresponding closing braces or whether code is indented correctly in Software X. |
**Standard functionality testing: does the software do what it should in daily use?** | | |
Use case testing | Can the software do what it is supposed to do regularly? | Test whether Software X can annotate a small plasmid. |
Workflow testing | Can the software successfully traverse each path in the analysis? | Test whether Software X works in different modes (using fast mode, using rnammer over barrnap or using rfam mode). |
Sanity testing | Can the software be invoked without errors? | Test whether Software X works correctly without flags, or when checking dependencies or displaying help info. |
**Destructive testing: what makes the software fail?** | | |
Mutation testing | How do the current tests handle harmful alterations to the software? | Test whether changing a single addition to a subtraction within Software X causes the test suite to fail. |
Load testing | At what input size does the software fail? | Test whether Software X can annotate a small plasmid (10 kbp), a medium-size genome (2 Mbp) or an unrealistically large genome for a prokaryote (1 Gbp). |
Fault injection | Does the software fail if faults are introduced and how is this handled? | Test whether Software X fails if nonsense functions are introduced in the gene calling code. |
When testing, it is important to include test files with known expected outcomes for a successful run. However, it is equally important to include files on which the tool is expected to fail. For example, some tools should recognize and report an empty input file or a wrong input format. Examples of valid and invalid file formats are available through the BioJulia project.
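As an illustration, the sketch below uses pytest to check both a successful run and an expected failure for a hypothetical command-line tool; the command name `annotate_tool`, its flags, the test files and the expected output file are all placeholders to be replaced with those of your own software.

```python
# Minimal sketch of "known good" and "expected failure" tests with pytest.
# The tool name, flags, test files and output names below are hypothetical.
import subprocess
from pathlib import Path

TEST_DATA = Path(__file__).parent / "data"  # directory holding curated test files


def test_valid_input_succeeds(tmp_path):
    """A small plasmid with a known expected annotation should run cleanly."""
    result = subprocess.run(
        ["annotate_tool", "--input", str(TEST_DATA / "small_plasmid.fasta"),
         "--outdir", str(tmp_path)],
        capture_output=True, text=True,
    )
    assert result.returncode == 0
    assert (tmp_path / "annotation.gff").exists()  # known expected output


def test_empty_input_is_reported(tmp_path):
    """An empty input file should be recognized and reported as an error."""
    empty = tmp_path / "empty.fasta"
    empty.touch()
    result = subprocess.run(
        ["annotate_tool", "--input", str(empty), "--outdir", str(tmp_path)],
        capture_output=True, text=True,
    )
    assert result.returncode != 0
    assert "empty" in result.stderr.lower()
```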
Understanding the test workflow is necessary not only to ensure continued software development but also to preserve the integrity of the project for developers and users. This can be done through the adoption of a standardized and easy-to-follow format, such as YAML.
Additionally, testing packages or frameworks offer an efficient approach to test creation and design. Frameworks such as unittest or pytest for Python improve test efficiency, aid bug detection and reduce manual intervention. Where possible, such frameworks should be integrated into test workflows.
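For instance, a function-level (white-box) test might look like the sketch below; the `gc_content` helper is a hypothetical example used only to illustrate how pytest parametrization replaces repeated manual checks.

```python
# Sketch of a white-box test with pytest; gc_content is a hypothetical helper.
import pytest


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


@pytest.mark.parametrize(
    "sequence, expected",
    [
        ("ATGC", 0.5),
        ("GGCC", 1.0),
        ("ATAT", 0.0),
    ],
)
def test_gc_content(sequence, expected):
    assert gc_content(sequence) == pytest.approx(expected)


def test_gc_content_rejects_empty_sequence():
    with pytest.raises(ZeroDivisionError):
        gc_content("")
```

Running `pytest` from the project root discovers and executes these tests automatically, so every code change can be validated without manual intervention.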
When designing tests for your software, plan to automate. Whether your tests are small or comprehensive, automatic triggering of tests will help reduce your workload.
Many platforms trigger tests automatically based on a set of user-defined conditions. Platforms such as GitHub Actions, GitLab CI, CircleCI, Travis CI or Jenkins offer straightforward automated testing of code, for example on every push, pull request or release.
The result of an automated test in the context of one computational workspace does not ensure the same result will be obtained in a different setup. Although package managers and containers have reduced variability between workspaces, it is still important to ensure your software can be installed and used across supported platforms. One way to ensure this is to test on different environments, with varying dependency versions (e.g., multiple Python versions, instead of only the most recent one).
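As a sketch, a GitHub Actions workflow along the lines below (in the same .yml format referenced throughout this repository) runs the test suite on every push and pull request across several Python versions; the file name, version list and install steps are assumptions to be adapted to your own project.

```yaml
# Hypothetical workflow, e.g. .github/workflows/tests.yml; adapt to your project.
name: tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.9", "3.10", "3.11"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install package and test dependencies
        run: pip install . pytest
      - name: Run test suite
        run: pytest
```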
For prospective users, it is good to know whether you have tested your software and, if so, which tests you have included. This can be done by displaying a badge in your GitHub README or by linking to your defined testing strategy, e.g. a GitHub Actions YAML file (see recommendation #2).
Documenting the testing goal and process enables end-users to easily check tool functionality and the level of testing.
We recommend contacting the authors of the software you have tested, directly or through issues in the code repository, to share successful outcomes or to report abnormal behavior and component failures. An external perspective can be useful to find bugs that the authors are unaware of.
Software testing can be crowd-sourced, as showcased by the ASM NGS 2020 Hackathon. Software suites such as Pangolin and chewBBACA have implemented automated testing developed during the Hackathon.
For developers, crowd-sourcing offers the benefit of fresh eyes on your software. Feedback and contributions from users can expedite the implementation of software testing practices. It also contributes to software sustainability by creating community buy-in, which ultimately helps the software maintainers keep pace with dependency changes and identify current user needs.
Software | Badge with link to CI | Version badge | Yaml |
---|---|---|---|
This repo | | | CSIS.yml |
Bactopia | | | all-bactopia-tests.yml |
chewBBACA | | | chewbbaca.yml |
Pangolin | | | pangolin.yml |
Software | Badge with link to CI | Version badge | Yaml |
---|---|---|---|
Genotyphi | | | genotyphi.yml |
Kraken | | | kraken.yml |
KrakenUniq | | | krakenuniq.yml |
Kraken2 | | | kraken2.yml |
Centrifuge | | | centrifuge.yml |
Prokka | | | prokka.yml |
Quast | | | quast.yml |
SKESA | | | skesa.yml |
Shovill | | | shovill.yml |
BUSCO | | | busco.yml |
Unicycler | | | unicycler.yml |
Trycycler | | | trycycler.yml |
CheckM | | | checkm.yml |
iVar | | | ivar.yml |
CSIS is a play on the acronym for the United States Food Safety Inspection Service. Additionally, it contains CSI (Crime Scene Investigation), which gives it a sort of detective feel.
The following participants were responsible for compiling the set of recommendations presented in this repository: Boas van der Putten, Inês Mendes, Brook Talbot, Jolinda de Korne-Elenbaas, Rafael Mamede, Pedro Vila-Cerqueira, Luis Pedro Coelho, Christopher A. Gulvik, Lee S. Katz.
The following participants contributed to automating tests for bioinformatics software and to building a community resource for identifying software that can pass unit tests, available in this repository: Áine O'Toole, Justin Payne, Mário Ramirez, Peter van Heusden, Robert A. Petit III, Verity Hill, Yvette Unoarumhi.