This is the repository for the work of the GA4GH Benchmarking Team, which is developing standardized definitions for genome variant calling performance metrics, as well as reference implementations of tools to calculate these performance metrics.
We also provide links to our reference benchmarking engines and their implementations, as well as to benchmarking datasets.
**Note: This is a work in progress.**
See doc/standards/ for the current benchmarking standards and definitions.
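The definitions in doc/standards/ are authoritative. As a rough illustration of the kind of metric involved, the sketch below computes precision, recall, and F1 from true-positive, false-positive, and false-negative counts. It assumes the common textbook formulation with a single true-positive count, which is a simplification. The `Counts` class and the function names are hypothetical and do not come from this repository.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical container for comparison counts (not a format defined here)."""
    tp: int  # true positives: truth variants matched by the query
    fp: int  # false positives: query variants absent from the truth set
    fn: int  # false negatives: truth variants missed by the query


def precision(c: Counts) -> float:
    """Fraction of called variants that are correct: TP / (TP + FP)."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else float("nan")


def recall(c: Counts) -> float:
    """Fraction of truth variants that were found: TP / (TP + FN)."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else float("nan")


def f1(c: Counts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else float("nan")
```

Returning NaN when a denominator is zero keeps an undefined metric distinguishable from a genuine score of zero.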
A suite of reference implementations following the standards outlined above is available in tools/. These are submodules that link to the original tool repositories.
The benchmarking process involves several steps and inputs. In doc/ref-impl/, we standardize intermediate formats for specifying truth sets, stratification regions, and intermediate outputs from comparison tools.
In resources/, we provide files useful in the benchmarking process. Currently, this includes links to benchmarking calls and datasets as well as standardized BED files describing potentially difficult regions for performance stratification.
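To illustrate how a BED file can be used for stratification, here is a minimal sketch that loads BED intervals and tests whether a variant position falls inside them. It relies on the BED convention of 0-based, half-open coordinates and on VCF positions being 1-based. The function names `parse_bed` and `in_regions` are hypothetical, and this is not how the reference tools in tools/ necessarily do it.

```python
from bisect import bisect_right
from collections import defaultdict


def parse_bed(lines):
    """Parse BED lines into {chrom: sorted, merged list of (start, end)}.

    Coordinates stay 0-based and half-open, as in the BED format.
    """
    regions = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions[chrom].append((int(start), int(end)))

    merged = {}
    for chrom, intervals in regions.items():
        intervals.sort()
        out = []
        for start, end in intervals:
            # Merge overlapping or adjacent intervals so lookups stay simple.
            if out and start <= out[-1][1]:
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged


def in_regions(regions, chrom, pos_1based):
    """Return True if a 1-based position (as in VCF) lies inside any interval."""
    pos0 = pos_1based - 1  # convert to the 0-based BED coordinate
    intervals = regions.get(chrom, [])
    # Index of the last interval whose start is <= pos0, or -1 if none.
    i = bisect_right(intervals, (pos0, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos0 < intervals[i][1]
```

A usage sketch: after `regions = parse_bed(open(path))` for some BED file at a path of your choosing, variants can be split into those inside and outside the regions with `in_regions(regions, chrom, pos)`, and metrics can then be computed separately for each group.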