TCGA benchmarking Docker declarations

This repository contains the OpenEBench TCGA benchmarking Docker declarations, which define the architecture of the benchmarking workflows to be implemented in OpenEBench.

NOTE for developers: to keep the workflow containers reproducible and stable in the long term, pin a specific version of the container base image (e.g. ubuntu:16.04, NOT ubuntu:latest).

Structure

Our benchmarking workflow is composed of three Docker images / steps:

Validation: the input file format is checked and, if required, the content of the file is validated. The validation generates a participant dataset. To create datasets with a structure compatible with the Elixir Benchmarking Data Model, please use the following Python module and JSON schema.
Metrics_computation: the predictions are compared with the 'Gold Standards' provided by the community, which, in this case, results in two performance metrics: precision (Positive Predictive Value) and recall (True Positive Rate). These metrics are written into assessment datasets. To create datasets with a structure compatible with the Elixir Benchmarking Data Model, please use the following Python module and JSON schema.
Consolidation: the benchmark itself is performed by merging the assessment metrics with the rest of the TCGA data. The results are provided in SVG format (a scatter plot) and in JSON format (aggregation/summary datasets), which are also compatible with the Elixir Benchmarking Data Model.
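
As a rough illustration of the two metrics named in the Metrics_computation step, the sketch below computes precision and recall from a set of predicted genes and a gold standard set. It is not the code shipped in the container: the function name `precision_recall`, the gene lists, and the choice to return 0.0 for an empty set are assumptions made for this example.

```python
def precision_recall(predicted, gold_standard):
    """Return (precision, recall) for predicted genes against a gold standard.

    Hypothetical helper for illustration only; the real container may
    handle duplicates, gene name normalization and empty inputs differently.
    """
    predicted = set(predicted)
    gold = set(gold_standard)

    # Genes that were predicted and are also in the gold standard.
    true_positives = len(predicted & gold)

    # Precision (Positive Predictive Value): TP / all predicted genes.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall (True Positive Rate): TP / all gold standard genes.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Illustrative inputs, not taken from the repository's data files.
predicted_genes = ["TP53", "KRAS", "BRAF", "GENE_X"]
gold_genes = ["TP53", "KRAS", "PIK3CA"]

precision, recall = precision_recall(predicted_genes, gold_genes)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Here two of the four predicted genes are in the gold standard, and two of the three gold standard genes were recovered, so precision is 2/4 and recall is 2/3.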

Find more information about the TCGA Cancer Drivers Pipeline here.

TCGA sample files

example_input.txt is a gene predictions file which can be used as input to test the containers.
The TCGA_full_data folder contains all the reference data required by the containers. It is derived from the manuscript: Comprehensive Characterization of Cancer Driver Genes and Mutations, Bailey et al., 2018, Cell.
The sample_results folder contains an example output for each container run, separated into subfolders, with two cancer types / challenges selected (ACC, BLCA). Results found in sample_results/consolidation_results can be visualized in the browser using this JavaScript library.

Usage

In order to build the Docker images locally, please run ./build.sh 1.0.3
