/specs

BioContainers specifications

Apache License 2.0Apache-2.0

The latest information about BioContainers is available via https://BioContainers.pro

Join the chat at https://gitter.im/BioContainers/biocontainers

Containers

Repository of approved bioinformatics containers

Links:

Web Page : http://biocontainers.pro/

Project Definition : https://github.com/BioContainers/specs

Contribution Rules : https://github.com/BioContainers/specs/blob/master/CONTRIBUTING.md

Containers : https://github.com/BioContainers/containers

Email : biodockers@gmail.com

License

Apache 2

Contents

  1. Essentials
    1.1. What is BioContainers
    1.2. Objectives
  2. Containers
    2.1. What is a container?
    2.2. Why do I need to use a container
    2.3. How to use a BioContainer
    2.4. BioContainers Architecture
    2.4.1 How to Request a Container
    2.4.2 Add a container
    2.4.3 Use a Container
  3. Developing containers
    3.1. How to build BioContainers
    3.2. What do I need to develop?
    3.3. How to create a Docker based BioContainer?
  4. Support
    4.1 Get involved

1. Essentials


1.1. What is BioContainers?

The BioContainers project came from the idea of using the containers-based technologies such as Docker or rkt for bioinformatics software. Having a common and controllable environment for running software could help to deal with some of the current problems during software development and distribution. BioContainers is a community-driven project that provides the infrastructure and basic guidelines to create, manage and distribute bioinformatics containers with a special focus on omics fields such as proteomics, genomics, transcriptomics and metabolomics. The main containers already implemented in BioContainers (https://github.com/BioContainers/containers) are discussed in details including examples on how to use BioContainers. The currently available BioContainers containers facilitate the usage, and reproducibility of software and algorithms. They can be integrated into more comprehensive bioinformatics pipelines and different architectures (local desktop, Cloud environments or HPC clusters). We also present the guidelines and specifications on how to create new containers, and how to contribute to the BioContainers project.

1.2. Objectives and Goals

  • Provide a base specification and images to easily build and deploy new bioinformatics/proteomics software including the source and examples.

  • Provide a series of containers ready to be used by the bioinformatics community (https://github.com/BioContainers/containers).

  • Define a set of guidelines and specifications to build a standardized container that can be used in combination with other containers and bioinformatics tools.

  • Define a complete infrastructure to develop, deploy and test new bioinformatics containers using continuous integration suites such as Travis Continuous Integration (https://travisci. org/), Shippable (https://app.shippable.com/) or manually built solutions.

  • Provide support and help to the bioinformatics community to deploy new containers for researchers that do not have bioinformatics support.

  • Provide guidelines and help on how to create reproducible pipelines by defining, reusing and reporting specific container versions which will consistently produce the exact same result and always be available in the history of the container.

  • Coordinate and integrate developers and bioinformaticians to produce best practice of documentation and software development.

2. Containers


2.1. What is a container?

Containers are build from existing operating systems. They are different from Virtual machines because they don't possess an entire guest OS inside, instead, containers are built using optimized system libraries and use the host OS memory management and process controls. Containers normally are centralized around a specific software and you can make them executable by instantiating images from them.

What is Container

2.2. What do I need to use a container?

Most of the time when a bioinformatics analysis is performed, several bioinformatics tools and software should be installed and configured. This process can take several hours and demands a lot of efforts including the installation of multiple dependencies and tools. BioContainers provides ready to use packages and tools that can be easily deployed and used in local machines, HPC and cloud architectures.

2.3. How to use a BioContainer

BioContainers are listed in two main registries:

  • Docker Hub: Docker based containers that can be used using the docker infrastructure.
  • QUAY Hub: Docker and rkt based containers that can be used using the rkt infrastructure.

A full documentation about how to use BioContainers to perform bioinformatics analysis: please check the Full Documentation

2.4. BioContainers Architecture

BioContainers is a community-driven project that allows bioinformatics to request, build and deploy bioinformatics tools using containers. The following figure present the general BioContainers workflow:

What is Container

The next sections explain in details the presented workflow:

  • (i) How to request a workflow
  • (ii) Use a BioContainer
  • (iii) Developing containers

2.4.1 How to Request a Container

Users can request a container by opening an issue in the [containers repository] (http://github.com/BioContainers/containers/issues) (In the previous workflow this is the first step performed by user henrik). The issue should contain the name of the software, the url of the code or binary to be package and information about the software see BioContainers specification. When the container is deployed and fully functional, the issue will be closed by the developer or the contributor to BioContainers.

2.4.2 Add a container

Fork the repository and add a container. To add a container:

  • create directory name_of_software/version_of_upstream_software
  • in directory add a Dockerfile following BioContainers specification
  • optionally (but recommended) add a test-cmds.txt file to test automatically the container (list of one-line bash commands

Example test-cmds.txt

mytool -v
mytool -h

Test the container recipe where your Dockerfile is:

docker build .

If ok, commit and push your changes to your fork and ask for a pull request to our repository.

2.4.3 Use a BioContainer

When a container is deployed and the developer closes the issue in GitHub, the user (henrik) receives a notification that the container is ready. The user can then use docker or rkt to pull or fetch the corresponding container.

3. Developing containers


3.1. How to build BioContainers

There are two different ways to build a container.

  • Go to the GitHub repository with the recipe of the software you want, clone it, and build it yourself on your machine.
  • Use the docker daemon to search for a ready-to-use version of the containerized software you want.

Inside the central repository there is a list of softwares with docker recipes, there you can find more information about how to work with them.

3.2. What do I need to develop?

BioContainers are based on Linux systems, so you will need a computer with Linux installed, you also will need the docker or rkt daemon and the software you want to containerize.

3.3. How to create a Docker based Biocontainer?

Having all in hands now you need to create a Dockerfile. Dockerfiles are simple recipes to instruct the daemon on how to set an appropriate OS and how to download, manage, install and give access to the software inside.

You can check the Docker documentation for more information.

Once the container is ready you can get in touch with us so we can make the appropriate arrangements to make your container available to everyone in the community by giving an automated build system.

3.3. How to create a rkt based Biocontainer?

Having all in hands now you need to create a rkt. rkt containers are simple recipes to instruct the daemon on how to set an appropriate OS and how to download, manage, install and give access to the software inside.

You can check the rkt documentation for more information.

Once the container is ready you can get in touch with us so we can make the appropriate arrangements to make your container available to everyone in the community by giving an automated build system.

4. Support


4.1. Get involved

Whether you want to make your own software available to others as a container, to just use them on your pipelines and analysis or just give opinions, you are most welcome. This is a community-driven project, that means everyone has a voice.

Here are some general ideas:

  • Browse our list of containers
  • Propose your own ideas or software
  • Interact with other if you think there is something missing.