Bioinformatics. Aggregates FastQC (quality control for next-generation sequencing, NGS) results from many different samples into a single web page, with charts and tables organized and simplified. The main goal is to speed up communication with colleagues (PIs, biologists, bioinformaticians).

Primary language: Ruby. License: MIT.

bio-gfastqc

Note: this software is under active development!

Installation

gem install bio-gfastqc

Usage

require 'bio-gfastqc'

The API doc is online. For more code examples see the test files in the source tree.

Note: there is no real API yet; one will follow.

Project home page

For information on the source tree, documentation, examples, issues, and how to contribute, see

http://github.com/helios/bioruby-gfastqc

The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.

Create a config YAML file

config.yml

Its contents are, for example:

samples:
  Sample_A: /path_to_sampleA
  Sample_B: /path_to_sampleB

Then run the script in the directory that contains config.yml, specifying for each sample the subdirectory where the FastQC results are located:

gfastqc -a R1 -b R2
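As a minimal sketch of how a config.yml like the one above can be read from Ruby, the standard yaml library is enough. The `load_samples` helper is hypothetical and is not part of bio-gfastqc; it assumes only the `samples:` layout shown in the example.

```ruby
require 'yaml'

# Hypothetical helper (not part of bio-gfastqc): parse the YAML text of a
# config file and return the mapping of sample name => sample path.
def load_samples(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  samples = config['samples']
  raise ArgumentError, 'config has no samples mapping' unless samples.is_a?(Hash)

  samples
end

example = <<~CONFIG
  samples:
    Sample_A: /path_to_sampleA
    Sample_B: /path_to_sampleB
CONFIG

# Print one line per sample defined in the config.
load_samples(example).each do |name, path|
  puts "#{name} -> #{path}"
end
```

To read a real file instead of the inline example, pass `File.read('config.yml')` to the helper.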

If the FastQC results are in a subfolder and you want to keep the sample definition independent of it, use the --step option:

gfastqc -a R1 -b R2 --step qc_pre_trimming

Then open index.html in your browser.
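To illustrate what the step option means for where results are looked up, here is a small sketch. It assumes the step is simply a directory inside the sample path; the `fastqc_results_dir` helper is hypothetical and does not come from the gem's source.

```ruby
# Hypothetical helper (not the gem's code): directory expected to hold the
# FastQC results of one sample. ASSUMPTION: a step, when given, is a
# subdirectory of the sample path.
def fastqc_results_dir(sample_path, step = nil)
  return sample_path if step.nil? || step.empty?

  File.join(sample_path, step)
end

puts fastqc_results_dir('/path_to_sampleA')
puts fastqc_results_dir('/path_to_sampleA', 'qc_pre_trimming')
```

Keeping the step out of config.yml means the same sample definitions can be reused for results produced at different stages, for example before and after trimming.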

Pipengine

Pipengine (https://github.com/fstrozzi/bioruby-pipengine) is a simple launcher for complex biological pipelines. Because we are developing it, we found it useful to reuse some of its best practices. One example is the -s/--step option, which lets you select the directory inside each sample from which to grab the FastQC results. The examples so far define only the samples and their absolute paths, but following the Pipengine directives it is necessary to define another tag:

output: /path/where_the_pipe_engine_data_are_processed_and_saved

To reuse that tag from gfastqc, pass the option

-p/--pipengine

The software will then look in the output directory for the FastQC results of each sample.
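A sketch of how the output tag could be combined with the sample names and a step follows. The directory layout used here, `<output>/<sample name>/<step>`, is an assumption made for illustration and is not confirmed by this README; `pipengine_results_dirs` is a hypothetical helper, not part of bio-gfastqc.

```ruby
require 'yaml'

# Hypothetical helper (not part of bio-gfastqc). ASSUMPTION: results live in
# <output>/<sample name>/<step>; the README does not spell out this layout.
def pipengine_results_dirs(config, step)
  output = config.fetch('output')
  config.fetch('samples').keys.each_with_object({}) do |name, dirs|
    dirs[name] = File.join(output, name, step)
  end
end

config = YAML.safe_load(<<~CONFIG)
  samples:
    Sample_A: /path_to_sampleA
    Sample_B: /path_to_sampleB
  output: /path/where_the_pipe_engine_data_are_processed_and_saved
CONFIG

# Print the directory that would be searched for each sample.
pipengine_results_dirs(config, 'qc_pre_trimming').each do |name, dir|
  puts "#{name}: #{dir}"
end
```

`fetch` is used so that a config without an output or samples tag fails immediately with a KeyError instead of producing a path built from nil.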

TODO

  • read output tag from sample config file (YAML)
  • add reference to pipengine
  • avoid requiring the user to specify -a and -b; by default, discover the zip files and order them to determine the first and second strand
  • package everything as a gem
  • provide better documentation for installing the gem on multiple systems (GNU/Linux, OSX, Windows)
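The zip discovery item above is not implemented; the following is only one possible sketch of it. It assumes that FastQC archives end in `_fastqc.zip` and that sorting the file names is enough to put the first strand before the second. `discover_read_archives` is a hypothetical name.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (a sketch of the TODO item, not existing gem code):
# find FastQC zip archives in a directory and use their sorted order to
# decide which one is the first and which the second strand.
def discover_read_archives(dir)
  zips = Dir.glob(File.join(dir, '*_fastqc.zip')).sort
  { first: zips[0], second: zips[1] }
end

# Demonstration on a temporary directory with two empty placeholder files.
Dir.mktmpdir do |dir|
  %w[sample_R2_fastqc.zip sample_R1_fastqc.zip].each do |name|
    FileUtils.touch(File.join(dir, name))
  end
  reads = discover_read_archives(dir)
  puts File.basename(reads[:first])
  puts File.basename(reads[:second])
end
```

Name-based ordering only works when the files follow a consistent naming scheme such as R1/R2; a real implementation would probably need a fallback to the explicit -a and -b options.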

Cite

If you use this software, please cite one of

Biogems.info

This Biogem is published at http://biogems.info/index.html#bio-gfastqc

Copyright

Copyright (c) 2015 Raoul Jean Pierre Bonnal. See LICENSE.txt for further details.