Pipelines
Documentation is in the /doc subfolder and is managed with Sphinx.
This repository contains pipelines for processing NGS data, along with the scripts they use (in the /pipelines/tools subdirectory). It also includes several accompanying scripts that use the same infrastructure for other project-related processing.
Pipelines here are configured to work with looper and use pypiper (see the corresponding repositories).
Installing
- Install looper and pypiper:
pip install https://github.com/epigen/looper/zipball/master
pip install https://github.com/epigen/pypiper/zipball/master
- Clone this repository:
git clone git@github.com:epigen/open_pipelines.git
- Create a config file (it mostly just lists paths).
- Go!
If you are only using a pipeline in a project and are not developing it, treat this cloned repo as read-only, frozen code that resides in a shared project workspace. Keep a single clone per project so that data are not processed under changing pipeline versions; do not pull pipeline updates unless you plan to re-run everything.
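One way to check that a project clone has stayed frozen is to record which commit it is on and whether it has local modifications. The following is a minimal sketch using standard git commands; the function name pipeline_version and the example path are hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of this repository): print the commit hash
# of a pipeline clone, with a "-dirty" suffix if the working tree has
# uncommitted or untracked changes.
pipeline_version() {
    clone_dir="$1"
    # Resolve the commit currently checked out in the clone.
    commit="$(git -C "$clone_dir" rev-parse HEAD)" || return 1
    # "git status --porcelain" prints nothing when the tree is clean.
    if [ -n "$(git -C "$clone_dir" status --porcelain)" ]; then
        echo "${commit}-dirty"
    else
        echo "${commit}"
    fi
}
```

For example, running `pipeline_version /path/to/shared/open_pipelines` (path assumed) before each batch of jobs and saving the output alongside the results makes it possible to confirm later that all samples were processed with the same pipeline code.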
Running pipelines
We use Looper to run pipelines. This requires only a YAML config file, passed as an argument, which contains all the required settings.
Looper can, for example, submit each job to SLURM or SGE, or run the jobs locally.
looper run metadata/config.yaml
Running on test data
Small example data for several pipeline types is available in the microtest repository.
Post-pipeline processing
Once a pipeline has been run (or is running), you can do some post-processing on the results.
Looper has a command to do this: looper summarize, which collects statistics produced by the pipelines for all submitted samples.
Developing pipelines
If you plan to create a new pipeline or develop an existing one, consider cloning this repo to your personal space and doing the development there. Push changes from there. Use this personal clone to run any tests, but make sure each project is run from a different (frozen) clone to ensure uniform results.