/hictools

Tools for processing Hi-C data.

Primary language: Python. License: MIT.

hictools

hictools is a collection of handy tools for processing Hi-C data, built on:

  • nextflow [1]
  • pairtools [2]
  • cooler [3]
  • higlass [4]
  • ...

Currently, hictools consists of three main parts:

  • A nextflow-based Hi-C data processing pipeline.
  • A Python API for handling the pipeline's outputs (.cool files).
  • A simple, user-friendly higlass server.

Requirements

Install

pip install hictools

Usage

Pipeline

  • Fill in the config file.

Fill in a config file to specify the sample files (.fastq) to process. Running the following command places a template config file named config_template.yml in the current folder:

nextflow pull zhqu1148980644/hictools && nextflow run zhqu1148980644/hictools

  • Run the pipeline with this config file.

nextflow run -params-file config_template.yml hictools -resume \
-profile standard
# To run on another executor (platform), choose the cluster profile instead:
# -profile cluster
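
When the pipeline is launched from a script rather than typed by hand, the invocation above can be assembled as an argument list. The following is a minimal sketch, not part of hictools: the helper name build_nextflow_command and its parameters are hypothetical, and it uses only the flags that appear in the command above.

```python
import shlex


def build_nextflow_command(params_file, pipeline="hictools", profile="standard", resume=True):
    """Assemble the `nextflow run` invocation from this README as an argument list.

    Hypothetical helper: only the flags shown in the README are used.
    """
    cmd = ["nextflow", "run", "-params-file", str(params_file), pipeline]
    if resume:
        cmd.append("-resume")  # reuse cached results from an earlier run
    cmd += ["-profile", profile]  # "standard" or "cluster", as in the README
    return cmd


if __name__ == "__main__":
    cmd = build_nextflow_command("config_template.yml", profile="cluster")
    # Print a copy-pasteable command line; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Keeping the command as a list avoids shell quoting problems if the config file path contains spaces.
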
  • Outputs

After the pipeline finishes, the output folders are in the same directory as the config file.

work/         # Working directory generated by nextflow.
log/          # Logging file generated by nextflow.
results/      # Main folder containing results of this pipeline.
    fastqc/ bams/ pairs/ cools/ features/ other/
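
A quick way to confirm that a run produced the layout above is to check for each results subfolder. This is a small sketch under the assumption that the folder names are exactly those listed; the function name missing_result_dirs is hypothetical and not part of hictools.

```python
from pathlib import Path

# Subfolders of results/ as listed in the README.
EXPECTED_RESULT_DIRS = ("fastqc", "bams", "pairs", "cools", "features", "other")


def missing_result_dirs(run_dir):
    """Return the expected results/ subfolders that do not exist under run_dir."""
    results = Path(run_dir) / "results"
    return [name for name in EXPECTED_RESULT_DIRS if not (results / name).is_dir()]


if __name__ == "__main__":
    missing = missing_result_dirs(".")
    print("missing result folders:", ", ".join(missing) if missing else "none")
```
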

Visualize

  • Start an API server to provide tilesets.

Files added to the folders given by --paths (e.g. .mcool, .bam, .bigwig, .bed, ...) are automatically registered and converted using clodius.

hictools hgserver serve --workers 10 --paths ./
>>
Openning api server: http://x.x.x.x:48005/api/v1
Tilesets Database: sqlite:////store/qzhong/.hictools_hgserver.db
Run 'hictools hgserver view --api_port 48005 to visualize in your web browser.

  • Start the higlass web app and open it in a browser.

hictools hgserver view --api_port 48005
>>
Go visit http://x.x.x.x:8888 in browser.
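
The port passed to `hictools hgserver view --api_port` has to match the port the API server reported at startup. As a hedged illustration, the port can be read from the startup line instead of being retyped. The helper names below are hypothetical, and the log line format is assumed to be the one shown above.

```python
import re
from urllib.parse import urlparse


def api_port_from_log(line):
    """Extract the port from the first URL in a server startup line, or None."""
    match = re.search(r"https?://\S+", line)
    if match is None:
        return None
    return urlparse(match.group(0)).port


def view_command(api_port):
    """Build the `hictools hgserver view` command for the given API port."""
    return ["hictools", "hgserver", "view", "--api_port", str(api_port)]


if __name__ == "__main__":
    # Startup line copied from the server output shown in this README.
    line = "Openning api server: http://x.x.x.x:48005/api/v1"
    port = api_port_from_log(line)
    print(" ".join(view_command(port)))  # hictools hgserver view --api_port 48005
```
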

CLI


API

See the source code for details.


Notes

nextflow is a pipeline framework based on the dataflow programming model, which can be very expressive when writing complex distributed pipelines.

Pipelines built with nextflow can run on multiple platforms, including the SGE, LSF, SLURM, PBS, and HTCondor batch schedulers, Kubernetes, and the Amazon AWS cloud platform, by changing the executor specified in ~/.nextflow/assets/zhqu1148980644/hictools/nextflow.config. The default executor is SGE (Sun Grid Engine); you may need to change it depending on the platform you use.

The pipeline procedures in hictools are similar to those of the Hi-C Processing Pipeline used by 4DN.

Reference

  • [1] nextflow: A DSL for data-driven computational pipelines.

  • [2] pairtools: CLI tools to process mapped Hi-C data.

  • [3] cooler: A cool place to store your Hi-C.

  • [4] higlass: Fast large scale matrix visualization for the web.