DecoDen uses replicates and multi-histone ChIP-Seq experiments for a target cell type to learn and remove shared biases from fragmentation, PCR amplification and sequence mappability.
While in development, DecoDen is distributed as a Poetry project. Installing it requires git and a C compiler to be available locally.
We recommend using Conda to create a suitable environment, with a command such as `conda create -n decoden "python>=3.10"` (the quotes stop the shell from treating `>` as a redirection). After activating the environment (`conda activate decoden`), follow these steps:
- Install Poetry
- Clone the repository and install with Poetry
```shell
# Clone the repository
git clone git@github.com:ntanmayee/DecoDen.git
cd DecoDen
# Install the external dependencies and DecoDen
conda install pyarrow poetry
conda install samtools zlib
# If no C compiler is installed, also run the following command
conda install c-compiler
poetry install
```
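
Before moving on, it can help to confirm that the tools this procedure relies on are reachable from the activated environment. The following is a minimal sketch and not part of DecoDen: `check_tools` is a hypothetical helper name, and the list of commands passed to it is an assumption based on the steps above.

```shell
# Hypothetical helper (not part of DecoDen): report which commands are on PATH.
# Returns 0 if every command is found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed tool list, taken from the installation steps.
check_tools git conda poetry samtools || echo "Some tools are missing; revisit the installation steps."
```

Any command reported as missing points to the installation step that should be repeated.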
Running decoden requires two inputs:
- Aligned reads in `.bam` format from ChIP-Seq experiments
- Sample annotation file in `.csv` format
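
A truncated or corrupted BAM file may only surface as an error partway through a run, so a quick integrity check beforehand can save time. The sketch below relies on `samtools quickcheck`, which is available once samtools is installed. `check_bams` is a hypothetical helper and not part of DecoDen, and the path in the usage comment is a placeholder.

```shell
# Hypothetical helper (not part of DecoDen): verify that each BAM file exists
# and passes `samtools quickcheck`. Returns 0 only if all files pass.
check_bams() {
    rc=0
    for bam in "$@"; do
        if [ ! -f "$bam" ]; then
            echo "missing: $bam" >&2
            rc=1
        elif ! samtools quickcheck "$bam"; then
            echo "failed quickcheck: $bam" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Usage (placeholder path):
#   check_bams path/to/bams/*.bam && echo "All BAM files look intact."
```

Note that `quickcheck` only detects gross problems such as a missing end-of-file marker; it does not validate the alignments themselves.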
To generate a skeleton sample annotation file, run:

```shell
decoden create_csv
```

This creates `samples.csv` in your current directory. Edit this file and fill in the columns with the appropriate information. There are more details here.
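
To see which columns the generated file expects without opening it in an editor, the header row can be printed one column per line. This is a small sketch that assumes a plain comma-separated header with no quoted commas; `list_columns` is a hypothetical helper, not a DecoDen command.

```shell
# Hypothetical helper (not part of DecoDen): print the column names from the
# header row of a CSV file, one per line.
# Assumes a simple comma-separated header with no quoted commas.
list_columns() {
    head -n 1 "$1" | tr -d '\r' | tr ',' '\n'
}

# Inspect the skeleton file if it is present in the current directory.
if [ -f samples.csv ]; then
    list_columns samples.csv
fi
```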
Run the DecoDen pipeline with default parameters:

```shell
decoden run -i samples.csv -o output_directory -gs genome-size
```
The following commands are available in DecoDen. Follow the links to learn more about each one.
Command | Description |
---|---|
create_csv | Create a skeleton sample annotation file |
run | Run the full DecoDen pipeline to preprocess and denoise BAM/BED files |
preprocess | Pre-process BAM/BED data into the format required to run DecoDen |
denoise | Run the denoising step of DecoDen on suitably preprocessed data |
detect | Detect peaks in the processed DecoDen signals |
There is more helpful information in the wiki.
Please raise an issue if you find bugs or if you have any suggestions for improvement.
This project has received funding from the European Union's Framework Programme for Research and Innovation Horizon 2020 (2014-2020) under the Marie Skłodowska-Curie Grant Agreement No. 813533-MSCA-ITN-2018.