ReMap is a project whose goal is to provide a catalogue of high-quality regulatory regions, resulting from a large-scale integrative analysis of hundreds of transcription factors and general components of the transcriptional machinery from DNA-binding experiments. This repository contains the complete workflow needed to run a ReMap-style analysis, from sample annotation files to the final BED catalogue.
- Python 3
- Snakemake >= 5.5.1
- Recommended: Conda, Docker or Singularity
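As a quick check of the requirements above, the following minimal sketch compares the installed versions against the stated minimums. It assumes `snakemake` is on the PATH and that `snakemake --version` prints a dotted version string; the helper names (`parse_version`, `snakemake_ok`, `check_requirements`) are illustrative and not part of this repository.

```python
import shutil
import subprocess
import sys

MIN_SNAKEMAKE = (5, 5, 1)  # minimum version stated in the requirements list


def parse_version(text):
    """Turn a dotted version string such as '5.5.1' into a tuple of ints."""
    parts = []
    for piece in text.strip().split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component
        parts.append(int(digits))
    return tuple(parts)


def snakemake_ok(version_text, minimum=MIN_SNAKEMAKE):
    """Compare numerically, so that 5.10.0 counts as newer than 5.5.1."""
    return parse_version(version_text) >= minimum


def check_requirements():
    """Return a list of human-readable problems (empty if all is fine)."""
    problems = []
    if sys.version_info[0] < 3:
        problems.append("Python 3 is required")
    exe = shutil.which("snakemake")
    if exe is None:
        problems.append("snakemake was not found on PATH")
    else:
        result = subprocess.run([exe, "--version"], capture_output=True, text=True)
        if not snakemake_ok(result.stdout):
            problems.append("snakemake >= 5.5.1 is required, found: " + result.stdout.strip())
    return problems
```

Calling `check_requirements()` returns an empty list when both requirements are met. Versions are compared as integer tuples rather than strings because a plain string comparison would rank "5.10.0" below "5.5.1".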
- Clone this git repository
- That's all
Please follow the wiki page for step-by-step information.
- Prepare the metadata from your annotation file by extracting the download information (see README.md in 1.metadata/)
- Create a cluster config and a Snakemake config to match your setup (see the example in 2.scripts/cluster_configuration)
- Get the necessary files, such as the reference genome (see README.md in 3.genome/)
- Create a bash launch script (see the example in the root directory)
- Run the launch script
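Before running the launch script, it can help to confirm that the inputs from the steps above are in place. The sketch below is only an illustration: it checks for the `1.metadata/`, `2.scripts/` and `3.genome/` folders named in this README, while the function name `missing_inputs` and any extra file names passed to it (for example your own config or launch script) are hypothetical placeholders.

```python
from pathlib import Path

# Input folders named in this README; they must exist before a run.
INPUT_FOLDERS = ("1.metadata", "2.scripts", "3.genome")


def missing_inputs(root, extra_files=()):
    """List expected folders and files that are absent under `root`.

    `extra_files` holds paths relative to `root`, e.g. a config file or a
    launch script; these names are user-specific assumptions.
    """
    root = Path(root)
    missing = [name for name in INPUT_FOLDERS if not (root / name).is_dir()]
    missing += [str(f) for f in extra_files if not (root / f).is_file()]
    return missing
```

For example, `missing_inputs(".", extra_files=["launch.sh"])` returns an empty list when everything is present, where `launch.sh` stands in for whatever you named your launch script.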
The workflow (see wiki) creates the necessary folders for most of the main processing steps.
Folders in bold are created by the workflow.
- 1.metadata/
- 2.scripts/
- 3.genome/
- 4.preprocessing/
  - log/
  - raw_bam/
  - raw_fastq/
  - rm_mismatch_bam/
  - sam/
  - sort_bam/
  - trim_fastq/
- 5.bam/
- 6.peakcalling/
- 8.quality/
- 9.bed/
remap-pipeline can be run with Conda or Docker/Singularity, on Torque or Slurm clusters.
Examples are in the launch scripts.
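To illustrate how these combinations might translate into a Snakemake invocation, here is a hedged sketch that assembles a command line without running it. The flags used (`--use-conda`, `--use-singularity`, `--cluster`, `--cluster-config`, `--configfile`, `--jobs`) are options available in Snakemake 5.x; the file names, the job count and the bare `sbatch`/`qsub` submit commands are assumptions, and the sketch covers only the Conda and Singularity cases. The launch scripts in this repository remain the reference.

```python
import shlex

# Snakemake flag for each software environment covered by this sketch.
ENV_FLAGS = {"conda": "--use-conda", "singularity": "--use-singularity"}
# Usual job submission command for each scheduler (assumed, no extra options).
SUBMIT_COMMANDS = {"slurm": "sbatch", "torque": "qsub"}


def build_command(env, scheduler, configfile, cluster_config, jobs=10):
    """Assemble a snakemake command as a list of arguments (not executed)."""
    if env not in ENV_FLAGS:
        raise ValueError("unknown environment: " + env)
    if scheduler not in SUBMIT_COMMANDS:
        raise ValueError("unknown scheduler: " + scheduler)
    return [
        "snakemake",
        "--configfile", configfile,
        "--cluster-config", cluster_config,
        "--cluster", SUBMIT_COMMANDS[scheduler],
        "--jobs", str(jobs),
        ENV_FLAGS[env],
    ]


# Hypothetical file names; replace them with your own configuration files.
cmd = build_command("conda", "slurm", "config.yaml", "cluster.json")
print(" ".join(shlex.quote(part) for part in cmd))
```

Building the command as a list keeps each argument separate, so it can be passed directly to `subprocess.run` or printed for inspection before anything is submitted to the cluster.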
- Jeanne Cheneby - jCHENEBY
- Fayrouz Hammal - fayrouzhammal
- Lionel Spinelli - lionel-spinelli
- Benoit Ballester - benoitballester
Under GNU GPLv3 licence.
ReMap 2020: a database of regulatory regions from an integrative analysis of Human and Arabidopsis DNA-binding sequencing experiments. Jeanne Chèneby, Zacharie Ménétrier, Martin Mestdagh, Thomas Rosnet, Allyssa Douida, Wassim Rhalloussi, Aurélie Bergon, Fabrice Lopez, Benoit Ballester. Nucleic Acids Research, 29 October 2019. https://doi.org/10.1093/nar/gkz945