This repo hosts the scripts used for tracing the evolutionary history of ruminant transcription factor binding sites (TFBS) using the birth-death probabilistic model developed by Yokoyama, Zhang, et al. (2014, PLoS Computational Biology).
The pipeline consists of three steps: data preprocessing, predicting the branch-of-origin of TFBS in enhancer regions, and generating the final report. The main script of the pipeline is 'whole_pipeline.py'.
- data preprocessing (run_prep.sh)
  - calculate the nucleotide background frequency inside the enhancer regions
  - use TFM-Pvalue to get the log-likelihood cutoff for motif scanning
- predict the branch-of-origin
  - scan TFBS in the multiple sequence alignment (e.g., run_evo/run_evo_scan_cattle.sh)
  - predict the branch-of-origin of TFBS (e.g., run_evo/run_evo_predict_cattle.sh)
- generate the final report
  - summarize the TFBS branch-of-origin in enhancers (e.g., run_evo/run_evo_summary_merge.sh)
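
To make the preprocessing and scanning steps above more concrete, here is a minimal Python sketch of the general idea: estimate a nucleotide background from enhancer sequences, turn a motif into log-likelihood ratios against that background, and report windows that score at or above a cutoff. This is not the code used by the pipeline. The function names (`background_frequencies`, `log_likelihood_matrix`, `scan`), the pseudocount, the toy motif, and the cutoff value are all assumptions made for illustration. In the pipeline the cutoff comes from TFM-Pvalue, and scanning runs on a multiple sequence alignment rather than on a single ungapped sequence.

```python
import math
from collections import Counter

BASES = "ACGT"

def background_frequencies(sequences, pseudocount=1.0):
    """Estimate A/C/G/T frequencies from an iterable of DNA strings.

    Characters other than A, C, G, T (gaps, N, ...) are ignored.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(base for base in seq.upper() if base in BASES)
    total = sum(counts[b] for b in BASES) + 4 * pseudocount
    return {b: (counts[b] + pseudocount) / total for b in BASES}

def log_likelihood_matrix(pfm, background):
    """Convert a motif into per-position log-likelihood ratios.

    `pfm` is a list of dicts, one per motif position, mapping each base
    to a probability. Probabilities must be nonzero (hypothetical input
    format, chosen only for this sketch).
    """
    return [{b: math.log(col[b] / background[b]) for b in BASES} for col in pfm]

def scan(sequence, llm, cutoff):
    """Return (start, score) for every window scoring at or above `cutoff`."""
    seq = sequence.upper()
    width = len(llm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(b not in BASES for b in window):
            continue  # simplification: skip windows with gaps or ambiguous bases
        score = sum(llm[i][b] for i, b in enumerate(window))
        if score >= cutoff:
            hits.append((start, score))
    return hits

# Toy example: a 3-bp motif with consensus ACG and an arbitrary cutoff.
enhancers = ["ACGTACGT", "TTGACGTA"]
background = background_frequencies(enhancers)
toy_motif = [
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
]
llm = log_likelihood_matrix(toy_motif, background)
for start, score in scan("TTACGTT", llm, cutoff=3.0):
    print(start, round(score, 3))
```

The sketch scores only the forward strand and treats each sequence independently. A real scan would usually also score the reverse complement and map hit coordinates back to alignment columns.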
This pipeline was developed by the Ma group at Carnegie Mellon University. Some of the scripts come from ANTICE, software being developed for predicting the evolution of lineage-specific TFBS. This pipeline and ANTICE are implemented by Yang Zhang.
This software is released under the MIT license.
yangz6 at cs.cmu.edu