ENCODE TF binding site prediction

This repository contains the code for: 1. processing data, 2. training and prediction, 3. analyzing results, 4. feature importance, 5. plotting.

background: ENCODE-DREAM

see also: Yuanfang Guan's 1st Place Solution Code

Warning! Before running any of the code below, change the PATH so that it points to the data on your own disk. (Upgrade: modify the path)
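One way to avoid editing the path in many places is to resolve it once from an environment variable. The snippet below is only a sketch of that idea: the variable name `TF_EXP_DATA`, the helper `data_root`, and the placeholder default are assumptions for illustration and are not part of this repository.

```python
import os
from pathlib import Path


def data_root(default="/path/to/data"):
    """Return the data directory as a Path.

    Hypothetical helper: reads the (assumed) TF_EXP_DATA environment
    variable and falls back to a placeholder default if it is unset.
    """
    return Path(os.environ.get("TF_EXP_DATA", default))


if __name__ == "__main__":
    # Shows which directory would be used with the current environment.
    print(data_root())
```

With this pattern, `export TF_EXP_DATA=/your/data/dir` would be set once in the shell before running the steps.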

Usage Example (Anchor model):

1. process data

cd TF_exp/data_process/anchor_dnase


cd ../sequence


cd ../gencode


2. train and predict

(1) subsample negative cases

cd TF_exp/data/sample
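For readers unfamiliar with the idea, negative subsampling keeps every positive (bound) example and only a random subset of the far more numerous negative (unbound) examples. The following is a generic sketch, not the script in this directory; the function name, the `ratio` parameter, and the 0/1 label encoding are assumptions.

```python
import random


def subsample_negatives(labels, ratio=1.0, seed=0):
    """Return sorted indices of all positives plus a random subset of negatives.

    labels: sequence of 0/1 labels (1 = positive). Assumed encoding.
    ratio:  negatives kept per positive (hypothetical parameter).
    seed:   seed for reproducible sampling.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Never ask for more negatives than are available.
    n_keep = min(len(neg), int(round(ratio * len(pos))))
    kept_neg = rng.sample(neg, n_keep)
    return sorted(pos + kept_neg)


if __name__ == "__main__":
    labels = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
    # Keeps both positives and two negatives per positive.
    print(subsample_negatives(labels, ratio=2.0, seed=1))
```

Fixing the seed makes the subsample reproducible across runs, which helps when comparing models trained on the same sample.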


(2) prepare training & validation data

cd ../train_test/anchor


(3) train models

cd ../../../model/anchor


(4) predict

cd ../../prediction/anchor/


3. analyze

(1) prepare data for evaluation

cd TF_exp/evaluation/target # gold standard

bash bash.sh

cd ../anchor/ # prediction


(2) calculate AUROCs and AUPRCs

cd ../../analysis/
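As a reference for what the two metrics measure, here is a minimal, dependency-free sketch. It is not the analysis script in this directory: the function names are assumptions, AUROC is computed from average ranks (the Mann-Whitney formulation), and AUPRC is approximated by step-wise average precision, which may differ slightly from an interpolated area.

```python
def auroc(y_true, y_score):
    """Area under the ROC curve via the rank-sum formulation (ties averaged)."""
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    n_pos = sum(y for _, y in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    rank_sum = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied scores occupy ranks i+1..j; each gets the average rank.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum += avg_rank * sum(y for _, y in pairs[i:j])
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def auprc(y_true, y_score):
    """Step-wise average precision: sum of precision times recall increment."""
    pairs = sorted(zip(y_score, y_true), key=lambda p: -p[0])
    n = len(pairs)
    n_pos = sum(y for _, y in pairs)
    if n_pos == 0:
        raise ValueError("at least one positive is required")
    tp = 0
    ap = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        # All examples sharing a score cross the threshold together.
        new_tp = sum(y for _, y in pairs[i:j])
        tp += new_tp
        ap += (new_tp / n_pos) * (tp / j)
        i = j
    return ap


if __name__ == "__main__":
    y = [0, 0, 1, 1]
    s = [0.1, 0.4, 0.35, 0.8]
    print(auroc(y, s), auprc(y, s))
```

Because bound sites are rare relative to unbound ones, AUPRC is usually the more informative of the two numbers for this kind of task.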


4. feature importance

cd TF_exp/feature_importance


5. plot

cd TF_exp/analysis/

Rscript plot_fig1.r