Script to reproduce the specificity score analysis in Javierre, Burren, Wilder, Kreuzhuber, Hill et al. Lineage-specific genome architecture links enhancers and non-coding disease variants to target gene promoters. Cell 167, 1369-1384 (2016). - forked from Steven-M-Hill
The specificity score analysis described in the paper quantifies the cell type-specificity of each gene’s interactions with active enhancers, or each gene's expression, through calculation of gene specificity scores (see the paper for full details).
The script specificityScoreAnalysis.R contains:
- a function, specificityScore, that calculates cell type-specificity scores given a vector of values (one value for each cell type);
- code to calculate gene specificity scores (based on Promoter Capture Hi-C data and expression data);
- code to reproduce Figures 4B,C,D,E and S4B,C.
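To make the "vector in, one score per cell type out" idea concrete, here is a minimal sketch. It is a deliberately simplified stand-in and not the definition used in the paper or in specificityScore: the hypothetical function simpleSpecificity scores each cell type as its value minus the mean of the values in all other cell types. The cell type names and values are invented for illustration.

```r
# Hypothetical stand-in, NOT the paper's specificity score:
# score for a cell type = its value minus the mean of all other values.
simpleSpecificity <- function(x) {
  stopifnot(is.numeric(x), length(x) >= 2)
  n <- length(x)
  # Mean of the other entries, computed for every position at once
  meanOfOthers <- (sum(x) - x) / (n - 1)
  x - meanOfOthers
}

# Invented values for one gene across three illustrative cell types
vals <- c(cellA = 8, cellB = 1, cellC = 0)
scores <- simpleSpecificity(vals)
print(scores)
```

A high positive score marks a cell type whose value stands out from the rest; for the actual scoring scheme, refer to the paper and to the specificityScore function in the script.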
The script requires the following R packages:
- gplots
- ggplot2
- RColorBrewer
- gridExtra
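Before running the script, it may help to check that these packages are installed. The following is a small sketch using base R only; the variable names are illustrative and not taken from the script.

```r
# Packages named in this README
pkgs <- c("gplots", "ggplot2", "RColorBrewer", "gridExtra")

# requireNamespace() returns FALSE (rather than erroring) if a package is absent
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
notInstalled <- pkgs[!installed]

if (length(notInstalled) > 0) {
  message("Not installed: ", paste(notInstalled, collapse = ", "),
          ". Install them with install.packages().")
} else {
  message("All required packages are available.")
}
```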
The script reads in the following data files:
PCHiC_peak_matrix_cutoff5.txt
Available to download from the Open Science Framework repository associated with the paper.
Direct link to file: https://osf.io/63hh4/.
This file contains the Promoter Capture Hi-C peak matrix consisting of CHiCAGO scores for all interactions that pass a cutoff of >=5 in at least one cell type. For further details of the contents and formatting of this file, see https://osf.io/cn4k8/.
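The "cutoff of >=5 in at least one cell type" rule can be illustrated on a toy table. This is only a sketch: the column names (baitID, oeID, cellA, cellB) and the scores are invented here, and the real file layout is documented at the link above.

```r
# Toy stand-in for the peak matrix; column names and values are illustrative only
peaks <- data.frame(
  baitID = c(1, 1, 2),
  oeID   = c(10, 11, 12),
  cellA  = c(6.2, 1.0, 5.0),
  cellB  = c(0.0, 7.5, 4.9)
)
scoreCols <- c("cellA", "cellB")

# Logical matrix: TRUE where an interaction reaches the >=5 cutoff in a cell type
significant <- peaks[, scoreCols] >= 5

# Each row of a cutoff-5 peak matrix should pass in at least one cell type
passesSomewhere <- rowSums(significant) >= 1

# Number of interactions passing the cutoff in each cell type
perCellType <- colSums(significant)
print(perCellType)
```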
GeneExpressionMatrix.txt
Available to download from the Open Science Framework repository associated with the paper.
Direct link to file: https://osf.io/wpjy8/.
This file contains a matrix of gene expression data used in the study, containing expression quantifications generated with MMSEQ (Turro et al., Genome Biol 2011).
PIRactivity.Rds
Available to download from this GitHub repository.
This R data file contains a data frame providing the activity statuses for the promoter-interacting region (PIR) of each interaction, in each cell type. These activities are defined on the basis of chromHMM segmentations of BLUEPRINT histone modification ChIP data.
baitAnnotations.Rds
Available to download from this GitHub repository.
This R data file contains a data frame providing Ensembl Regulatory Build features mapping to each baited promoter fragment.
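As a rough sketch of how the four files might be loaded, the snippet below assumes that they have all been placed in the current working directory and that the two .txt files are tab-delimited with a header row. Both points are assumptions, as are the variable names; the script itself may read the files differently.

```r
# Assumption: all four files sit in the current working directory
dataFiles <- c(
  peaks = "PCHiC_peak_matrix_cutoff5.txt",
  expr  = "GeneExpressionMatrix.txt",
  pir   = "PIRactivity.Rds",
  bait  = "baitAnnotations.Rds"
)
present <- file.exists(dataFiles)

if (all(present)) {
  # Assumption: the .txt files are tab-delimited with a header row
  peakMatrix      <- read.delim(dataFiles[["peaks"]], stringsAsFactors = FALSE)
  exprMatrix      <- read.delim(dataFiles[["expr"]], stringsAsFactors = FALSE)
  # .Rds files each hold a single R object, restored with readRDS()
  pirActivity     <- readRDS(dataFiles[["pir"]])
  baitAnnotations <- readRDS(dataFiles[["bait"]])
} else {
  message("Missing data files: ", paste(dataFiles[!present], collapse = ", "))
}
```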