MiraldiLab/maxATAC

Generate Data for Publication Figures


We want to generate several figures for the v1 maxATAC publication. We will use our pool of ENCODE data to show results for the models that can be benchmarked.

Figures that need to be finalized:

Figure: Training Data Overview

  • Cumulative training data figure for all available experiments that pass QC from ENCODE + GEO
  • Heatmap of samples that are derived from ENCODE for benchmarking
  • Schematic overview of maxATAC

Figure: Model Performance

  • AUPR by number of training cell types for best models
  • Performance compared to MOODS (motif scanning)
  • Performance compared to the average ChIP-seq signal
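
For reference, the AUPR comparisons above can be sketched as follows. This is a minimal, dependency-free illustration and not the maxATAC benchmarking code: the function name `average_precision` is hypothetical, labels are assumed to be 0/1 per genomic bin, and tied scores are ranked in input order rather than grouped. The same function could be applied to model predictions, motif-scanning scores, or an average ChIP-seq signal to compare them on the same labels.

```python
def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    labels: sequence of 0/1 ground-truth values (assumed: one per genomic bin).
    scores: sequence of prediction scores, same length as labels.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    # Rank bins from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(label for _, label in ranked)
    if total_pos == 0:
        raise ValueError("AUPR is undefined without positive labels")
    true_pos = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            true_pos += 1
            # Precision at this rank, accumulated once per recovered positive.
            ap += true_pos / rank
    return ap / total_pos
```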

Figure: Approach

  • Normalization results
  • Test different random regions ratios: 0, 0.25, 0.5, 0.75, 1
  • Test shuffle cell type KO with best random regions ratio
  • Test reverse complement KO with best random regions ratio
  • Test double KO with best random regions ratio
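
A rough sketch of what these tests vary, under stated assumptions: the random regions ratio is taken here to mean the fraction of each training batch drawn from random regions rather than peak-centered regions, and the reverse complement step is the standard DNA operation. The helper names `split_batch` and `reverse_complement` are hypothetical and are not taken from the maxATAC code base.

```python
def split_batch(batch_size, rand_ratio):
    """Return (n_peak_centered, n_random) examples for one batch.

    Assumption: rand_ratio is the fraction of the batch taken from random regions.
    """
    if not 0 <= rand_ratio <= 1:
        raise ValueError("rand_ratio must be between 0 and 1")
    n_random = round(batch_size * rand_ratio)
    return batch_size - n_random, n_random


# Complement table; characters outside ACGT (e.g. N) are left unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# The ratios listed above, applied to a hypothetical batch of 100 examples.
for ratio in (0, 0.25, 0.5, 0.75, 1):
    print(ratio, split_batch(100, ratio))
```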

Figure: maxATAC application to scATAC

  • Application to scATAC-seq from ArchR
  • HighLoading
  • LowLoading
  • Correlation of number of fragments to prediction AUPR
  • Correlation of number of cells to prediction AUPR
  • Correlation of number of cells to delta prediction AUPR
  • Correlation of number of fragments to delta prediction AUPR
  • Correlation of number of cells to log2 prediction AUPR
  • Correlation of number of fragments to log2 prediction AUPR
  • Correlation of median number of fragments per cell to delta prediction AUPR
  • Performance in scATAC-seq data
    • Use multiple cell types in addition to GM12878
  • Performance in scATAC-seq data compared to motif scanning
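
The correlation panels above could be computed along these lines. This is only a sketch: the numbers are placeholders for illustration and not measurements, "delta" is assumed to mean the AUPR minus a reference AUPR (for example from a bulk prediction), and the names `pearson` and `reference_aupr` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values for illustration only; not real results.
n_cells = [100, 200, 400, 800]
aupr = [0.10, 0.20, 0.30, 0.40]
reference_aupr = [0.45, 0.50, 0.48, 0.52]  # hypothetical per-sample reference

delta_aupr = [a - ref for a, ref in zip(aupr, reference_aupr)]
log2_aupr = [math.log2(a) for a in aupr]

r_raw = pearson(n_cells, aupr)
r_delta = pearson(n_cells, delta_aupr)
r_log2 = pearson(n_cells, log2_aupr)
```

The same function applies to the fragment-count and median-fragments-per-cell panels by swapping the x values.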

Figure: Comparison to chromVAR

  • Performance compared to chromVAR in HBTE cells
  • UMAP of data
  • Schematic overview of experimental design

Figure: maxATAC in-situ mutagenesis

  • Schematic overview
  • Example of altered TF binding prediction based on donor sample

Figure: maxATAC model selection

  • Epoch selection method
  • Model validation on chr2 vs chr1
  • Thresholding for peaks