SeATAC is an R package for identifying genomic regions with statistically significant differences in chromatin accessibility across multiple ATAC-seq datasets. In SeATAC, each genomic region is represented as a V-plot, a dot plot showing how sequenced fragments of different sizes are distributed around one genomic region or a set of regions. For a more detailed overview of the method, please see the manuscript. The notebooks and scripts to reproduce the results in the manuscript can be found here.
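To make the V-plot idea concrete, here is a minimal sketch in plain Python of how such a matrix could be tabulated from paired-end fragments. It is an illustration only and not SeATAC's API: the function name `build_vplot`, its parameters, the bin sizes, and the half-open `(start, end)` coordinate convention are all assumptions made for this example. Each fragment contributes one count to the cell indexed by its size (row) and by the offset of its midpoint from the region center (column).

```python
def build_vplot(fragments, center, half_window=100, max_size=300,
                pos_bin=10, size_bin=10):
    """Count fragments by (fragment size, midpoint offset from center).

    Hypothetical helper for illustration; not part of SeATAC.
    fragments: iterable of (start, end) pairs, assumed half-open.
    Returns a list of rows (fragment-size bins), each a list of
    columns (position bins across the window around `center`).
    """
    n_rows = max_size // size_bin
    n_cols = (2 * half_window) // pos_bin
    vplot = [[0] * n_cols for _ in range(n_rows)]
    for start, end in fragments:
        size = end - start
        midpoint = start + size // 2
        # Shift so that the left edge of the window maps to offset 0.
        offset = midpoint - center + half_window
        if 0 <= offset < 2 * half_window and 0 < size <= max_size:
            row = (size - 1) // size_bin
            col = offset // pos_bin
            vplot[row][col] += 1
    return vplot


# Toy input: three fragments near position 1000 and one far away.
fragments = [(950, 1050), (940, 1060), (1000, 1180), (400, 500)]
vplot = build_vplot(fragments, center=1000)
total = sum(sum(row) for row in vplot)
print(total)  # the distant fragment falls outside the window
```

With real data, short fragments from nucleosome-free DNA and longer nucleosome-spanning fragments fill different rows, which is what gives the plot its characteristic shape around a bound site.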
SeATAC uses a conditional variational autoencoder (CVAE) to learn a latent representation of ATAC-seq V-plots. Building on this probabilistic representation, we developed a Bayesian method to evaluate the statistical difference between multiple V-plots.
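For orientation, the generic training objective of a CVAE (the evidence lower bound) is shown below. This is the textbook form, not necessarily the exact loss used by SeATAC. The symbols are assumptions made for this sketch: $x$ denotes a V-plot, $z$ its latent representation, $c$ the conditioning variable, $\phi$ the encoder parameters, and $\theta$ the decoder parameters.

```latex
\log p_\theta(x \mid c) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x, c)}\!\left[ \log p_\theta(x \mid z, c) \right]
\;-\; \mathrm{KL}\!\left( q_\phi(z \mid x, c) \,\|\, p_\theta(z \mid c) \right)
```

The first term rewards accurate reconstruction of the V-plot from its latent code; the second keeps the approximate posterior close to the prior. Because the encoder outputs a distribution rather than a point, two V-plots can be compared through their posteriors, which is the kind of probabilistic representation a Bayesian comparison can build on.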
On both synthetic and real ATAC-seq datasets, SeATAC performs significantly better than MACS2 and/or NucleoATAC on four separate tasks: (1) detection of differential V-plots; (2) definition of nucleosome positions; (3) detection of nucleosome changes; and (4) designation of transcription factor binding sites (TFBS) with differential chromatin accessibility.