This is a fork of the NucleoATAC package authored by Alicia Schep in the Greenleaf Lab, a package for calling nucleosome occupancy from ATAC-seq data. We modify the package to take fragments files as inputs instead of BAM files. This fork is not being officially maintained. Please refer to the original package for the last stable version.
- Create a conda environment
# loading the module first seems to be necessary for properly loading cython
module load python/2.7.13
conda create -n "nucleoatac" python=2.7.13
conda activate nucleoatac
- Install the package
cd NucleoATAC
pip install --editable .
NOTE: to edit the package and install the new version, update the version number in setup.py for documentation, then re-run the above pip command.
Test the install with pyatac --version.
NOTE: nucleoatac run doesn't work with the modifications for fragment sizes yet. Run commands individually as per the docs.
- Goal: allow nucleosome calling to be performed with tabix-indexed compressed fragments files in BED format
- The pyatac.fragmentsizes module has been updated so that the FragmentSizes class can now accept a fragments file instead of a BAM file for calculation of the fragment size distribution. This makes use of new functions in the nucleoatac.fragments_handling module.
- The following commands can now accept a fragments file instead of a BAM file:
  - pyatac sizes
  - nucleoatac occ
  - nucleoatac nuc
- The pyatac.chunkmat2d.FragmentMat2D module has been updated so that the FragmentMat2D class can now accept a fragments file instead of a BAM file for calculation of the fragment size distribution. It fetches the chunk region from the fragments file using pysam's tabix interface and then populates the 2D matrix.
- Chunk handling has been added for the fragments file as well.
- Where commands previously required BAM files for calculation of fragment sizes, the fragment sizes output from previous commands is now required instead.
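
To illustrate what reading a fragments file involves, here is a minimal stdlib-only sketch. It is not the code in this fork: the function names (parse_fragment, fragment_size_counts, fragment_matrix), the default size bounds, and the choice to place each fragment at its center are assumptions made for the example. It assumes tab-separated BED-style lines with 0-based, half-open coordinates, and ignores any columns after the third.

```python
from collections import Counter


def parse_fragment(line):
    """Return (chrom, start, end) from one BED-style fragments line.

    Assumes tab-separated fields; columns after the third are ignored.
    """
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[1]), int(fields[2])


def fragment_size_counts(lines, lower=0, upper=250):
    """Count fragments by length, keeping sizes in [lower, upper)."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        _, start, end = parse_fragment(line)
        size = end - start
        if lower <= size < upper:
            counts[size] += 1
    return counts


def fragment_matrix(lines, chrom, start, end, lower=0, upper=250):
    """Build a size-by-position count matrix for one chunk.

    Row index is fragment size minus `lower`; column index is the
    fragment center minus the chunk start. Only fragments on `chrom`
    whose center falls inside [start, end) are counted.
    """
    mat = [[0] * (end - start) for _ in range(upper - lower)]
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f_chrom, f_start, f_end = parse_fragment(line)
        size = f_end - f_start
        center = f_start + (size - 1) // 2
        if f_chrom == chrom and start <= center < end and lower <= size < upper:
            mat[size - lower][center - start] += 1
    return mat


# Made-up example records (chrom, start, end, barcode, count).
example = [
    "chr1\t100\t250\tAAAC\t1\n",
    "chr1\t120\t170\tAAAG\t1\n",
    "chr1\t300\t450\tAAAT\t2\n",
    "chr2\t10\t60\tAACC\t1\n",
]
sizes = fragment_size_counts(example)
mat = fragment_matrix(example, "chr1", 100, 200)
```

Both functions take any iterable of text lines, so the same logic could be fed from a gzip-opened file for the whole dataset, or from the records returned by pysam's TabixFile.fetch for a single chunk region.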
- Goal: calculate the fragment size distribution on the whole dataset, but restrict the occupancy calculation to a subset of chromosomes to allow users to debug or further parallelize the analysis
- The nucleoatac occ command now accepts a --chroms_keep argument that allows users to specify a list of chromosomes to restrict the occupancy calculation to; a chunk list will only be generated for these chromosomes.
- This can be used to manually parallelize the runs, or to do a quick run for debugging.
- nucleoatac occ and pyatac sizes changed to produce .pdf instead of .eps files
- More verbose messaging added in nucleoatac occ and nucleoatac nuc
For reference, running only on chr1 (with 7,267,633 fragments) on 4 cores and 64G of memory took about 14 minutes.
Command run: /path/to/.local/bin/nucleoatac occ --fragments ../data/Eye_c11__sorted.tsv.gz --bed ../data/Eye_c11__peaks_overlap_filtered.narrowPeak --fasta /path/to/GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta --out ../out/04-Eye_c11__chr1 --cores 4 --chroms_keep chr1
nucleoatac version 0.3.4
start run at: 2024-08-27 14:48
---------Computing Occupancy and Nucleosomal Insert Distribution----------------
@ NOTE: restricting analysis to chromosomes: chr1
@ calculating fragment sizes...
@ fitting fragment size distribution...
@ calculating occupancy...
@ compressing and indexing output files...
@ making figure
@ done.
end run at: 2024-08-27 15:02
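
For manual parallelization, one option is to generate one nucleoatac occ command per chromosome, reusing the flags from the command shown in the log above. The sketch below only builds the argument lists and does not run anything. The helper name occ_commands, the file names, and the output naming scheme are hypothetical, and it passes a single chromosome to --chroms_keep per command, as in the chr1 run.

```python
def occ_commands(fragments, bed, fasta, out_prefix, chroms, cores=4):
    """Build one `nucleoatac occ` argument list per chromosome.

    Only flags that appear in the logged command are used. The output
    basename is `out_prefix` plus the chromosome name, which is an
    illustrative convention rather than anything the tool requires.
    """
    commands = []
    for chrom in chroms:
        commands.append([
            "nucleoatac", "occ",
            "--fragments", fragments,
            "--bed", bed,
            "--fasta", fasta,
            "--out", "{}_{}".format(out_prefix, chrom),
            "--cores", str(cores),
            "--chroms_keep", chrom,
        ])
    return commands


# Hypothetical input paths, for illustration only.
cmds = occ_commands(
    "fragments.tsv.gz", "peaks.narrowPeak", "genome.fasta",
    "out/sample", ["chr1", "chr2"],
)
for cmd in cmds:
    print(" ".join(cmd))
```

Each argument list could then be passed to subprocess.check_call or written to a job script for a cluster scheduler.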
This package is no longer being actively maintained; feel free to post issues that others in the community may respond to, but this package will likely not be updated further. Additionally, if anyone wants to maintain a fork of the package or has developed an alternative package for similar purposes, we would be happy to link to that repo here.
Python package for calling nucleosomes using ATAC-seq data. Also includes general scripts for working with paired-end ATAC-seq data (or potentially other paired-end data).
Please cite our paper at Genome Research if you use this tool in your research.
Please use GitHub Issues to bring up any errors that occur with software rather than emailing authors.
Note on Versions:
- version 0 represents code used for the bioRxiv manuscript
- version 0.2.1 was used for Genome Research manuscript (See Supplemental Information as well)
Documentation can be found at http://nucleoatac.readthedocs.org/en/latest/
If you want to easily read in NucleoATAC outputs into R for further processing or exploration, check out NucleoATACR
Currently NucleoATAC only supports Python 2.7 (No Python 3).