
An R package performing object conversion from ArchRProject (ArchR) to Signac SeuratObject (Signac)

Primary language: R. License: MIT.

ArchRtoSignac: an Object Conversion Package for ArchR to Signac


ArchRtoSignac is an R package to convert an ArchRProject (ArchR) to a Signac SeuratObject (Signac).

ArchR and Signac are both widely used scATAC-seq analysis packages with comparable feature sets. Both are under active development, so they are likely to change over time. You can use either package on its own, but you may want to combine them in one analysis. For example, we use ArchR to generate a fixed-width peak matrix because of its computational advantage, and we use Signac for reference mapping to assist in cell-type annotation. ArchRtoSignac handles the data formatting needed to go from an ArchRProject to a Signac SeuratObject, acting as a wrapper that makes it easier to use both pipelines together. In addition, conversion to a SeuratObject allows the use of other packages available through SeuratWrappers.


How to cite

Shi, Zechuan; Das, Sudeshna; Morabito, Samuel; Miyoshi, Emily; Swarup, Vivek. (2022). Protocol for single-nucleus ATAC sequencing and bioinformatic analysis in frozen human brain tissue, STAR Protocols, Volume 3, Issue 3, DOI: https://doi.org/10.1016/j.xpro.2022.101491.


Installation

We recommend creating an R conda environment specifically for scATAC-seq analysis and installing the required packages there. This ensures that the software versions required here do not conflict with those required for other projects. Several dependencies of ArchRtoSignac will be installed automatically.

# create new conda environment for R
conda create -n scATAC -c conda-forge r-base r-essentials

# activate conda environment
conda activate scATAC

Next, open up R and install ArchRtoSignac using devtools.

# install devtools if not already installed
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")

# install ArchRtoSignac
devtools::install_github("swaruplabUCI/ArchRtoSignac")
# load ArchRtoSignac
library(ArchRtoSignac)

When installing ArchRtoSignac, the following required dependencies should be automatically installed.

  • ArchR, a general-purpose toolkit for single-cell ATAC sequencing analysis.
  • Seurat, a general-purpose toolkit for single-cell RNA sequencing analysis.
  • Signac, a general-purpose toolkit for single-cell ATAC sequencing analysis.
  • devtools, a package for package development in R.
  • biovizBase, basic graphic utilities for visualization of genomic data in R.
  • stringr, a package for data cleaning and preparation in R.

However, if there are issues with installation, please try the following:

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")

# install Bioconductor core packages
BiocManager::install()

# install additional packages, including ArchR, Signac, and Seurat:
if (!requireNamespace("biovizBase", quietly = TRUE)) BiocManager::install("biovizBase")
if (!requireNamespace("ArchR", quietly = TRUE)) devtools::install_github("GreenleafLab/ArchR", ref="master", repos = BiocManager::repositories())
if (!requireNamespace("Signac", quietly = TRUE)) install.packages("Signac")
if (!requireNamespace("Seurat", quietly = TRUE)) install.packages("Seurat")
if (!requireNamespace("stringr", quietly = TRUE)) install.packages("stringr")
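
To see which of the dependencies listed above are still missing after these steps, a quick check along the following lines may help. This is only a sketch in base R; the `required` vector simply repeats the package names from this README, and `missing_pkgs` is a name chosen for illustration.

```r
# Packages named in the dependency list of this README
required <- c("ArchR", "Seurat", "Signac", "biovizBase", "stringr")

# TRUE for each package whose namespace can be loaded from the current library paths
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

# Report anything that still needs to be installed
missing_pkgs <- required[!installed]
if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```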

Usage

  • STEP 0 - Check all required dependencies have been installed and load them automatically.
packages <- c("ArchR","Seurat", "Signac","stringr") # required packages
loadinglibrary(packages)
  • STEP 1 - Obtain ArchRProject peak matrix for object conversion.
pkm <- getPeakMatrix(proj) # proj is an ArchRProject
  • STEP 2 - Extract appropriate Ensembl gene annotation and convert to UCSC style.
library(EnsDb.Hsapiens.v86) # Ensembl annotation database for human hg38; install the one appropriate for your analysis

annotations <- getAnnotation(reference = EnsDb.Hsapiens.v86, refversion = "hg38") # seqStyle defaults to "UCSC"; set the seqStyle argument to use another style
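
Converting to UCSC style is largely a matter of renaming sequences: Ensembl labels chromosomes 1, X and MT, whereas UCSC labels them chr1, chrX and chrM. The toy function below illustrates that mapping for primary chromosomes only. `ensembl_to_ucsc` is a hypothetical helper written for this illustration and is not part of ArchRtoSignac; getAnnotation() performs the actual conversion for you, and scaffold or patch sequences are not handled here.

```r
# Hypothetical helper (illustration only): map Ensembl-style chromosome
# names to UCSC-style names for primary chromosomes.
ensembl_to_ucsc <- function(seqnames) {
  ucsc <- paste0("chr", seqnames)   # "1" -> "chr1", "X" -> "chrX"
  ucsc[seqnames == "MT"] <- "chrM"  # the mitochondrial genome is "chrM" in UCSC style
  ucsc
}

ensembl_to_ucsc(c("1", "X", "MT"))
```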
  • STEP 3 - Convert ArchRProject to Signac SeuratObject.

Option 1: Fragments files from 10x Genomics Cell Ranger ATAC output

Set fragments_fromcellranger to Yes, for example fragments_fromcellranger = "Yes".

fragments_dir <- "path_to_cellranger_atac_output" # the directory before "/outs/" for all samples

seurat_atac <- ArchR2Signac(
  ArchRProject = proj,
  refversion = "hg38",
  #samples = samplelist, # list of samples in the ArchRProject (default will use ArchRProject@cellColData$Sample but another list can be provided)
  fragments_dir = fragments_dir,
  pm = pkm, # peak matrix from getPeakMatrix()
  fragments_fromcellranger = "Yes", # Yes/No selection; accepts "NO" | "N" | "No" or "YES" | "Y" | "Yes"
  fragments_file_extension = NULL, # default NULL; file extension of the fragments files (typically '.tsv.gz' or '.fragments.tsv.gz')
  annotation = annotations # annotation from getAnnotation()
)
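
Cell Ranger ATAC normally writes each sample's fragments to `<sample>/outs/fragments.tsv.gz`, so it can be worth confirming the files are where you expect before running the conversion. The following is a minimal sketch that assumes this layout under the fragments directory. The sample names and the variable names `cr_dir`, `cr_samples`, `expected_paths` and `path_check` are hypothetical placeholders, and the check itself is optional rather than something ArchR2Signac requires.

```r
# Directory that contains one Cell Ranger ATAC output folder per sample
cr_dir <- "path_to_cellranger_atac_output"

# Hypothetical sample names; in practice these could come from the ArchRProject
cr_samples <- c("sampleA", "sampleB")

# Assumed layout: <cr_dir>/<sample>/outs/fragments.tsv.gz
expected_paths <- file.path(cr_dir, cr_samples, "outs", "fragments.tsv.gz")

# One row per sample showing whether its fragments file was found
path_check <- data.frame(
  sample = cr_samples,
  path = expected_paths,
  found = file.exists(expected_paths)
)
print(path_check)
```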

Option 2: Fragments files from output not produced by Cell Ranger ATAC, e.g. SnapATAC tools

Set fragments_fromcellranger to No, for example fragments_fromcellranger = "NO". Also remember to provide fragments_file_extension, for example fragments_file_extension = '.tsv.gz' or fragments_file_extension = '.fragments.tsv.gz'.

fragments_dir <- "/ArchR/HemeFragments/" # please see the fragments format provided by ArchR examples

The path above is the directory containing the fragments files.

For example, the fragments files in the HemeFragments folder can be listed from a terminal:

tree /ArchR/HemeFragments/

/ArchR/HemeFragments/
├── scATAC_BMMC_R1.fragments.tsv.gz
├── scATAC_BMMC_R1.fragments.tsv.gz.tbi
├── scATAC_CD34_BMMC_R1.fragments.tsv.gz
├── scATAC_CD34_BMMC_R1.fragments.tsv.gz.tbi
├── scATAC_PBMC_R1.fragments.tsv.gz
└── scATAC_PBMC_R1.fragments.tsv.gz.tbi

Now back in R

## NOTE: run these steps before converting the ArchRProject to a Signac SeuratObject.

#BiocManager::install("EnsDb.Hsapiens.v75")
#library(EnsDb.Hsapiens.v75)
#annotations <- getAnnotation(seqStyle = 'UCSC', refversion = 'hg19', reference = EnsDb.Hsapiens.v75)
#pm <- getPeakMatrix(ArchRProject= proj)

# Conversion function
seurat_atac <- ArchR2Signac(
  ArchRProject = proj,
  # samples = samples, # provide a list of unique samples
  fragments_dir = fragments_dir, # the folder that contains the fragments files for all samples, ending in '.fragments.tsv.gz' or '.tsv.gz'
  pm = pm, # peak matrix from getPeakMatrix()
  fragments_fromcellranger = "NO",
  fragments_file_extension = '.fragments.tsv.gz',
  refversion = 'hg19', # genome version matching the EnsDb annotation
  annotation = annotations
)
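
In the folder listing above, each fragments file is named as a sample name followed by the file extension. The sketch below shows how the extension separates fragments files from their `.tbi` indexes and how sample names can be recovered from the file names. It recreates the example folder with empty placeholder files in a temporary directory. `demo_dir` and the other variable names are chosen for illustration, and the assumption that file names are sample name plus extension is inferred from the example, not taken from the ArchR2Signac source.

```r
# Recreate the example folder layout with empty placeholder files (demo only)
demo_dir <- file.path(tempdir(), "HemeFragments")
dir.create(demo_dir, showWarnings = FALSE)
demo_files <- c(
  "scATAC_BMMC_R1.fragments.tsv.gz", "scATAC_BMMC_R1.fragments.tsv.gz.tbi",
  "scATAC_CD34_BMMC_R1.fragments.tsv.gz", "scATAC_CD34_BMMC_R1.fragments.tsv.gz.tbi",
  "scATAC_PBMC_R1.fragments.tsv.gz", "scATAC_PBMC_R1.fragments.tsv.gz.tbi"
)
invisible(file.create(file.path(demo_dir, demo_files)))

demo_extension <- ".fragments.tsv.gz"

# Keep only files that end with the extension (this drops the .tbi indexes)
all_files <- list.files(demo_dir)
fragment_files <- all_files[endsWith(all_files, demo_extension)]

# Strip the extension to recover one sample name per fragments file
sample_names <- substr(fragment_files, 1, nchar(fragment_files) - nchar(demo_extension))
print(sample_names)
```

If the names recovered this way do not match the samples in your ArchRProject, the `samples` argument or the file names probably need adjusting before conversion.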
  • STEP 4 - Transfer ArchRProject gene score matrix to Signac SeuratObject.
gsm <- getGeneScoreMatrix(ArchRProject = proj, SeuratObject = seurat_atac)

seurat_atac[['RNA']] <- CreateAssayObject(counts = gsm)
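
A new assay is only meaningful if its columns refer to the same cells as the object it is added to. getGeneScoreMatrix() takes the SeuratObject as an argument, so the matrix it returns is presumably already matched; the optional check below is a sketch on toy data showing how such an alignment can be verified and enforced in base R. `cells`, `gsm_toy` and `gsm_aligned` are hypothetical stand-ins, not objects produced by ArchRtoSignac.

```r
# Toy stand-ins: cell names of the converted object and a small gene score matrix
cells <- c("cellA", "cellB", "cellC") # hypothetical, e.g. colnames(seurat_atac)
gsm_toy <- matrix(
  1:6,
  nrow = 2,
  dimnames = list(c("gene1", "gene2"), c("cellC", "cellA", "cellB"))
)

# Fail early if any cell of the object is absent from the matrix
stopifnot(all(cells %in% colnames(gsm_toy)))

# Reorder the columns so they follow the cell order of the object
gsm_aligned <- gsm_toy[, cells, drop = FALSE]
print(colnames(gsm_aligned))
```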
  • STEP 5 - Transfer ArchRProject dimension reduction ("IterativeLSI" or "Harmony") and UMAP to Signac SeuratObject.
seurat_atac <- addDimRed(
  ArchRProject = proj,
  SeuratObject = seurat_atac,
  reducedDims = "IterativeLSI" # default is "IterativeLSI"
)
# to add both "Harmony" and "IterativeLSI":
# reducedDims = c("IterativeLSI", "Harmony")