ADTnorm is a package for CITE-seq batch-integration using landmark registration.
ADTnormPy is a Python wrapper for the ADTnorm R package by Ye Zheng! The preprint for ADTnorm can be found here: Zheng et al. Robust Normalization and Integration of Single-cell Protein Expression across CITE-seq Datasets, bioRxiv, 2022
This package wraps an R package using rpy2, so installing it requires R version 4.0.0 or greater in your environment. To install R into an isolated conda environment and install ADTnorm in that R, run the following commands in a terminal:
conda create -n ADTnormPy -c conda-forge r-base r-essentials rpy2
conda activate ADTnormPy
conda install ipykernel
python -m ipykernel install --user --name=ADTnormPy
R
install.packages("remotes")
remotes::install_github("yezhengSTAT/ADTnorm", build_vignettes = FALSE)
quit("no")
Note: if any dependencies fail to install via this command, you will see a warning that installation of packages had a "non-zero exit status". If this occurs, try installing/troubleshooting each dependency individually.
Alternatively, R can be downloaded and installed from CRAN, and ADTnorm can be installed following the instructions here. Running ADTnormPy also triggers a check that may be able to install the required R packages for you. Because ADTnorm has many dependencies, installation may be slow.
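Before moving on, it can help to confirm that the R visible on your PATH meets the 4.0.0 requirement. The sketch below is not part of ADTnormPy: parse_r_version and r_is_recent_enough are hypothetical helper names, and it assumes that `R --version` prints a line of the form "R version X.Y.Z ...", which release builds of R do but development builds may not.

```python
import re
import shutil
import subprocess

MIN_R_VERSION = (4, 0, 0)  # minimum R version stated in this README


def parse_r_version(version_output):
    """Extract (major, minor, patch) from the text printed by `R --version`."""
    match = re.search(r"R version (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        return None  # unexpected format, e.g. a development build
    return tuple(int(part) for part in match.groups())


def r_is_recent_enough(minimum=MIN_R_VERSION):
    """Return True if an R executable is on PATH and is at least `minimum`."""
    r_path = shutil.which("R")
    if r_path is None:
        return False  # R is not visible from this environment
    result = subprocess.run([r_path, "--version"], capture_output=True, text=True)
    version = parse_r_version(result.stdout + result.stderr)
    return version is not None and version >= minimum


if __name__ == "__main__":
    print("R >= 4.0.0 available:", r_is_recent_enough())
```

If this prints False inside the activated conda environment, the R installation step above most likely did not complete.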
Once R and ADTnorm are installed, you can install ADTnormPy.
git clone https://github.com/donnafarberlab/ADTnormPy
cd ADTnormPy
conda activate ADTnormPy
python -m pip install .
When loading the packages, you will need to set the location of your R installation. This notebook shows how to do so, along with other usage examples.
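One common way to point rpy2 at a specific R installation is to set the R_HOME environment variable before rpy2 is first imported. The sketch below assumes the conda environment created above on Linux or macOS, where R typically lives under <env>/lib/R. point_to_conda_r is a hypothetical helper, not an ADTnormPy function, and the linked notebook remains the reference for the recommended setup.

```python
import os
import sys
from pathlib import Path


def point_to_conda_r(env_prefix=sys.prefix):
    """Set R_HOME to the R installation inside a conda environment.

    Assumes the Linux/macOS conda layout <env_prefix>/lib/R; adjust the
    path if your R lives elsewhere (for example, a CRAN install).
    """
    r_home = Path(env_prefix) / "lib" / "R"
    os.environ["R_HOME"] = str(r_home)
    return r_home


# Run this before rpy2 is first imported so that rpy2 can locate R.
r_home = point_to_conda_r()
print("R_HOME set to", r_home)
```

By default the helper uses sys.prefix, which is the environment root when Python is launched from the activated ADTnormPy conda environment.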
Running ADTnorm can be as simple as:
adtnorm_df = adtnormpy.adtnorm(prot_df,
                               obs=obs,
                               sample_column='sample')
You can also pass any of the keyword arguments for ADTnorm listed here. These include marker_to_process, which limits normalization to one or a few markers, and several parameters that alter how normalization is performed, including exclude_zeroes, peak_type, bimodal_marker, and trimodal_marker.
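As a rough illustration of combining these options, the keyword arguments can be collected in a dictionary and unpacked into the call. The parameter names come from the text above, but the marker names and values are made up for illustration, and the assumption that Python lists and booleans are converted to the corresponding R types should be checked against the ADTnorm documentation. build_adtnorm_kwargs and run_adtnorm are hypothetical convenience functions, not part of ADTnormPy.

```python
def build_adtnorm_kwargs():
    """Collect optional ADTnorm arguments; all values here are illustrative."""
    return {
        "marker_to_process": ["CD3", "CD4", "CD8"],  # restrict normalization to these markers
        "exclude_zeroes": True,      # exclude zero values (see ADTnorm docs for exact semantics)
        "peak_type": "mode",         # ADTnorm documents "mode" and "midpoint"
        "bimodal_marker": ["CD8"],   # illustrative: markers expected to show two peaks
        "trimodal_marker": ["CD4"],  # illustrative: markers expected to show three peaks
    }


def run_adtnorm(prot_df, obs, sample_column="sample", **kwargs):
    """Call adtnormpy.adtnorm, forwarding extra keyword arguments to ADTnorm."""
    import adtnormpy  # imported lazily so this sketch loads without R installed

    return adtnormpy.adtnorm(prot_df, obs=obs, sample_column=sample_column, **kwargs)


adtnorm_kwargs = build_adtnorm_kwargs()
# With prot_df and obs already defined as in the call above:
# adtnorm_df = run_adtnorm(prot_df, obs, **adtnorm_kwargs)
```

Keeping the options in one dictionary makes it easy to record exactly which settings produced a given normalized table.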
You can also enable save_fig, which saves figures of the ADT expression distribution before and after normalization for each batch, and save_landmark, which saves the detected landmarks (peaks and valleys) in .rds format. If you use save_landmark, ADTnormPy includes built-in helper functions for loading, converting, and saving these landmarks between R and Python. The landmarks can then be passed to override_landmark for finer control over landmark detection.
Setting customize_landmark=True launches the Shiny app for careful manual customization of landmarks.