Ensemble tumor neoantigen prediction and multi-parameter quality analysis from direct input, SNVs, indels, or gene fusion variants.
An R package for neoantigen analysis that takes human or murine DNA missense mutations, insertions, deletions, or RNASeq-derived gene fusions and performs ensemble neoantigen prediction using 7 algorithms. Input is a VCF file, JAFFA output, or table of peptides or transcripts. Outputs are ranked and summarized by sample. Neoantigens are ranked by MHC I/II binding affinity, clonality, RNA expression, similarity to known immunogenic antigens, and dissimilarity to the normal peptidome.
- Thoroughness:
  - missense mutations, insertions, deletions, and gene fusions
  - human and mouse
  - ensemble MHC class I/II binding prediction using mhcflurry, mhcnuggets, netMHC, netMHCII, netMHCpan and netMHCIIpan
  - ranked by
    - MHC I/II binding affinity
    - clonality
    - RNA expression
    - similarity to known immunogenic antigens
    - dissimilarity to the normal peptidome
- Speed and simplicity:
  - 1000 variants are ranked in a single step in less than five minutes
  - parallelized using parallel::mclapply and data.table::setDTthreads; see their documentation for information on setting multicore usage
- Integration with R/Bioconductor
  - upstream/VCF processing
  - exploratory data analysis, visualization
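As a minimal sketch of the multicore settings mentioned above: parallel::mclapply takes its default worker count from the mc.cores option, and data.table::setDTthreads caps the threads data.table uses internally. The limit of four cores is an assumption for illustration, and it is also an assumption that antigen.garnish relies on these defaults rather than setting its own values.

```r
library(parallel)

# Assumption: use at most four cores; adjust for your machine.
# detectCores() can return NA, so na.rm = TRUE keeps the fallback of 4.
n_cores <- max(1L, min(4L, parallel::detectCores(), na.rm = TRUE))

# Default worker count used by parallel::mclapply when mc.cores is not given
options(mc.cores = n_cores)

# Thread cap for data.table operations
data.table::setDTthreads(n_cores)

# Confirm the settings
getOption("mc.cores")
data.table::getDTthreads()
```

Setting both values once at the top of a script keeps the whole session within the same core budget.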
Three methods exist to run antigen.garnish:
- Docker
- Linux
- Amazon Web Services
docker pull leeprichman/antigen_garnish
See the wiki for instructions to run the Docker container.
- R ≥ 3.4
- python-pip
- tcsh (required for netMHC)
- sudo privileges (required for netMHC)
The following line downloads and runs the initial installation script.
$ curl -fsSL http://get.rech.io/install_antigen.garnish.sh | sudo sh
Next, download the netMHC suite of tools for Linux, available under an academic license:
After downloading the files above, move the binaries into the antigen.garnish data directory, first setting the NET_MHC_DIR and ANTIGEN_GARNISH_DIR environment variables, as shown here:
NET_MHC_DIR=/path/to/folder/containing/netMHC/downloads
ANTIGEN_GARNISH_DIR=/path/to/antigen.garnish/data/directory
cd "$NET_MHC_DIR" || return 1
mkdir -p "$ANTIGEN_GARNISH_DIR/netMHC" || return 1
find . -name "netMHC*.tar.gz" -exec tar xvzf {} -C "$ANTIGEN_GARNISH_DIR/netMHC" \;
chown "$USER" "$ANTIGEN_GARNISH_DIR/netMHC"
chmod 700 -R "$ANTIGEN_GARNISH_DIR/netMHC"
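After the extraction step, it can help to confirm from R that the tools landed where expected. The sketch below is a hypothetical helper, not part of antigen.garnish; it only assumes the netMHC folder layout created by the commands above.

```r
# Hypothetical helper (not part of antigen.garnish): list the tool folders
# extracted into the netMHC directory of the data directory.
check_netmhc_dir <- function(ag_dir) {
  net_dir <- file.path(ag_dir, "netMHC")
  if (!dir.exists(net_dir)) {
    stop("netMHC directory not found: ", net_dir)
  }
  tools <- list.dirs(net_dir, full.names = FALSE, recursive = FALSE)
  if (length(tools) == 0L) {
    warning("no extracted netMHC tools found in ", net_dir)
  }
  tools
}

# Usage: pass the same path used for ANTIGEN_GARNISH_DIR in the shell commands
# check_netmhc_dir("/path/to/antigen.garnish/data/directory")
```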
See the wiki for instructions to create an Amazon Web Services instance.
Package documentation is available as a website and as a PDF.
- Prepare input for MHC affinity prediction and quality analysis:
  - VCF input: garnish_variants
  - Fusions from RNASeq via JAFFA: garnish_jaffa
  - Table of direct transcript or peptide input: see the manual page in R (?garnish_affinity)
- Add MHC alleles of interest (see examples below).
- Run the ensemble prediction method and perform antigen quality analysis, including proteome-wide differential agretopicity, IEDB alignment score, and dissimilarity: garnish_affinity.
- Summarize output at the sample level with garnish_summary and garnish_plot, and prioritize the highest quality neoantigens per clone and sample with garnish_antigens.
library(magrittr)
library(data.table)
library(antigen.garnish)
# load an example VCF
dir <- system.file(package = "antigen.garnish") %>%
  file.path(., "extdata/testdata")

dt <- "antigen.garnish_example.vcf" %>%
  file.path(dir, .) %>%
  # extract variants
  garnish_variants %>%
  # add space separated MHC types
  # see list_mhc() for nomenclature of supported alleles
  # MHC may also be set to "all_human" or "all_mouse" to use all supported alleles
  .[, MHC := c("HLA-A*01:47 HLA-A*02:01 HLA-DRB1*14:67")] %>%
  # predict neoantigens
  garnish_affinity

# summarize predictions
dt %>%
  garnish_summary %T>%
  print

# generate summary graphs
dt %>% garnish_plot
library(magrittr)
library(data.table)
library(antigen.garnish)
# load example JAFFA output
dir <- system.file(package = "antigen.garnish") %>%
  file.path(., "extdata/testdata")

path <- "antigen.garnish_jaffa_results.csv" %>%
  file.path(dir, .)
fasta_path <- "antigen.garnish_jaffa_results.fasta" %>%
  file.path(dir, .)

# process gene fusions from the JAFFA output
dt <- garnish_jaffa(path, db = "GRCm38", fasta_path) %>%
  # add MHC info with list_mhc() compatible names
  .[, MHC := "H-2-Kb"] %>%
  # get predictions
  garnish_affinity %>%
  # summarize predictions
  garnish_summary %T>%
  print
library(magrittr)
library(data.table)
library(antigen.garnish)
# load example Microsoft Excel file
dir <- system.file(package = "antigen.garnish") %>%
  file.path(., "extdata/testdata")

path <- "antigen.garnish_test_input.xlsx" %>%
  file.path(dir, .)

# predict neoantigens
dt <- garnish_affinity(path = path) %T>%
  str
library(magrittr)
library(data.table)
library(antigen.garnish)
# generate our character vector of sequences
v <- c("SIINFEKL", "ILAKFLHWL", "GILGFVFTL")
# calculate IEDB score
v %>% iedb_score(db = "human") %>% print
# calculate dissimilarity
v %>% garnish_dissimilarity(db = "human") %>% print
From ./<Github repo>:
devtools::test(reporter = "summary")
library(magrittr)
library(data.table)
library(antigen.garnish)
# generate a fake peptide
dt <- data.table::data.table(
  pep_base = "Y___*___THIS_IS_________*___A_CODE_TEST!______*__X",
  mutant_index = c(5, 25, 47, 50),
  pep_type = "test",
  var_uuid = c(
    "front_truncate",
    "middle",
    "back_truncate",
    "end")) %>%
  # create nmers
  make_nmers %T>% print
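To illustrate what the test peptide above exercises, here is a minimal base-R sketch of sliding-window n-mer generation around a mutant position. The helper nmers_covering is hypothetical and is not the package's make_nmers implementation; it only shows why a mutation near the start or end of a peptide yields fewer windows than one in the middle, which is presumably what the front_truncate and back_truncate cases probe.

```r
# Hypothetical helper: all windows of length n in `pep` that contain
# the 1-based position `idx`. Not the antigen.garnish implementation.
nmers_covering <- function(pep, idx, n) {
  len <- nchar(pep)
  first <- max(1L, idx - n + 1L)   # earliest start that still covers idx
  last <- min(idx, len - n + 1L)   # latest start that fits inside pep
  if (len < n || first > last) return(character(0))
  starts <- seq(first, last)
  substring(pep, starts, starts + n - 1L)
}

pep <- "ABCDEFGHIJKLMNOPQRST"
nmers_covering(pep, idx = 10, n = 8)  # middle position: 8 windows
nmers_covering(pep, idx = 2, n = 8)   # near the start: 2 windows
nmers_covering(pep, idx = 20, n = 8)  # last position: 1 window
```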
garnish_plot output: (figure not included)
garnish_antigens output: (figure not included)
Richman LP, Vonderheide RH, and Rech AJ. "Neoantigen dissimilarity to the self-proteome predicts immunogenicity and response to immune checkpoint blockade." Cell Systems 9, 375-382.E4, (2019).
We welcome contributions and feedback via Github or email.
We thank the following individuals for contributions and helpful discussion:
Please see LICENSE.