Full reproducibility scripts and data for the paper "Organism Specific Data Sets Improve Linear B-Cell Epitope Prediction Performance".
The scripts in this folder use the epitopes package, version 0.5.1, which can be installed from within R using:
devtools::install_github("fcampelo/epitopes", ref = "v0.5.1-OrgSpec-paper")
Our results were generated with the following setup (taking advantage of some parallel processing capabilities of the epitopes package):
R version 4.0.5 (2021-03-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16
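Before running the scripts, it may help to compare the local environment with the setup listed above. The following is a minimal sketch using base R only; the helper name `version_matches` is hypothetical and is not part of the epitopes package.

```r
# Hypothetical helper (not part of epitopes): returns TRUE/FALSE if the
# installed version of `pkg` equals `expected`, or NA if `pkg` is missing.
version_matches <- function(pkg, expected) {
  if (!requireNamespace(pkg, quietly = TRUE)) return(NA)
  utils::packageVersion(pkg) == expected
}

# Print the local R version and platform for comparison with the setup above.
cat(R.version.string, "\n")
cat(R.version$platform, "\n")

# The scripts in this folder were written against epitopes 0.5.1.
status <- version_matches("epitopes", "0.5.1")
if (is.na(status)) {
  message("Package 'epitopes' is not installed; see the install command above.")
} else if (!status) {
  message("Installed epitopes version differs from 0.5.1: ",
          as.character(utils::packageVersion("epitopes")))
} else {
  message("epitopes 0.5.1 is installed.")
}
```

A different R version or platform does not necessarily prevent the scripts from running, but it may matter when comparing results.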
- Download or clone this repository locally.
- For each pathogen:
  - Set the target organism folder (under directory Experiments) as the working directory.
  - Execute routine 01_generate_datasets
  - Generate predictions using the benchmark predictors (ABCPred, Bepipred2, etc.) for the hold-out proteins (under subfolder data/splits of your working directory) and save them to the appropriate folders under subfolder output.
- Run routine Check_leakage (under Experiments).
- For each pathogen:
  - Execute routine 02_run_experiment
  - Execute routine
- Run routine Consolidate_results (under Experiments).
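The per-pathogen steps above could be scripted rather than run by hand. The sketch below is only an illustration under stated assumptions: that each routine is an R script file, that its file name is the routine name plus `.R`, that it sits inside the organism folder, and that every immediate subfolder of Experiments is an organism folder. None of this is confirmed by the instructions above, and the helper `run_routine` is hypothetical.

```r
# Hypothetical helper: source one routine script with `workdir` as the working
# directory, then restore the previous working directory (even on error).
run_routine <- function(script, workdir) {
  script <- normalizePath(script, mustWork = TRUE)  # resolve before setwd()
  old_wd <- setwd(workdir)
  on.exit(setwd(old_wd), add = TRUE)
  source(script, local = new.env())  # keep script variables out of the caller
  invisible(TRUE)
}

experiments_dir <- "Experiments"  # repository folder named in the steps above

if (dir.exists(experiments_dir)) {
  # ASSUMPTION: every immediate subfolder is a target organism folder.
  organism_dirs <- list.dirs(experiments_dir, recursive = FALSE)

  for (org_dir in organism_dirs) {
    # ASSUMPTION: hypothetical file name and location for the routine.
    script <- file.path(org_dir, "01_generate_datasets.R")
    if (file.exists(script)) run_routine(script, org_dir)
  }
  # The benchmark predictions (ABCPred, Bepipred2, etc.) still have to be
  # generated and saved separately before moving on to Check_leakage.
}
```

The same helper could be reused for the later routines once their actual file names and locations are checked against the repository.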