wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && bash Miniconda3-latest-Linux-x86_64.sh
conda create --name alphafold python==3.8
conda update -n base conda
conda activate alphafold
conda install -y -c anaconda cudnn==8.2.1
conda install -y -c bioconda hmmer hhsuite==3.3.0 kalign2
conda install -y -c conda-forge openmm==7.5.1 cudatoolkit==11.0.3 pdbfixer
Download alphafold git repo
git clone https://github.com/deepmind/alphafold.git
alphafold_path = "/path/to/alphafold/git/repo"
wget -q -P alphafold/alphafold/common/ https://git.scicore.unibas.ch/schwede/openstructure/-/raw/7102c63615b64735c4941278d92b554ec94415f8/modules/mol/alg/src/stereo_chemical_props.txt
Note: jax updgrade: cuda111 supports cuda 11.3 - jax-ml/jax#6628
pip install absl-py==0.13.0 biopython==1.79 chex==0.0.7 dm-haiku==0.0.4 dm-tree==0.1.6 immutabledict==2.0.0 jax==0.2.14 ml-collections==0.1.0 numpy==1.19.5 scipy==1.7.0 tensorflow==2.5.0
pip install --upgrade jax jaxlib==0.1.69+cuda111 -f https://storage.googleapis.com/jax-releases/jax_releases.html
cd ~/anaconda3/envs/alphafold/lib/python3.8/site-packages/ && patch -p0 < $alphafold_path/docker/openmm.patch
- Use this bash script
Usage: run_alphafold.sh <OPTIONS>
Required Parameters:
-d <data_dir> Path to directory of supporting data
-o <output_dir> Path to a directory that will store the results.
-m <model_names> Names of models to use (a comma separated list)
-f <fasta_path> Path to a FASTA file containing one sequence
-t <max_template_date> Maximum template release date to consider (ISO-8601 format - i.e. YYYY-MM-DD). Important if folding historical test sets
Optional Parameters:
-b <benchmark> Run multiple JAX model evaluations to obtain a timing that excludes the compilation time, which should be more indicative of the time required for inferencing many
proteins (default: 'False')
-g <use_gpu> Enable NVIDIA runtime to run with GPUs(default: 'True')
-a <gpu_devices> Comma separated list of devices to pass to 'NVIDIA_VISIBLE_DEVICES' (default: 'all')
-p <preset> Choose preset model configuration - no ensembling (full_dbs) or 8 model ensemblings (casp14) (default: 'full_dbs')
- This script needs to be put into the top directory of the alphafold git repo that you have downloaded
# Directory structure
alphafold
├── alphafold
├── CONTRIBUTING.md
├── docker
├── example
├── imgs
├── LICENSE
├── README.md
├── requirements.txt
├── run_alphafold.py
├── run_alphafold.sh <--- Copy the bash script and put it here
├── run_alphafold_test.py
├── scripts
└── setup.py
- Put your query sequence (only one sequence per fasta file) in a fasta file <filename.fasta>. Query sequence was obtained from here
- Run the script
# Example run
bash run_alphafold.sh -d ./alphafold_data/ -o ./dummy_test/ -m model_1 -f ./example/query.fasta -t 2020-05-14
- The results folder
dummy_test
can be found in this git repo along with the query (example/query.fasta
) used - For further information refer here.
- The arguments to the script follows the original naming of the alphafold parameters, except for
fasta_paths
. This script can do only one fasta query at a time. So use a tmux session or so to do multiple runs. - Happy folding!
- We do not guarantee that this will work for everyone
- The non-docker version was tested with the following system configuration
- Dell server
- CPU: AMD EPYC 7601 2.2 GHz
- RAM: 1 TB
- GPU: NVIDIA Tesla V100 16G
- OS: CentOS 7 (kernel 3.10.0-1160.24.1.el7.x86_64)
- Cuda: 11.3
- NVIDIA driver version: 470.42.01
- Dell server