RL-MLZerD

Citation:

@article{,
  title={Multimeric protein docking using reinforcement learning.},
  author={Tunde Aderinwale and Charles Christoffer and Daisuke Kihara},
  journal={}
}

What is RL-MLZerD?

RL-MLZerD is a general reinforcement learning (RL) framework for docking multimeric protein complexes.

The ranked and selected pairwise subunit docking poses produced by LZerD form the set of actions available to the agent. The state of the agent is defined by the combination of chains assembled so far. Assembled models are evaluated, and rewards are assigned based on model quality.
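As an illustration of this formulation, the sketch below represents a state as the set of chains assembled so far and an action as one ranked pairwise pose. The names `PairwisePose`, `available_actions`, `apply_action`, and `is_terminal` are hypothetical and are not taken from the RL-MLZerD code; the rule that each action adds exactly one new chain is also an assumption made for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

@dataclass(frozen=True)
class PairwisePose:
    """One ranked pairwise docking pose (hypothetical container)."""
    chain_a: str
    chain_b: str
    rank: int  # position of the pose in the ranked pairwise pool

# Assumption: a state is the set of chain IDs assembled so far.
State = FrozenSet[str]

def available_actions(state: State, pool: List[PairwisePose]) -> List[PairwisePose]:
    """Return poses that attach exactly one new chain to the partial assembly."""
    if not state:
        return list(pool)  # empty assembly: any pairwise pose can seed it
    return [p for p in pool if (p.chain_a in state) != (p.chain_b in state)]

def apply_action(state: State, pose: PairwisePose) -> State:
    """Add both chains of the pose to the assembly."""
    return state | {pose.chain_a, pose.chain_b}

def is_terminal(state: State, all_chains: str) -> bool:
    """The complete assembly is the terminal state."""
    return state == frozenset(all_chains)

# Toy pool for a three-chain target with chains B, G, P.
pool = [PairwisePose("B", "G", 1), PairwisePose("G", "P", 1), PairwisePose("B", "P", 1)]
s1 = apply_action(frozenset(), pool[0])   # chains B and G assembled
next_actions = available_actions(s1, pool)  # poses that bring in chain P
s2 = apply_action(s1, next_actions[0])
```

In this toy setup, two pairwise poses are enough to reach the terminal state of a three-chain complex.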

Framework Flowchart

The RL-MLZerD agent performs four major steps:

  • State Model: Corresponds to the different assembly states of the chains, with the complete assembly being the terminal state.
  • Action Model: Corresponds to the pairwise poses available for assembly (in this work, the number of possible actions is set to 1,000, i.e., 1,000 different pairwise poses).
  • QA Model: The assembled model is evaluated for quality and other protein-like properties.
  • Reward Model: The RL-MLZerD agent is assigned a reward based on the model quality, which serves as a feedback signal as it explores and exploits the available state/action environment.
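The four steps above can be sketched as a single training episode. This is a minimal sketch that assumes tabular Q-learning with epsilon-greedy exploration; the source does not state the exact update rule, so the algorithm choice, the hyperparameters (`alpha`, `gamma`, `epsilon`), the tuple action format, and the `toy_score` stand-in for the QA model are all assumptions for illustration.

```python
import random
from collections import defaultdict

def valid_actions(state, pool):
    """Action model: poses that add exactly one new chain (any pose at the start)."""
    if not state:
        return list(pool)
    return [a for a in pool if (a[0] in state) != (a[1] in state)]

def run_episode(chains, pool, q_table, score_model, rng,
                alpha=0.1, gamma=0.9, epsilon=0.2):
    """Assemble chains one pose at a time and update Q-values (assumed Q-learning)."""
    target = frozenset(chains)
    state, used = frozenset(), []
    while state != target:
        actions = valid_actions(state, pool)
        if not actions:
            break  # dead end: no pose can extend the assembly
        if rng.random() < epsilon:
            action = rng.choice(actions)  # explore
        else:
            action = max(actions, key=lambda a: q_table[(state, a)])  # exploit
        next_state = state | {action[0], action[1]}  # state model
        used.append(action)
        terminal = next_state == target
        # QA + reward model: only the complete assembly is scored here.
        reward = score_model(used) if terminal else 0.0
        best_next = 0.0 if terminal else max(
            (q_table[(next_state, a)] for a in valid_actions(next_state, pool)),
            default=0.0)
        q_table[(state, action)] += alpha * (
            reward + gamma * best_next - q_table[(state, action)])
        state = next_state
    return used

# Toy run: actions are (chain_a, chain_b, rank) tuples, two ranks per chain pair.
pool = [(a, b, r) for a, b in (("B", "G"), ("G", "P"), ("B", "P")) for r in (1, 2)]

def toy_score(used):
    """Hypothetical QA stand-in: reward assemblies built only from rank-1 poses."""
    return 1.0 if all(rank == 1 for _, _, rank in used) else 0.0

rng = random.Random(0)
q_table = defaultdict(float)
history = [run_episode("BGP", pool, q_table, toy_score, rng) for _ in range(200)]
```

In the actual tool the reward would come from the quality assessment of the assembled 3D model rather than from pose ranks; the toy score only keeps the example self-contained.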

Dependencies

  • scipy: pip/conda install scipy==0.12
  • BioPython: pip/conda install biopython==3.7.0
  • Numpy: pip/conda install numpy

Usage

python3 run_qagent.py:
  --protein name of the target for docking, e.g. 1A0R
  --nofchains number of chains, e.g. 3
  --chains name of each chain, e.g. BGP
  --path path to the directory that contains the data directory, e.g. ./
  --episodes number of episodes to simulate, e.g. 50000
  --pool_size size of the pairwise pool, corresponding to the number of available actions, e.g. 1000
  --out_dir output directory for results, e.g. 1A0R_docking
  --classifier which classifier to use, e.g. sgd
  --clash_threshold threshold for allowed clashes in a model. Defaults to 300. It is advisable to increase it in proportion to the number of chains.

Example (runs the docking for the example complex 1A0R provided in the data directory):

   python3 run_qagent.py --protein 1A0R --nofchains 3 --chains BGP --path ./ --episodes 1000 --out_dir testing_afm_dock --not_int_pair "" --int_pair ""
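When running several targets, it can help to build the command line programmatically. The helper below is a hypothetical convenience wrapper, not part of RL-MLZerD; it uses only flags documented above and assumes single-character chain IDs, so that the value of `--nofchains` can be derived from the length of the `--chains` string.

```python
import shlex

def build_qagent_command(protein, chains, path="./", episodes=1000,
                         out_dir=None, pool_size=None, clash_threshold=None):
    """Build the argument list for run_qagent.py (hypothetical helper)."""
    if not chains:
        raise ValueError("at least one chain ID is required")
    cmd = [
        "python3", "run_qagent.py",
        "--protein", protein,
        # Assumption: one character per chain ID, so len(chains) is the chain count.
        "--nofchains", str(len(chains)),
        "--chains", chains,
        "--path", path,
        "--episodes", str(episodes),
    ]
    # Optional flags are added only when a value is given.
    if pool_size is not None:
        cmd += ["--pool_size", str(pool_size)]
    if out_dir is not None:
        cmd += ["--out_dir", out_dir]
    if clash_threshold is not None:
        cmd += ["--clash_threshold", str(clash_threshold)]
    return cmd

cmd = build_qagent_command("1A0R", "BGP", out_dir="1A0R_docking", pool_size=1000)
command_line = " ".join(shlex.quote(part) for part in cmd)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the docking.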

After the simulation, generate the PDB files for all assembled and accepted structures by running:

  python3 generate_models output_path chains
  - output_path: path to where the output file generated by the RL-MLZerD agent is saved.
  - chains: chain list for the PDB, e.g. BGP

License

MIT License Copyright (c) 2021 Tunde Aderinwale, Charles Christoffer, Daisuke Kihara, and Purdue University