Awesome AI-based Protein Design

This is a collection of research papers on AI-based protein design. The repository will be continuously updated to track the frontier of the field.

Welcome to follow and star!

Overview of Protein Design

AI tools such as AlphaFold 2 have largely solved the protein structure prediction problem: deriving a protein's spatial structure from its amino acid sequence, in many cases with near-atomic accuracy. AI-based protein design builds on these structure prediction models to learn protein design methods automatically, with the goal of serving practical pharmaceutical needs.

Protein design practices vary widely, and the problem definition differs from one design process to another. Here are some examples:

  1. Predicting the amino acid sequence from a spatial structure (the inverse of AlphaFold). This assumes that the spatial structure of the desired protein is already available, for example from molecular dynamics simulations.
  2. Completing a protein structure from a given partial structure, as in the Science paper [1] from David Baker's group. This assumes that only a partial structural match can be found.
  3. Combining a fitted energy function with MD simulation for protein design, as in the Nature paper [2] from Haiyan Liu's team.
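
To make the difference between the first two problem definitions concrete, here is a minimal Python sketch of their inputs and outputs. It is only an illustration, not the formulation used by any of the listed papers: the class names, the one-coordinate-per-residue backbone, and the `sequence_recovery` helper (a commonly reported metric for structure-to-sequence design) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# The 20 standard amino acids as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Simplifying assumption: one 3D coordinate per residue.
Coord = Tuple[float, float, float]


@dataclass
class InverseFoldingTask:
    """Problem 1: the backbone is given, the sequence is the unknown."""
    backbone: List[Coord]

    def is_valid_solution(self, sequence: str) -> bool:
        # A candidate must assign one standard residue to every position.
        return (len(sequence) == len(self.backbone)
                and set(sequence) <= set(AMINO_ACIDS))


@dataclass
class StructureCompletionTask:
    """Problem 2: some positions are known, the rest must be designed."""
    partial_backbone: List[Optional[Coord]]  # None marks a missing position

    def positions_to_design(self) -> List[int]:
        return [i for i, c in enumerate(self.partial_backbone) if c is None]


def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of positions where the designed residue equals the native one."""
    if len(designed) != len(native):
        raise ValueError("sequences must have equal length")
    if not native:
        return 0.0
    return sum(d == n for d, n in zip(designed, native)) / len(native)


if __name__ == "__main__":
    task = InverseFoldingTask(backbone=[(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)])
    print(task.is_valid_solution("MK"))
    print(StructureCompletionTask([(0.0, 0.0, 0.0), None, None]).positions_to_design())
    print(sequence_recovery("MKVL", "MKIL"))
```

The third problem definition (energy function plus MD simulation) is not sketched here, since it depends on a fitted energy model that cannot be reduced to a short example.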

Many other methods can also be used for protein design, and the corresponding AI problem definitions differ just as much. This list collects high-impact papers on AI-based protein design and will be continuously updated.

Papers

format:
- [title](paper link) [links]
  - author1, author2, and author3...
  - publisher
  - keyword
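
For contributors who prefer to generate entries rather than type them, the following is a small sketch of a helper that renders one entry in the format above. The function name `format_entry`, its parameters, and the example values are hypothetical and not part of this repository.

```python
from typing import Sequence, Tuple


def format_entry(
    title: str,
    link: str,
    authors: Sequence[str],
    publisher: str,
    keywords: Sequence[str],
    extra_links: Sequence[Tuple[str, str]] = (),
) -> str:
    """Render one paper entry as a nested markdown list item."""
    # Optional extra links (for example code or project pages) follow the title.
    links = " ".join(f"[{name}]({url})" for name, url in extra_links)
    head = f"- [{title}]({link})" + (f" {links}" if links else "")
    return "\n".join([
        head,
        f"  - {', '.join(authors)}",
        f"  - {publisher}",
        f"  - Keywords: {', '.join(keywords)}",
    ])


if __name__ == "__main__":
    # Placeholder values only; replace with the real paper details.
    print(format_entry(
        title="Example title",
        link="https://example.org/paper",
        authors=["A. Author", "B. Author"],
        publisher="Example Venue",
        keywords=["protein design", "example"],
    ))
```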

Nature

  • Accurate structure prediction of biomolecular interactions with AlphaFold 3

    • Josh Abramson, Jonas Adler, Jack Dunger, Richard Evans, Tim Green, Alexander Pritzel, Olaf Ronneberger, Lindsay Willmore, Andrew J. Ballard, Joshua Bambrick, Sebastian W. Bodenstein, David A. Evans, Chia-Chun Hung, Michael O’Neill, David Reiman, Kathryn Tunyasuvunakool, Zachary Wu, Akvilė Žemgulytė, Eirini Arvaniti, Charles Beattie, Ottavia Bertolli, Alex Bridgland, Alexey Cherepanov, Miles Congreve, Alexander I. Cowen-Rivers, Andrew Cowie, Michael Figurnov, Fabian B. Fuchs, Hannah Gladman, Rishub Jain, Yousuf A. Khan, Caroline M. R. Low, Kuba Perlin, Anna Potapenko, Pascal Savy, Sukhdeep Singh, Adrian Stecula, Ashok Thillaisundaram, Catherine Tong, Sergei Yakneen, Ellen D. Zhong, Michal Zielinski, Augustin Žídek, Victor Bapst, Pushmeet Kohli, Max Jaderberg, Demis Hassabis & John M. Jumper
    • Keywords: diffusion-based architecture, protein structure modelling, biomolecular space modelling
  • A backbone-centred energy function of neural networks for protein design

    • B Huang, Y Xu, X Hu, Y Liu, S Liao, J Zhang, C Huang
    • Keywords: energy function, MD simulation, backbone-centred
  • De novo protein design by deep network hallucination

    • Ivan Anishchenko, Samuel J. Pellock, Tamuka M. Chidyausiku, Theresa A. Ramelot, Sergey Ovchinnikov, Jingzhou Hao, Khushboo Bafna, Christoffer Norn, Alex Kang, Asim K. Bera, Frank DiMaio, Lauren Carter, Cameron M. Chow, Gaetano T. Montelione & David Baker
    • Keywords: hallucination, inpainting, protein design
  • Design of protein-binding proteins from the target structure alone

    • Longxing Cao, Brian Coventry, Inna Goreshnik, Buwei Huang, William Sheffler, Joon Sung Park, Kevin M. Jude, Iva Marković, Rameshwar U. Kadam, Koen H. G. Verschueren, Kenneth Verstraete, Scott Thomas Russell Walsh, Nathaniel Bennett, Ashish Phal, Aerin Yang, Lisa Kozodoy, Michelle DeWitt, Lora Picton, Lauren Miller, Eva-Maria Strauch, Nicholas D. DeBouver, Allison Pires, Asim K. Bera, Samer Halabiya, Bradley Hammerson, Wei Yang, Steffen Bernard, Lance Stewart, Ian A. Wilson, Hannele Ruohola-Baker, Joseph Schlessinger, Sangwon Lee, Savvas N. Savvides, K. Christopher Garcia & David Baker
    • Keywords: binding site

Nature Biomedical Engineering

Nature Communications

Nature Machine Intelligence

Science

  • Robust deep learning based protein sequence design using ProteinMPNN

    • J. Dauparas, I. Anishchenko, N. Bennett, H. Bai, R. J. Ragotte, L. F. Milles, B. I. M. Wicky, A. Courbet, R. J. de Haas, N. Bethel, P. J. Y. Leung, T. F. Huddy, S. Pellock, D. Tischer, F. Chan, B. Koepnick, H. Nguyen, A. Kang, B. Sankaran, A. K. Bera, N. P. King, D. Baker
    • Keywords: sequence design, inverse folding, message passing neural network
  • Scaffolding protein functional sites using deep learning

    • Jue Wang, Sidney Lisanza, David Juergens, Doug Tischer, Joseph L. Watson, Karla M. Castro, Robert Ragotte, Amijai Saragovi, Lukas F. Milles, Minkyung Baek, Ivan Anishchenko, Wei Yang, Derrick R. Hicks, Marc Expòsit, Thomas Schlichthaerle, Jung-Ho Chun, Justas Dauparas, Nathaniel Bennett, Basile I. M. Wicky, Andrew Muenks, Frank DiMaio, Bruno Correia, Sergey Ovchinnikov, David Baker
    • Keywords: functional site, deep learning, hallucination, inpainting

ICML, ICLR or NeurIPS

Arxiv or bioRxiv

Others

References

[1] Wang, Jue, et al. "Scaffolding protein functional sites using deep learning." Science 377.6604 (2022): 387-394.

[2] Huang, Bin, et al. "A backbone-centred energy function of neural networks for protein design." Nature 602.7897 (2022): 523-528.

Contributing

We want to keep improving this repo. If you are interested in contributing, please see HERE for contribution instructions.

License

Awesome AI-based Protein Design is released under the Apache 2.0 license.