lucidrains/alphafold2

Architecture

lucidrains opened this issue · 21 comments

Share what you think you know about the architecture here

Here is a democratized implementation of the AlphaFold protein distance prediction network:
https://github.com/dellacortelab/prospr

Mohammed AlQuraishi also believes it is SE(3)-Transformers (https://moalquraishi.wordpress.com/2020/12/08/alphafold2-casp14-it-feels-like-ones-child-has-left-home/). We'll go by that.

After reviewing where we are with equivariant attention networks, they seem costly and brittle (at least in their current implementations). Why can't we just work off the distogram, as in https://github.com/lucidrains/molecule-attention-transformer, and get the equivariance for free?
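A minimal sketch (plain PyTorch, nothing from this repo) of why the distogram route sidesteps the problem: pairwise distances are unchanged by any rotation or translation, so a network fed only the distogram never sees the global frame.

```python
# minimal sketch, plain PyTorch, not from this repo:
# pairwise distances are invariant to rotations and translations,
# so a network fed only the distogram never sees the global frame
import torch

def distogram(coords, num_bins=37, d_min=2.0, d_max=20.0):
    """coords: (n, 3) atom coordinates -> (n, n) binned pairwise distances."""
    dist = torch.cdist(coords, coords)
    boundaries = torch.linspace(d_min, d_max, num_bins - 1)
    return torch.bucketize(dist, boundaries)

coords = torch.randn(64, 3)
Q, _ = torch.linalg.qr(torch.randn(3, 3))   # random orthogonal matrix
t = torch.randn(3)
moved = coords @ Q.T + t                    # rigid-body motion (up to reflection)

# raw distances agree to floating-point precision, so the binned distogram does too
assert torch.allclose(torch.cdist(coords, coords), torch.cdist(moved, moved), atol=1e-5)
bins = distogram(coords)                    # (64, 64) integer bin indices
```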

There is a possibility that some iterative refinement process is going on, similar to Xu's paper, most likely with graph attention networks: https://www.biorxiv.org/content/10.1101/2020.12.10.419994v1.full.pdf
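Purely speculative, but a minimal sketch of what such an iterative refinement loop could look like; `RefineBlock` is a hypothetical stand-in for whatever update network is used, not anything from the paper or this repo.

```python
# speculative sketch of iterative refinement: the same block repeatedly
# updates a pairwise representation, feeding its own output back in
import torch
from torch import nn

class RefineBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.LayerNorm(dim), nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )

    def forward(self, pairwise):
        return pairwise + self.net(pairwise)   # residual update of the pair features

def refine(pairwise, block, num_iters=3):
    for _ in range(num_iters):
        pairwise = block(pairwise)             # re-use the same weights each cycle
    return pairwise

block = RefineBlock(dim=64)
pair = torch.randn(1, 128, 128, 64)            # (batch, n, n, dim) pair representation
out = refine(pair, block)
```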

Lie Transformers (https://arxiv.org/abs/2012.10885) may be an alternative to SE(3)-Transformers for equivariant attention. It will be built here, just in case it is needed: https://github.com/lucidrains/lie-transformer-pytorch

Hi there! I would like to actively contribute to/guide this project! I have a background in AI/ML as well as medicine and physics (plus I did a mini reproduction of AlphaFold1 two years ago).
Here is a good piece of commentary on the architecture from one of the CASP staff members: https://moalquraishi.wordpress.com/2020/12/08/alphafold2-casp14-it-feels-like-ones-child-has-left-home/#s3
I could provide some help understanding the features if someone has trouble with that.

@lucidrains I want to push this project forward. I've asked about it on Eleuther; if there's any way to move this along, I'll be glad to help!

@hypnopump Hi Eric, would be happy to have you on board!

@hypnopump Yup, I'm working on equivariant self-attention this month :) Will get it done 👍

@lucidrains cool! Anything I can help with, just say it

@hypnopump want to chat in off-topic in the Eleuther Discord?

Will be working on getting reversible networks working with the main trunk (cross-attention between two transformers). If it works, it means we can scale to any depth with no memory cost. Should be done by the end of the week.
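For reference, a minimal sketch of the standard additive-coupling reversible block this relies on (not the repo's actual implementation): since the inputs can be recomputed exactly from the outputs, activations don't have to be cached during the forward pass.

```python
# minimal sketch of a RevNet-style additive coupling block, not the repo's code:
# inputs are recoverable from outputs, so activations need not be stored
import torch
from torch import nn

class ReversibleBlock(nn.Module):
    def __init__(self, f, g):
        super().__init__()
        self.f, self.g = f, g

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        # recover the inputs exactly, so they never have to be cached
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2

dim = 64
block = ReversibleBlock(nn.Linear(dim, dim), nn.Linear(dim, dim))
x1, x2 = torch.randn(2, 8, dim)
with torch.no_grad():
    y1, y2 = block(x1, x2)
    r1, r2 = block.inverse(y1, y2)
assert torch.allclose(x1, r1, atol=1e-5) and torch.allclose(x2, r2, atol=1e-5)
```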

Reversible networks are done, which means we can now scale the main trunk to any depth with only the memory cost of one layer.

Next up, bringing in sparse attention techniques + relative positional encoding
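A minimal sketch of one common way to do relative positional encoding: a learned per-head bias over the clamped relative offset, added to the attention logits. The repo may do it differently.

```python
# sketch of a learned relative positional bias, one common approach --
# not necessarily how this repo implements it
import torch
from torch import nn

class RelPosBias(nn.Module):
    def __init__(self, heads, max_rel_dist=32):
        super().__init__()
        self.max_rel_dist = max_rel_dist
        self.bias = nn.Embedding(2 * max_rel_dist + 1, heads)

    def forward(self, n):
        pos = torch.arange(n)
        rel = (pos[None, :] - pos[:, None]).clamp(-self.max_rel_dist, self.max_rel_dist)
        rel = rel + self.max_rel_dist          # shift offsets into [0, 2 * max_rel_dist]
        return self.bias(rel).permute(2, 0, 1) # (heads, n, n), added to q @ k^T

bias = RelPosBias(heads=8)(128)                # (8, 128, 128)
```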

Ok, sparse attention is done. I'll also bring in sparse attention in a convolutional pattern (https://github.com/lucidrains/DALLE-pytorch/blob/main/dalle_pytorch/attention.py#L75) sometime next week.
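For the convolutional pattern, a minimal sketch of a local-window attention mask in the same spirit as the linked DALLE-pytorch code (not a copy of it): each query only attends to keys within a fixed window of itself.

```python
# sketch of a local-window sparse attention mask, in the spirit of the
# linked convolutional pattern but not copied from it
import torch

def local_attention_mask(n, window=7):
    pos = torch.arange(n)
    return (pos[None, :] - pos[:, None]).abs() <= window   # (n, n) boolean mask

mask = local_attention_mask(128)
# usage: logits.masked_fill_(~mask, float('-inf')) before the softmax
```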

The Lie Transformer implementation has been confirmed to be correct by one of the authors, so we have two equivariant attention solutions at our disposal.

Closing because I think we are close