Basic pretrained models + quickstart instructions?

Question

TylerBalsam opened this issue 3 years ago · 1 comment

Hi there,

I'm a practitioner of several years with deep experience in model architecture selection, training, etc. (though not much of a recent GH footprint). This looks like a fun enough problem for me to work on. Nothing is confirmed yet, but I'm looking at potentially picking this up as a side project for when I have spare 'problem-solving' cycles to burn on things for funsies.

I guess in looking through everything there's a possibility to kick everything off from scratch, but I'm a fan of pretrained models, or at least partially-trained models in some form. Are there any S3 dumps of model checkpoints we can finetune from?

Also, having scanned the README: my old training setup is rather aged (a 1070), so the TPU/V100 space on Colab is probably my go-to for now. Do you have a local setup, conda env, docker container, etc. that you're working in? I'm looking to see if there's anything I need to do to standardize the path on my end.

Thanks for putting all of this together. Been working on a few semi-ish related things (not too close, but semi) for a little while, and been watching this repo. Glad to see it's alive and still moving.

If it's of any value, are there branch privs on this repo, and is there any process I should go through to get those? I'm not looking for anything beyond being able to create/push to individual remote branches; I'd like the PR process and everything that comes with it.

Many thanks, hope you are well,

EF

Answer 1 · 2021-04-13T16:19:25.000Z

Hi there! We're always glad to see new people joining the conversation!

Let's address the outlined points:

Regarding pretrained models: We do not have any pretrained models yet, as the codebase is still a work in progress and some details still need to be tweaked.
As for any "reproducible environment": we do not have a standard docker image. Since the code is provided as a package, most of the dependencies needed by its components are installed automatically when the package is installed. For some initial tests, there's a notebook that installs the necessary dependencies at the beginning, which you could use as a starting point: https://github.com/lucidrains/alphafold2/blob/main/notebooks/egnn_esm_end2end.ipynb
Regarding private branches: the development of the main project is carried out in public, as far as I know, and everyone is encouraged to submit changes in the form of pull requests.
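
To make the environment point above a bit more concrete, here is a minimal sketch of how one might set things up in a fresh Colab or local Python session. It assumes the repository can be installed with pip directly from its GitHub URL (the answer only says the code is provided as a package, so treat this as an assumption), and the helper names `pip_install_command` and `missing_modules` are hypothetical, not part of the project. The notebook linked above remains the authoritative starting point.

```python
import importlib.util
import sys

# Repository URL taken from the notebook link in this thread.
REPO_URL = "https://github.com/lucidrains/alphafold2"


def pip_install_command(repo_url: str) -> list[str]:
    """Build a pip command that installs a package straight from a git repo.

    Assumption: the repo ships packaging metadata, so pip can install it
    and pull in its declared dependencies automatically.
    """
    return [sys.executable, "-m", "pip", "install", f"git+{repo_url}"]


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is installed by accident.
    print(" ".join(pip_install_command(REPO_URL)))
    # "torch" is only an illustrative guess at a dependency worth checking.
    print("missing:", missing_modules(["torch"]))
```

To actually run the install, the command list can be passed to `subprocess.run(..., check=True)`; in a notebook, the printed line can be pasted after a `!`.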

We discuss the project's development on the EleutherAI Discord (#alphafold channel), which you can join here: https://discord.com/invite/vtRgjbM