lucidrains/alphafold2

How are datasets loaded? And how is GDT calculated?

cuge1995 opened this issue · 3 comments

How are datasets loaded? And how is GDT calculated?

Hi Jinlai! This isn't the official repository, so I'm welcoming discussion and contributors :) For the dataset, we can go with ProteinNet: https://github.com/aqlaboratory/proteinnet

As for GDT: it's not a single fixed metric. It stands for Global Distance Test, so it will give different values depending on the distance cutoffs you choose. The conventional approach is to pick either a set of cutoffs that captures global alignment or a tighter set that measures only small deviations:
https://predictioncenter.org/casp14/doc/help.html (GDT_TS or GDT_HA here).
I'll post a Python script to calculate those in a few hours, but calculating them from 3D coordinates shouldn't be hard.
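
In the meantime, here is a rough sketch of the idea, not the script mentioned above and not the code from this repository. It assumes the predicted and reference structures are already superimposed and that each is given as one C-alpha coordinate per residue, in angstroms. The function name `gdt` and the constant names are made up for illustration. The cutoffs are the usual ones: 1, 2, 4, 8 Å for GDT_TS and 0.5, 1, 2, 4 Å for GDT_HA. Note that the CASP evaluation searches over many superpositions to maximize the number of residues under each cutoff, so a single fixed alignment like this one can only give a lower bound on the official score.

```python
import math

# Conventional distance cutoffs, in angstroms.
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)   # "total score": global alignment
GDT_HA_CUTOFFS = (0.5, 1.0, 2.0, 4.0)   # "high accuracy": small deviations


def gdt(pred, true, cutoffs=GDT_TS_CUTOFFS):
    """Average percentage of residues within each cutoff of the reference.

    pred, true: sequences of (x, y, z) C-alpha coordinates, one per residue,
    assumed to be already superimposed (hypothetical input format).
    """
    if len(pred) != len(true) or len(pred) == 0:
        raise ValueError("pred and true must be non-empty and equal in length")
    # Per-residue Euclidean distance between prediction and reference.
    dists = [math.dist(p, t) for p, t in zip(pred, true)]
    # Fraction of residues at or under each cutoff.
    fractions = [sum(d <= c for d in dists) / len(dists) for c in cutoffs]
    return 100.0 * sum(fractions) / len(fractions)


# Toy example: four residues displaced by 0.3, 1.5, 3.0 and 10.0 angstroms.
true = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
pred = [(0.0, 0.3, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

print(gdt(pred, true, GDT_TS_CUTOFFS))  # 56.25
print(gdt(pred, true, GDT_HA_CUTOFFS))  # 43.75
```

For real structures you would first superimpose the prediction onto the reference (for example with the Kabsch algorithm) and skip residues that are missing from the experimental structure.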

@cuge1995 check the new functions added in #5 for calculating the metrics from 3D coordinates.