wengong-jin/RefineGNN

What kind of input data does this take?

sgbaird opened this issue · 3 comments

I'm not as familiar with antibody data. Is it via a molecular representation? Something else?

It's usually represented in PDB format, which contains a list of 3D coordinates. Here is one example of a CDR predicted by our model:

ATOM    924  CA  GLY H 25     191.794 190.047 206.647  1.00  4.89           C
ATOM    924  CA  TYR H 26     192.040 192.706 203.667  1.00  4.89           C
ATOM    924  CA  THR H 27     192.460 191.833 199.930  1.00  4.89           C
ATOM    924  CA  LEU H 28     195.918 191.006 199.066  1.00  4.89           C
ATOM    924  CA  THR H 29     195.885 192.857 195.298  1.00  4.89           C
ATOM    924  CA  ASP H 30     195.098 196.273 197.270  1.00  4.89           C
ATOM    924  CA  PHE H 31     198.497 196.488 199.171  1.00  4.89           C
ATOM    924  CA  TYR H 32     202.330 195.512 198.278  1.00  4.89           C
ATOM    924  CA  LEU H 50     206.210 192.332 195.568  1.00  4.89           C
ATOM    924  CA  ASN H 51     202.758 193.628 193.856  1.00  4.89           C
ATOM    924  CA  PRO H 52     200.431 190.566 194.922  1.00  4.89           C
ATOM    924  CA  HIS H 53     198.004 191.766 191.911  1.00  4.89           C
ATOM    924  CA  SER H 54     200.610 191.288 188.781  1.00  4.89           C
ATOM    924  CA  GLY H 55     203.772 189.334 190.801  1.00  4.89           C
ATOM    924  CA  GLY H 56     205.889 192.496 189.880  1.00  4.89           C
ATOM    924  CA  THR H 57     208.906 192.607 192.483  1.00  4.89           C
ATOM    924  CA  VAL H 96     205.195 193.898 206.108  1.00  4.89           C
ATOM    924  CA  ARG H 97     202.119 195.293 204.433  1.00  4.89           C
ATOM    924  CA  SER H 98     202.606 198.625 202.750  1.00  4.89           C
ATOM    924  CA  ASP H 99     199.382 200.759 203.392  1.00  4.89           C
ATOM    924  CA  GLN H 100     199.649 201.402 199.666  1.00  4.89           C
ATOM    924  CA  GLU H 101     196.964 202.615 197.136  1.00  4.89           C
ATOM    924  CA  ALA H 102     200.303 204.610 195.979  1.00  4.89           C
ATOM    924  CA  LEU H 103     201.110 206.516 199.309  1.00  4.89           C
ATOM    924  CA  ARG H 104     205.053 205.513 199.777  1.00  4.89           C
ATOM    924  CA  GLY H 105     203.117 204.768 202.111  1.00  4.89           C
ATOM    924  CA  ALA H 106     204.642 203.016 205.018  1.00  4.89           C
ATOM    924  CA  PHE H 107     204.173 199.593 206.507  1.00  4.89           C
ATOM    924  CA  ASP H 108     200.822 199.696 208.539  1.00  4.89           C
ATOM    924  CA  ILE H 109     200.983 196.126 209.815  1.00  4.89           C
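
For readers new to PDB records, here is a minimal sketch of how lines like the ones above could be read in Python. It is not code from the RefineGNN repository: `CaAtom` and `parse_ca_lines` are hypothetical names used only for illustration. The listing as pasted does not seem to line up exactly with the fixed columns of the PDB format, so the sketch splits on whitespace, which assumes every field is present and separated by spaces. For real PDB files a dedicated parser (for example Biopython's `PDBParser`) is likely the safer choice.

```python
from typing import List, NamedTuple, Tuple


class CaAtom(NamedTuple):
    """One alpha-carbon record (hypothetical helper type)."""
    resname: str                      # three-letter residue name, e.g. "GLY"
    chain: str                        # chain identifier, e.g. "H"
    resnum: int                       # residue number
    xyz: Tuple[float, float, float]   # coordinates as given in the file


def parse_ca_lines(text: str) -> List[CaAtom]:
    """Parse whitespace-separated ATOM lines and keep only CA atoms."""
    atoms = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed field order:
        # ATOM serial atom_name resname chain resnum x y z occupancy bfactor element
        if len(fields) < 9 or fields[0] != "ATOM" or fields[2] != "CA":
            continue
        x, y, z = (float(v) for v in fields[6:9])
        atoms.append(CaAtom(fields[3], fields[4], int(fields[5]), (x, y, z)))
    return atoms


# Three lines copied from the listing above.
example = """\
ATOM    924  CA  GLY H 25     191.794 190.047 206.647  1.00  4.89           C
ATOM    924  CA  TYR H 26     192.040 192.706 203.667  1.00  4.89           C
ATOM    924  CA  GLN H 100     199.649 201.402 199.666  1.00  4.89           C
"""

atoms = parse_ca_lines(example)
for atom in atoms:
    print(atom.resname, atom.chain, atom.resnum, atom.xyz)
```

Each record then carries the residue type, the chain, the residue number, and one 3D point, which is the "list of 3D coordinates" mentioned above.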

@wengong-jin thank you!

I'm not familiar with protein structures. How can I extract the information needed for the .json file (seq, cdr, and coords) from a PDB file like this?
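
One possible direction, sketched under assumptions rather than taken from the repository: the thread only names the fields seq, cdr, and coords, so the layout below is a guess. It assumes seq is a string of one-letter amino-acid codes, cdr is a per-residue label string ("0" for framework, "1" to "3" for the CDRs), and coords holds one CA position per residue. The repository's real files may differ (for instance, they may expect more backbone atoms than CA), so it is worth comparing against an existing data entry. `build_entry` and `cdr_ranges` are hypothetical names, and the ranges used here simply mirror the three residue-number runs in the listing above.

```python
import json

# Standard three-letter to one-letter amino-acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def build_entry(residues, cdr_ranges):
    """Build a dict with seq, cdr and coords (assumed layout, see above).

    residues:   list of (three_letter_name, residue_number, (x, y, z))
    cdr_ranges: list of inclusive (start, end) residue-number ranges,
                one per CDR, in order (illustrative values only)
    """
    # Unknown residue names become "X".
    seq = "".join(THREE_TO_ONE.get(name, "X") for name, _, _ in residues)

    labels = []
    for _, resnum, _ in residues:
        label = "0"  # framework by default
        for index, (start, end) in enumerate(cdr_ranges, start=1):
            if start <= resnum <= end:
                label = str(index)
        labels.append(label)

    coords = [list(xyz) for _, _, xyz in residues]
    return {"seq": seq, "cdr": "".join(labels), "coords": coords}


# Three residues copied from the listing in this thread.
residues = [
    ("GLY", 25, (191.794, 190.047, 206.647)),
    ("LEU", 50, (206.210, 192.332, 195.568)),
    ("VAL", 96, (205.195, 193.898, 206.108)),
]

entry = build_entry(residues, cdr_ranges=[(25, 32), (50, 57), (96, 109)])
print(json.dumps(entry))
```

In practice the CDR ranges depend on the antibody numbering scheme used for the file, so they should come from that scheme rather than from hard-coded values like these.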