fspoendlin/SPACE2

combining formats of "old school" and newer TAP Sabpred pdb files


Hi,

I tried to run SPACE2 on a mix of older PDB files produced by TAP (SAbPred) and newer ones that I generated yesterday.

When I do that, I notice that the old files label the CDR residues as HETATM records, e.g. in CDR3, just past the cysteine:

10.pdb-ATOM    731  N   CYS H 104       3.589  -1.890   8.468     0     0
10.pdb-ATOM    732  CA  CYS H 104       4.167  -1.373   9.698     0     0
10.pdb-ATOM    733  C   CYS H 104       5.595  -1.891   9.588     0     0
10.pdb-ATOM    734  O   CYS H 104       5.837  -2.926   8.952     0     0
10.pdb-ATOM    735  CB  CYS H 104       3.483  -1.931  10.953     0     0
10.pdb-ATOM    736  SG  CYS H 104       3.557  -3.729  11.138     0     0
10.pdb:HETATM  737  N   VAL H 105       6.566  -1.210  10.079     0     0
10.pdb:HETATM  738  CA  VAL H 105       7.961  -1.631  10.033     0     0
10.pdb:HETATM  739  C   VAL H 105       8.749  -1.058  11.208     0     0
10.pdb:HETATM  740  O   VAL H 105       8.393  -0.013  11.756     0     0
10.pdb:HETATM  741  CB  VAL H 105       8.596  -1.228   8.711     0     0
10.pdb:HETATM  742  CG1 VAL H 105       8.598   0.288   8.550     0     0
10.pdb:HETATM  743  CG2 VAL H 105      10.013  -1.778   8.609     0     0
10.pdb:HETATM  744  N   ARG H 106       9.822  -1.745  11.588  1.00 1.000
10.pdb:HETATM  745  CA  ARG H 106      10.620  -1.329  12.734  1.00 1.000

The newer PDB files have no HETATM records, and they include an OXT atom at the end of each chain:

tap.pdb:ATOM    934  OXT SER H 128      21.666   6.063 -19.008  1.00  0.48           O
tap.pdb:ATOM   1808  OXT LYS L 127       2.989 -28.439  -4.346  1.00  0.68           O

Is the HETATM CDR-labelling picked up by the SPACE2 code? Can old and new files be combined?

Thanks in advance.

SPACE2 only parses two columns: the atom name (all lines with atom name CA are selected) and the chain ID (chains labelled H for the heavy chain and L for the light chain are selected). As long as these columns are correct in both your old and new files, the code will run correctly.

CDR residues labelled HETATM are not an issue; they are still picked up. OXT atoms do not affect the clustering, because only C-alpha atoms are used.
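
To check your own files, the selection described above can be sketched in a few lines of Python. This is not SPACE2's actual parsing code, only a minimal illustration that assumes standard fixed-width PDB columns (atom name in columns 13-16, chain ID in column 22). The function name `select_ca_atoms` is hypothetical, and the sample lines are copied from the excerpts in the question.

```python
def select_ca_atoms(pdb_lines, chains=("H", "L")):
    """Return (chain, residue number, (x, y, z)) for each C-alpha atom.

    Hypothetical helper, not part of SPACE2. Both ATOM and HETATM
    records are accepted, so the record type does not affect the result.
    """
    selected = []
    for line in pdb_lines:
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        atom_name = line[12:16].strip()  # columns 13-16
        chain_id = line[21:22]           # column 22
        if atom_name != "CA" or chain_id not in chains:
            continue
        res_num = int(line[22:26])       # columns 23-26
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        selected.append((chain_id, res_num, xyz))
    return selected


# Lines in the old style (CDR residues written as HETATM).
old_style = [
    "ATOM    732  CA  CYS H 104       4.167  -1.373   9.698     0     0",
    "ATOM    733  C   CYS H 104       5.595  -1.891   9.588     0     0",
    "HETATM  738  CA  VAL H 105       7.961  -1.631  10.033     0     0",
    "HETATM  745  CA  ARG H 106      10.620  -1.329  12.734  1.00 1.000",
]

# A line in the new style (terminal OXT atom).
new_style = [
    "ATOM    934  OXT SER H 128      21.666   6.063 -19.008  1.00  0.48           O",
]

for chain, res_num, xyz in select_ca_atoms(old_style):
    print(chain, res_num, xyz)
```

In this sketch the HETATM residues 105 and 106 are returned alongside residue 104, while the OXT line is skipped because its atom name is not CA.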