kiharalab/DOVE

Ligand channel slicing does not take into account the atom type

Closed this issue · 1 comments

Issue description

for i,item in enumerate(llist):
    xo=int((float(item[0])-xmean)/divide)
    yo=int((float(item[1])-ymean)/divide)
    zo=int((float(item[2])-zmean)/divide)
    Status=True
    atom_type=llist[i]
    if xo<=-half_cut or xo>=half_cut or yo<=-half_cut or yo>=half_cut or zo<=-half_cut or zo>=half_cut:
        Status=False
    if Status:
        count_use+=1
        if atom_type=='C' or atom_type=='CA':
            tempinput[xo+half_cut,yo+half_cut,zo+half_cut,0,0]=tempinput[xo+half_cut,yo+half_cut,zo+half_cut,0,0]+1
            tempinput[-xo+half_cut,-yo+half_cut,zo+half_cut,0,1]=tempinput[-xo+half_cut,-yo+half_cut,zo+half_cut,0,1]+1
            tempinput[-yo+half_cut,xo+half_cut,zo+half_cut,0,2]=tempinput[-yo+half_cut,xo+half_cut,zo+half_cut,0,2]+1
            tempinput[yo+half_cut,-xo+half_cut,zo+half_cut,0,3]=tempinput[yo+half_cut,-xo+half_cut,zo+half_cut,0,3]+1
            count_C+=1

The assignment on Line 355,

atom_type=llist[i]

results in a numpy.ndarray (the atom's coordinates) being stored in atom_type, so the == comparisons with 'C', 'CA', 'N', and 'O' always evaluate to False (see the FutureWarning in the output below). As a result, the slicing logic always falls through to the else clause on Lines 378-384.

Adding print(f"Atom type = {atom_type}") after this line and running on Web/Example/Correct.pdb gives:

waiting dealing1
     1  888_goap.pdb                      -59665.59    -31894.54   -27771.05
     1  complex.888.pdb                   -59665.59    -31894.54   -27771.05
in total, we have 210 residues in receptor, 122 residues in ligand
in the interface 10A cut off, we have 63 residue, 550 atoms in the receptor
in the interface 10A cut off, we have 42 residue, 354 atoms in the ligand
after processing, we only remained 550 atoms in receptor, 354 atoms in ligand
271 atoms actually used in this receptor
Atom type = [37.801 -9.438 -1.106]
...DOVE/data_processing/prepare_input.py:361: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  if atom_type=='C' or atom_type=='CA':
...DOVE/data_processing/prepare_input.py:367: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  elif atom_type=='N':
...DOVE/data_processing/prepare_input.py:373: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  elif atom_type=='O':
Atom type = [ 3.7409e+01 -8.5740e+00 -4.0000e-03]
Atom type = [37.88  -9.18   1.306]
Atom type = [37.162 -9.808  2.091]
Atom type = [35.89  -8.471  0.066]
Atom type = [35.371 -7.599  1.205]
Atom type = [36.211 -6.872  1.765]
Atom type = [34.158 -7.598  1.429]
Atom type = [35.294 -8.933  9.703]
Atom type = [33.988 -8.302  9.876]
Atom type = [33.293 -8.917 11.063]
Atom type = [ 33.264 -10.107  11.262]
Atom type = [33.2   -8.519  8.58 ]
Atom type = [31.93  -7.697  8.538]
Atom type = [31.191 -7.927  7.214]
Atom type = [32.057 -7.52   6.136]
Atom type = [32.325 -7.919  4.888]
Atom type = [31.715 -8.907  4.291]
Atom type = [33.239 -7.247  4.178]
...
*****Please contact me for details: wang3702@purdue.edu*****
['Correct.pdb', 0.9226311, 0.9772231, 0.25234342, -1, 0.56493485, -1, -1, -1]

Although the high probability scores match the example in the front-page README, it should be clarified whether this behavior is what the algorithm intends.
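The fall-through described above can be reproduced in isolation. The following is a minimal sketch, not the code from prepare_input.py: plain Python lists stand in for the numpy arrays, the coordinates and atom names are the first three entries from the printed output in this issue, and the `branch` helper with its return labels is hypothetical, introduced only for illustration.

```python
# Stand-ins for the two parallel lists (values copied from the log output in this issue).
llist = [[37.801, -9.438, -1.106], [37.409, -8.574, -0.004], [37.88, -9.18, 1.306]]
llist2 = ["N", "CA", "C"]


def branch(atom_type):
    """Hypothetical helper mirroring the if/elif/else chain on atom_type."""
    if atom_type == "C" or atom_type == "CA":
        return "carbon"
    elif atom_type == "N":
        return "nitrogen"
    elif atom_type == "O":
        return "oxygen"
    else:
        return "other"


# Indexing the coordinate list never yields a string, so every atom lands in "other".
buggy = [branch(llist[i]) for i in range(len(llist))]
# Indexing the atom-name list reaches the intended branches.
fixed = [branch(llist2[i]) for i in range(len(llist2))]

print(buggy)  # ['other', 'other', 'other']
print(fixed)  # ['nitrogen', 'carbon', 'carbon']
```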


Suggested issue resolution

I believe that Line 355 should instead be

atom_type=llist2[i]

in which case, the output is:

waiting dealing1
     1  888_goap.pdb                      -59665.59    -31894.54   -27771.05
     1  complex.888.pdb                   -59665.59    -31894.54   -27771.05
in total, we have 210 residues in receptor, 122 residues in ligand
in the interface 10A cut off, we have 63 residue, 550 atoms in the receptor
in the interface 10A cut off, we have 42 residue, 354 atoms in the ligand
after processing, we only remained 550 atoms in receptor, 354 atoms in ligand
271 atoms actually used in this receptor
Atom type = N
Atom type = CA
Atom type = C
Atom type = O
Atom type = CB
Atom type = CG
Atom type = OD1
Atom type = OD2
Atom type = N
Atom type = CA
Atom type = C
Atom type = O
Atom type = CB
Atom type = CG
Atom type = CD
Atom type = NE
Atom type = CZ
Atom type = NH1
...
*****Please contact me for details: wang3702@purdue.edu*****
['Correct.pdb', 0.55048513, 5.8436563e-06, 0.25234342, -1, 0.5078626, -1, -1, -1]

But this leads to much lower probability scores in the Web/Example/Correct.pdb example.
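One possible way to make this kind of index mix-up harder to reintroduce is to iterate over the coordinates and atom names together and to check the type of the name before it is compared. This is only a sketch under assumptions: `pair_atoms` is a hypothetical helper that does not exist in DOVE, and it assumes that llist holds coordinates and llist2 holds the matching atom names in the same order, as the printed output above suggests.

```python
def pair_atoms(coords_list, names_list):
    """Hypothetical helper: pair each atom name with its (x, y, z) coordinates.

    Raises if the lists are misaligned or if a "name" is not a string,
    which would catch a coordinate array being passed as the atom type.
    """
    if len(coords_list) != len(names_list):
        raise ValueError("coordinate and atom-name lists differ in length")
    pairs = []
    for coords, name in zip(coords_list, names_list):
        if not isinstance(name, str):
            raise TypeError(f"atom type must be a string, got {type(name).__name__}")
        x, y, z = (float(v) for v in coords)
        pairs.append((name, (x, y, z)))
    return pairs


# Values copied from the log output in this issue.
llist = [[37.801, -9.438, -1.106], [37.409, -8.574, -0.004]]
llist2 = ["N", "CA"]

for atom_type, (x, y, z) in pair_atoms(llist, llist2):
    print(atom_type, x, y, z)
```

Passing llist for both arguments raises a TypeError instead of silently sending every atom to the else branch.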

Thanks a lot for pointing this out! This is my mistake. It means that the results for 4 channels are put into one channel, which I believe somewhat decreases the performance.