prody/ProDy

float division by zero exception raised if using prody_align with mmCIF files


Hi,

I'm trying to use the prody_align function to align two selections. With mmCIF files as input, the calcRMSD function raises a float division by zero exception, whereas with PDB files everything works perfectly.

Code snippet to replicate the bug:

import prody
from prody.apps.prody_apps.prody_align import prody_align

# mmCIF files
prody.fetchPDB("2reg", format="cif", compressed=False)
prody.fetchPDB("2rin", format="cif", compressed=False)

# PDB files
prody.fetchPDB("2reg", compressed=False)
prody.fetchPDB("2rin", compressed=False)

selection_string = 'ca sequence "W.*D.*W.*M.*Y.*I.*ND.*E.*W.*H"'
# Works fine
prody_align("2reg.pdb", "2rin.pdb", select=selection_string, model=1, sequid=90, overlap=90, prefix='aligned_')

# float division by zero
prody_align("2reg.cif", "2rin.cif", select=selection_string, model=1, sequid=90, overlap=90, prefix='aligned_')

Please let me know if I'm doing something wrong or if this is actually a bug and, if so, whether you plan to fix it.

Thank you for your effort in providing such a broad package!

I’d imagine the problem is that, with the default way of parsing, the chains in the cif file differ from those in the pdb file.

We have an option unite_chains in parsePDB and parseMMCIF that changes the default behaviour if you set it to True. You can then use alignChains on the resulting AtomGroup objects instead of prody_align, and everything should hopefully work.
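Something along these lines might serve as a starting point. It is only a sketch and has not been run against 2reg/2rin. It uses matchChains plus calcTransformation rather than alignChains, because matchChains returns a list of (mobile map, target map, seqid, overlap) tuples that is easy to check for an empty result before any RMSD is computed. The names best_match and align_united are made up for illustration and are not part of ProDy, and the user's selection string is left out for brevity.

```python
def best_match(matches):
    """Return the chain match with the highest sequence identity.

    `matches` is expected to look like the output of prody.matchChains:
    a list of (mobile_map, target_map, seqid, overlap) tuples.
    Raises ValueError instead of letting an empty result reach calcRMSD.
    """
    if not matches:
        raise ValueError(
            "no chains could be matched; check chain IDs and thresholds"
        )
    return max(matches, key=lambda m: m[2])


def align_united(mobile_file, target_file, seqid=90, overlap=90):
    """Hypothetical helper: superpose two mmCIF structures parsed with
    unite_chains=True, returning the moved structure and the RMSD."""
    # Imported here so best_match can be used without ProDy installed.
    import prody

    mobile = prody.parseMMCIF(mobile_file, unite_chains=True)
    target = prody.parseMMCIF(target_file, unite_chains=True)

    # matchChains compares C-alpha atoms by default.
    matches = prody.matchChains(mobile, target, seqid=seqid, overlap=overlap)
    mobile_map, target_map, _, _ = best_match(matches)

    # Fit on the matched atoms, then move the whole mobile structure.
    prody.calcTransformation(mobile_map, target_map).apply(mobile)
    return mobile, prody.calcRMSD(mobile_map, target_map)
```

Calling align_united("2reg.cif", "2rin.cif") would then either return the superposed structure with its RMSD or fail with a readable error if no chains pair up, which should make it easier to tell whether the chain IDs are the culprit.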

We should also modify prody_align to accept the unite_chains option to pass on to parsePDB and parseMMCIF.

Thanks for letting us know.