peteboyd/lammps_interface

Improve CIF parser to accept symmetric MOFs.

Opened this issue · 7 comments

I think the CIF parser can be improved to accept non-P1 MOFs. ASE Python library has a pretty good CIF parser so that can be used to read MOFs or even other crystals. Moreover, it can read many file formats, allowing other file formats to be accepted. I would be willing to work on this.

Has there been any work on this? This would also fix #1

I don't think so - right @peteboyd ?

This is something that would be quite simple to add, in particular once we have the first basic continuous integration tests in place (which would guard against unintended consequences of the shift)

@peteboyd It turns out Henglu just stumbled upon this as well.

Would you be open to adding a dependency on ASE to get a fully fledged CIF parser?

I see that you already thought about this in

def ase_from_CIF(cifname):
'''
Try to read the cif file using the ase environment. They have considerations for space groups.
We don't.
'''
from ase.io import read
ase_cif = read(cifname)
mg = MolecularGraph(name=clean(cifname))
for atom in ase_cif:
print(atom)

Is there more to do than make ase_from_CIF return the MolecularGraph and the cell like from_CIF does?
Does lammps-interface use aspects of the CIF file other than elements and positions of the atoms?

If it's just the above, I think it would be quite straightforward to do.

The only thing that would require a little attention is when the code reads in bonding information from the CIF file. I'm not sure if that part of the code is still relevant, but it would have to be investigated before proceeding with the ASE implementation.

Reading the bonding information from CIF is absolutely useful when we have structures generated using one of the topology-based structure generators.
I suggest switching to ASE for the general cases and adding a new argument input when one needs to read the bonding topology from CIF (keeping the current pieces of the code).
This has an extra advantage as ASE periodic distance computation is much faster than the one in lammps interface --> it allows dealing with larger MOFs.

Reading the bonding information from CIF is absolutely useful when we have structures generated using one of the topology-based structure generators.
I suggest switching to ASE for the general cases and adding a new argument input when one needs to read the bonding topology from CIF (keeping the current pieces of the code).
This has an extra advantage as ASE periodic distance computation is much faster than the one in lammps interface --> it allows dealing with larger MOFs.

Thanks - good point about the weird bonding cases that come from crystal building programs. Bonding can be dealt with using symmetry operations, we just have to be careful about it. I recall the notation in the CIF format is a bit strange.