openforcefield/openff-toolkit

Atom names lost when adding `_SimpleMolecule`s to a topology

mattwthompson opened this issue · 2 comments

Describe the bug

Atom names (_SimpleAtom.name), and likely other values are, are lost when adding a _SimpleMolecule to a topology because this is lost in the round-trip between its dict representation.

To Reproduce

Take for granted that I have a _SimpleMolecule in memory with non-default atom names.

ipdb> topology.n_atoms
0
ipdb> molecule.atom(0).name
'CH3'
ipdb> topology.add_molecule(molecule)
1
ipdb> topology.atom(0).name
''
ipdb> import copy
ipdb> copy.deepcopy(molecule).atom(0).name
''
ipdb> molecule.to_dict()["atoms"][-1]
{'metadata': {'residue_name': 'NME', 'residue_number': '526', 'insertion_code': ' ', 'chain_id': 'A'}, 'atomic_number': 1}
ipdb> 'name' in molecule.to_dict()["atoms"][-1]
False

Output

Computing environment (please complete the following information):

  • Operating system
  • Output of running conda list

Additional context

Under the hood, Topology.add_molecule calls _SimpleMolecule.__deepcopy__, which goes through the dict representation. There's probably more that's lost.

I'll take a stab at this tomorrow.

This seems like a clear bug and the fix is likely unambiguous. I'll be offline tomorrow, Monday, and Tuesday, so feel free to self-merge and cut a Toolkit release without waiting for my approval!