project-gemmi/gemmi

pdbx_PDB_model_num type & content

rimmartin opened this issue · 2 comments

Hi @wojdyr and everybody,

pdbx_PDB_model_num is defined as an int, [+-]?[0-9]+:
https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_atom_site.pdbx_PDB_model_num.html
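To make the constraint concrete, here is a minimal sketch of a check for that int pattern (optional sign, then one or more digits). The function name `is_mmcif_int` is made up for illustration and is not part of gemmi.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not a gemmi function.
// Returns true if `s` matches [+-]?[0-9]+ : an optional sign
// followed by at least one decimal digit and nothing else.
bool is_mmcif_int(const std::string& s) {
  std::size_t i = 0;
  if (!s.empty() && (s[0] == '+' || s[0] == '-'))
    i = 1;  // skip the optional sign
  if (i == s.size())
    return false;  // empty string, or a sign with no digits
  for (; i < s.size(); ++i)
    if (!std::isdigit(static_cast<unsigned char>(s[i])))
      return false;  // any non-digit character disqualifies the value
  return true;
}
```

With this check, "1" and "-12" pass, while "4o9s" and an empty string do not.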

When an mmCIF file is written out, pdbx_PDB_model_num is set to a string, quoted or unquoted depending on the characters it contains. When the file is read back in, pdbx_PDB_model_num is not an int unless the first character is a digit, as in 4o9s. When the value is not numeric, the resulting structure is messed up, though I haven't fully traced why; some hydrogen coordinates are missing or unreachable.

https://github.com/project-gemmi/gemmi/blob/master/src/to_mmcif.cpp#L85 sets the pdbx_PDB_model_num column, and https://github.com/project-gemmi/gemmi/blob/master/src/to_mmcif.cpp#L151 writes the value:

   vv.emplace_back(string_or_qmark(model.name));

Shouldn't this be a model number, i.e. an ordinal counting up through the models in one CIF file?

Yes, pdbx_PDB_model_num must be a number; it's a mistake that it's stored as the string Model::name.
I've been aware of it for some time. It should be changed to, say, int Model::num. Unfortunately, that would break some third-party programs, so I'm holding this change for a "major" new release, perhaps 0.7.0.

For now, don't set model.name to a string that isn't an integer.
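One possible way to follow that advice before writing a file is to overwrite every model name with its 1-based position. The sketch below uses a stand-in `Model` struct with only a `name` field rather than gemmi's real class, and `number_models_sequentially` is a hypothetical name; adapt it to however your code holds the models.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for gemmi's Model; only the string `name`
// mentioned in this thread is modelled here.
struct Model {
  std::string name;
};

// Replace each model's name with its 1-based ordinal as decimal digits,
// so every name is a valid integer when written to pdbx_PDB_model_num.
void number_models_sequentially(std::vector<Model>& models) {
  for (std::size_t i = 0; i < models.size(); ++i)
    models[i].name = std::to_string(i + 1);
}
```

This discards the original names, so keep a copy elsewhere if they carry meaning.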

Ah, thanks @wojdyr, I'll work around it until the third parties are ready for the change.