baoilleach/partialsmiles

Does partialsmiles aim to become a checkcif analogue?

nbehrnd opened this issue · 1 comments

I was, and still am (example_1/example_2), searching for a well-curated consistency checker for SMILES that reports whether a SMILES string is problematic and, if so, why. At the same time, I'm aware that programs like OpenBabel issue a warning when necessary, e.g.,

$ obabel -:"CC(O/C(\CCC[C@H]([C@H](C(CC=C)=C)OC(O)=C)O)=C(\COC)/OCO)=O" -ocan
==============================
*** Open Babel Warning  in CreateCisTrans
  Error in cis/trans stereochemistry specified for the double bond

COCC(=C(OC(=O)C)CCC[C@H]([C@H](C(=C)CC=C)OC(=C)O)O)OCO	
1 molecule converted

though the sequence of error and warning messages may require additional attention from the user (example).

Within Platon, a large set of tests for crystallographic data is prominently available, e.g., in IUCr's checkcif service. The errors and inconsistencies it identifies are grouped by severity, each labeled with an error code and a brief explanation. Are there plans to establish a similar, easy-to-use and recognizable web site (tentative name «checksmiles.org») where one may drop a list of SMILES to identify and remove entries that do not meet the criteria of your checks?

For some SMILES strings, a consistency check may be performed with the Python-based partialsmiles, either by embedding its error tests in a try/except clause or by using the enclosed validate.py.
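To illustrate the try/except route, here is a minimal sketch of how a list of SMILES might be screened. The helper `check_smiles` and its report format are hypothetical names of mine, not part of partialsmiles. The demo at the bottom assumes that `partialsmiles.ParseSmiles(smiles, partial=False)` raises an exception for a string it rejects; the helper catches `Exception` broadly so that it does not depend on the library's specific exception classes.

```python
from typing import Callable, Iterable, List, Tuple


def check_smiles(
    smiles_list: Iterable[str],
    parse: Callable[[str], object],
) -> List[Tuple[str, bool, str]]:
    """Run `parse` on each SMILES and record whether it raised.

    Returns one (smiles, ok, message) tuple per input string.
    """
    report = []
    for smi in smiles_list:
        try:
            parse(smi)
        except Exception as exc:
            # Broad on purpose: no parser-specific exception class is assumed.
            report.append((smi, False, f"{type(exc).__name__}: {exc}"))
        else:
            report.append((smi, True, ""))
    return report


if __name__ == "__main__":
    try:
        import partialsmiles as ps
    except ImportError:
        ps = None

    if ps is not None:
        # Assumption: with partial=False the string is parsed as a complete SMILES.
        def parse(smi: str) -> object:
            return ps.ParseSmiles(smi, partial=False)

        # Raw string, because the SMILES from this issue contains backslashes.
        entries = [
            r"CC(O/C(\CCC[C@H]([C@H](C(CC=C)=C)OC(O)=C)O)=C(\COC)/OCO)=O",
        ]
        for smi, ok, message in check_smiles(entries, parse):
            print("pass" if ok else "FAIL", smi, message)
```

Passing the parser in as a callable keeps the loop independent of any one toolkit, so the same list could be screened with a second parser for comparison.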

At present, the above SMILES passes the tests, though OpenBabel issues a warning (issue deposit).