Does partialsmiles aim to become a checkcif analogue?
nbehrnd opened this issue · 1 comments
I have been, and still am (example_1/example_2), searching for a well-curated consistency checker for SMILES that reports whether a SMILES string is problematic and, if so, why. I am aware that programs like OpenBabel issue a warning when necessary, e.g.,
$ obabel -:"CC(O/C(\CCC[C@H]([C@H](C(CC=C)=C)OC(O)=C)O)=C(\COC)/OCO)=O" -ocan
==============================
*** Open Babel Warning in CreateCisTrans
Error in cis/trans stereochemistry specified for the double bond
COCC(=C(OC(=O)C)CCC[C@H]([C@H](C(=C)CC=C)OC(=C)O)O)OCO
1 molecule converted
though the sequence of error and warning messages may require additional attention from the user (example).
Platon offers a large set of tests for crystallographic data, prominently available e.g. through IUCr's checkcif service. The errors and inconsistencies it identifies are grouped by severity, and each is labeled with an error code and a brief explanation. Are there plans to establish a similar, easy-to-use and recognizable web site (tentative name «checksmiles.org») where one may drop a list of SMILES to identify and remove entries that do not meet the criteria of your checks?
For some SMILES strings, a consistency check may be performed with the Python-based partialsmiles, either by embedding its parser in a try/except clause to catch the errors it raises, or by running the enclosed validate.py.
At present, the SMILES above passes these tests, although OpenBabel issues a warning (issue deposit).
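To illustrate the try/except approach mentioned above, here is a minimal sketch of the kind of report I have in mind. It assumes that partialsmiles exposes `ParseSmiles` together with the exception classes `SMILESSyntaxError`, `ValenceError`, and `KekulizationFailure`, as described in its README. The helper names `check_smiles` and `check_with_partialsmiles` are hypothetical and only serve the example. This is not the content of validate.py.

```python
from typing import Callable, Iterable, List, Tuple, Type

Report = List[Tuple[str, str, str]]  # (SMILES, error class name, message)


def check_smiles(
    smiles_list: Iterable[str],
    parse: Callable[[str], object],
    errors: Tuple[Type[BaseException], ...] = (Exception,),
) -> Report:
    """Run `parse` on every SMILES and collect the ones it rejects.

    Only exceptions listed in `errors` are treated as findings;
    anything else propagates so real bugs are not hidden.
    """
    problems: Report = []
    for smi in smiles_list:
        try:
            parse(smi)
        except errors as exc:
            # Record which kind of error was raised and why.
            problems.append((smi, type(exc).__name__, str(exc)))
    return problems


def check_with_partialsmiles(smiles_list: Iterable[str]) -> Report:
    """Hypothetical wrapper; assumes the partialsmiles API named in the lead-in."""
    import partialsmiles as ps  # third-party, imported lazily

    return check_smiles(
        smiles_list,
        parse=lambda smi: ps.ParseSmiles(smi, partial=False),
        errors=(ps.SMILESSyntaxError, ps.ValenceError, ps.KekulizationFailure),
    )
```

Calling `check_with_partialsmiles(["CCO", "CC("])` should then return one entry per rejected string, with the exception class name playing the role of a coarse error code. Since the parser is passed in as a callable, the same loop could wrap another toolkit's parser for comparison.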