No stereoisomers included in the dataset dataset_v1.csv
Closed this issue · 2 comments
zhengmin317 commented
The dataset dataset_v1.csv does not contain any of the characters "/", "\" or "@" (the SMILES markers for stereoisomers).
Why are stereoisomers not included in the dataset?
danpol commented
The current version of the dataset does indeed store non-isomeric SMILES. It seems like a good idea to construct a dataset_v1_isomeric.csv for additional experiments with isomeric SMILES. We'll add it soon, but for now you can run the prepare_dataset.py script after changing this line: https://github.com/molecularsets/moses/blob/master/scripts/prepare_dataset.py#L42 to isomericSmiles=True
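If it helps, here is a minimal sketch for checking whether a regenerated CSV actually carries stereo information, using the same test described above (looking for "/", "\" or "@" in each SMILES string). It relies only on the Python standard library. The helper names `has_stereo` and `count_stereo` are hypothetical, and the `SMILES` column name and the inline sample rows are assumptions for illustration, not taken from the real file.

```python
import csv
import io

# Characters SMILES uses for stereochemistry:
# "/" and "\" for double-bond geometry, "@" for tetrahedral chirality.
STEREO_MARKERS = ("/", "\\", "@")


def has_stereo(smiles: str) -> bool:
    """Return True if the SMILES string contains any stereo marker."""
    return any(marker in smiles for marker in STEREO_MARKERS)


def count_stereo(csv_file, column="SMILES"):
    """Count rows whose SMILES carries stereo markers.

    `column` is an assumed header name; adjust it to match your file.
    Returns a (stereo_rows, total_rows) tuple.
    """
    reader = csv.DictReader(csv_file)
    total = 0
    stereo = 0
    for row in reader:
        total += 1
        if has_stereo(row[column]):
            stereo += 1
    return stereo, total


# Made-up sample standing in for a dataset file (not real dataset rows).
sample = io.StringIO(
    "SMILES,SPLIT\n"
    "CC(C)Cc1ccccc1,train\n"
    "C[C@H](N)C(=O)O,train\n"
    "F/C=C/F,test\n"
)
stereo, total = count_stereo(sample)
print(f"{stereo} of {total} SMILES carry stereo markers")
```

To run it on a real file, pass an open file handle instead of the sample, for example `count_stereo(open("dataset_v1_isomeric.csv", newline=""))`. A count of zero would suggest the isomericSmiles change did not take effect.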