molecularsets/moses

How to generate compounds from a single SMILES string?

abhik1368 opened this issue · 7 comments

Can you help with generating compounds from a single SMILES file?

@abhik1368, I'm not sure that I understand what "generation from a single smiles files" means. Could you please elaborate on this?

If I want to use a SMILES string, how can I generate compounds? Which functions do the encoding and decoding?

If I understood you correctly, you want to manually call the encode/decode functions of a trained model? For example, in AAE you can run model.encoder(x) and model.decoder(x), respectively. Here, x is a one-hot encoding of a SMILES string. You can get it using the collate function: collate = aae_trainer.get_collate_fn(aae). Then you can get x using collate(list_of_smiles). In general, all the code for doing this is similar to the training loop of the AAE model.
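
To make "one-hot encoding of a SMILES string" concrete, here is a minimal standalone sketch. It does not use the MOSES API: build_vocab, one_hot and decode_one_hot are hypothetical helper names, and the character-level vocabulary is an assumption made for illustration (it splits multi-character atoms such as Cl or Br, and the real collate function may tokenize and pad differently).

```python
def build_vocab(smiles_list):
    """Map every character seen in the SMILES strings to an integer index.

    Hypothetical helper: character-level tokens only, no padding or
    begin/end-of-sequence symbols.
    """
    chars = sorted({ch for smi in smiles_list for ch in smi})
    return {ch: i for i, ch in enumerate(chars)}


def one_hot(smiles, vocab):
    """Return a len(smiles) x len(vocab) matrix with exactly one 1 per row."""
    matrix = []
    for ch in smiles:
        row = [0] * len(vocab)
        row[vocab[ch]] = 1
        matrix.append(row)
    return matrix


def decode_one_hot(matrix, vocab):
    """Invert one_hot: read the position of the 1 in each row back to a character."""
    inverse = {i: ch for ch, i in vocab.items()}
    return "".join(inverse[row.index(1)] for row in matrix)


smiles_list = ["CCO", "c1ccccc1"]
vocab = build_vocab(smiles_list)
x = one_hot("CCO", vocab)
```

Each row of x corresponds to one character of the input, so decode_one_hot(x, vocab) recovers the original string. A trained model would consume a tensor built along these lines rather than nested Python lists.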

OK, I understand, but how do I generate a string from the latent space with a specified standard deviation/noise, just like in chemvae?

I think the function you are looking for is https://github.com/molecularsets/moses/blob/master/moses/aae/model.py#L138. It samples molecules in SMILES format. Currently, there is no way to directly specify the exact latent code for sampling.
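
For reference, the noise step itself is simple to write down, even though the sampling function does not accept a latent code. The sketch below is not part of MOSES: perturb_latent is a hypothetical helper, the 8-dimensional z is a placeholder, and passing the perturbed vectors to a decoder would require your own code around the trained model.

```python
import random


def perturb_latent(z, std, n_samples, seed=None):
    """Draw n_samples points around latent vector z with isotropic Gaussian noise.

    Each coordinate gets independent noise with mean 0 and standard
    deviation std, i.e. z' = z + std * eps with eps ~ N(0, 1).
    """
    rng = random.Random(seed)
    return [[zi + rng.gauss(0.0, std) for zi in z] for _ in range(n_samples)]


# Placeholder latent code; in practice this would come from encoding a molecule.
z = [0.0] * 8
neighbours = perturb_latent(z, std=0.1, n_samples=5, seed=0)
```

A larger std moves the samples further from z, which in a latent-space model usually means decoded molecules that are less similar to the starting one.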

@abhik1368
I have the same question as you. Did you solve this problem? If so, would you mind sharing the solution? Many thanks.

nope