About the generation process
Hello, and great job!
1. When generating the 10K molecules in Table 1, Table 2, Table 3, etc.,
should we input some molecules, and if so, are they from ZINC250K or MOSES?
OR
does SGDS generate molecules by sampling from the latent vector,
i.e., the inference process is from
2. I ask because other methods in Table 1, such as LIMO, generate molecules by taking an input molecule from ZINC250K and returning a better molecule with a higher QED.
So their generation process is actually an optimization process. Does SGDS do the same?
Thank you very much!
Thanks for your questions.
- In generation, we use $p(z|y)$, which takes only the value $y$ as input. In the experiment, we initialize $y$ with the values in the 10k test set. However, it should be fine to start from other values of $y$, such as 10k high values from the training set.
- In the optimization experiments, the generated molecules are conditioned only on the property values, i.e. $p(x|y)$. However, in a structure-constrained experiment, we may need to improve the property based on a given molecule backbone, $p(x|y, \tilde{x})$, which is not studied in SGDS.
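
To make the first point concrete, here is a toy sketch of property-conditioned generation. It is not SGDS's actual code: `sample_z_given_y`, `decode`, `generate`, and `VOCAB` are hypothetical stand-ins, the "posterior" is just Gaussian noise around a linear map of $y$, and the decoder is a made-up softmax sampler. It also assumes that a decoder $p(x|z)$ follows the $p(z|y)$ step, which this thread does not state explicitly. The only thing it is meant to show is that the pipeline takes target values $y$ and never an input molecule.

```python
import math
import random

# Toy token set for illustration only; the outputs are NOT valid SMILES.
VOCAB = ["C", "N", "O", "c", "1", "(", ")", "="]


def sample_z_given_y(y, latent_dim, rng, noise_scale=0.1):
    """Toy stand-in for z ~ p(z|y): Gaussian noise around a fixed linear map of y."""
    return [y * (k + 1) / latent_dim + rng.gauss(0.0, noise_scale)
            for k in range(latent_dim)]


def decode(z, length, rng):
    """Toy stand-in for x ~ p(x|z): sample each token from a softmax over z-dependent logits."""
    tokens = []
    for pos in range(length):
        # Hypothetical fixed "decoder weights" built from sines, just to depend on z.
        logits = [
            sum(zk * math.sin((pos + 1) * (v + 1) * (k + 1)) for k, zk in enumerate(z))
            for v in range(len(VOCAB))
        ]
        top = max(logits)
        weights = [math.exp(logit - top) for logit in logits]  # unnormalized softmax
        tokens.append(rng.choices(VOCAB, weights=weights, k=1)[0])
    return "".join(tokens)


def generate(y_values, latent_dim=4, length=8, seed=0):
    """Generate one string per target property value y; no input molecule is used."""
    rng = random.Random(seed)
    return [decode(sample_z_given_y(y, latent_dim, rng), length, rng)
            for y in y_values]


# The only inputs are property values, e.g. taken from a test set.
samples = generate([0.5, 0.7, 0.9])
```

A structure-constrained setup, $p(x|y, \tilde{x})$, would additionally need the backbone $\tilde{x}$ as an argument to `generate`, which this sketch deliberately does not have.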
Got it, thank you!