pschwllr/MolecularTransformer

add top-1 metrics in the training process

Closed · 1 comment

The acc (accuracy) metric reported during training is not a good fit for reaction prediction.

Sometimes acc is 90% during training while the top-1 metric is only 50%.

Example: OpenNMT/OpenNMT-py#1776

Yes. As also stated in the linked issue, what is shown during training is the token-wise accuracy (90% means that 90% of the SMILES tokens are predicted correctly).
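To make the gap between the two numbers concrete, here is a minimal sketch in plain Python. It assumes space-separated SMILES tokens and a simple position-by-position comparison; the function names `token_accuracy` and `sequence_accuracy` are made up for this illustration and are not part of OpenNMT-py or this repository. The toy data is chosen so that one wrong token out of ten gives 90% token accuracy but only 50% full sequence accuracy.

```python
def token_accuracy(predicted, target):
    """Fraction of target token positions that are predicted correctly."""
    correct = 0
    total = 0
    for pred, tgt in zip(predicted, target):
        pred_tokens, tgt_tokens = pred.split(), tgt.split()
        total += len(tgt_tokens)
        # zip stops at the shorter sequence, so missing tokens count as wrong.
        correct += sum(p == t for p, t in zip(pred_tokens, tgt_tokens))
    return correct / total


def sequence_accuracy(predicted, target):
    """Fraction of sequences in which every token matches the target."""
    exact = sum(p.split() == t.split() for p, t in zip(predicted, target))
    return exact / len(target)


# Toy data: 10 target tokens in total, exactly one of them predicted wrongly.
targets = ["C C ( = O ) O", "C C O"]
predictions = ["C C ( = O ) N", "C C O"]

print(token_accuracy(predictions, targets))     # 9 of 10 tokens match
print(sequence_accuracy(predictions, targets))  # 1 of 2 sequences match
```

A single wrong token is enough to make the whole predicted product wrong, which is why the two metrics can diverge this much.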

But you are right, what we are interested in for reaction prediction is full sequence accuracy. In the Molecular Transformer work, we calculated the full accuracy after making the predictions with the trained model and canonicalising the predicted products.
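A post-hoc evaluation along these lines could be sketched as follows. This is not the script used for the Molecular Transformer work; `detokenize`, `top1_accuracy`, `rdkit_canonicalize` and `toy_canonicalize` are hypothetical names for this example. It assumes predictions and targets are space-separated token strings, and that the canonicaliser returns `None` for an invalid SMILES. The RDKit-based canonicaliser uses `Chem.MolFromSmiles` and `Chem.MolToSmiles` and is only imported when called; the toy lookup table is a stand-in so the example runs without RDKit.

```python
def detokenize(tokenized_smiles):
    """Join space-separated SMILES tokens back into one SMILES string."""
    return tokenized_smiles.replace(" ", "")


def rdkit_canonicalize(smiles):
    """Canonicalise with RDKit; returns None if the SMILES cannot be parsed."""
    from rdkit import Chem  # lazy import: only needed if this function is used
    mol = Chem.MolFromSmiles(smiles)
    return Chem.MolToSmiles(mol) if mol is not None else None


def top1_accuracy(predictions, targets, canonicalize):
    """Share of top-1 predictions whose canonical form equals the target's."""
    hits = 0
    for pred, tgt in zip(predictions, targets):
        pred_canon = canonicalize(detokenize(pred))
        tgt_canon = canonicalize(detokenize(tgt))
        # An unparsable prediction (None) is never counted as correct.
        if pred_canon is not None and pred_canon == tgt_canon:
            hits += 1
    return hits / len(targets)


# Stand-in canonicaliser for illustration only: maps one alternative
# spelling of ethanol to a single form. Real use would pass rdkit_canonicalize.
def toy_canonicalize(smiles):
    return {"OCC": "CCO"}.get(smiles, smiles)


targets = ["C C O", "C C O"]
predictions = ["O C C", "C C N"]  # same molecule written differently, then a wrong one

print(top1_accuracy(predictions, targets, toy_canonicalize))
```

Canonicalising both sides matters because the same molecule can be written as several different SMILES strings, and a plain string comparison would count those as errors.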

We did not implement a full sequence accuracy metric to show during training, as we had observed that models with a higher token accuracy typically also performed better on the full sequence accuracy.

Let me know if you have more questions.