microsoft/molskill

Model size mismatch when loading defaults


Hey everyone!

Interesting work!

I have been trying to set up the MolSkill scorer, but I ran into the following issue when loading the default models provided in molskill/models/default/checkpoints (both last.ckpt and epoch=129-step=21450.ckpt). The first encoder layer in the checkpoint seems to have a different input size from the one in the current model:

Error(s) in loading state_dict for LitRankNet:
	size mismatch for net.encoder.0.weight: copying a param with shape torch.Size([256, 2220])
        from checkpoint, the shape in current model is torch.Size([256, 2221]).

Is this expected?

Thanks for your attention.
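For anyone hitting the same error, here is a minimal sketch for listing which parameters disagree between a checkpoint and a freshly built model. `find_shape_mismatches` is a hypothetical helper, not part of molskill, and the shape dictionaries below are filled in by hand from the error message. If net.encoder.0 is a linear layer, its weight is stored as (out_features, in_features), so 2220 vs. 2221 would mean the current featurizer emits one more input feature than the checkpoint was trained on.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    ckpt_shapes: Dict[str, Shape], model_shapes: Dict[str, Shape]
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for shared parameters whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in ckpt_shapes.items():
        model_shape = model_shapes.get(name)
        # Only compare parameters present on both sides.
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches.append((name, ckpt_shape, model_shape))
    return mismatches


# Shapes copied from the error message in this issue.
ckpt = {"net.encoder.0.weight": (256, 2220)}
model = {"net.encoder.0.weight": (256, 2221)}

for name, saved, current in find_shape_mismatches(ckpt, model):
    print(f"{name}: checkpoint {saved} vs. model {current}")

# With torch installed, the dictionaries could be built along these lines
# (assumes a PyTorch Lightning checkpoint, which keeps weights under "state_dict"):
#   state = torch.load("last.ckpt", map_location="cpu")["state_dict"]
#   ckpt = {k: tuple(v.shape) for k, v in state.items()}
```

A mismatch confined to the second dimension of the first layer points at the input features rather than at the checkpoint file itself.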

Hi - this seems to be related to the latest RDKit release, 2023.03.1. I'm currently working on a hotfix, but in the meantime please try downgrading to 2022.03.3.
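A quick way to check which side of that release an environment is on is sketched below. The helper names are hypothetical, and the cutoff is an assumption drawn from this thread: only 2023.03.1 is named here as triggering the mismatch, so treating every later release as affected is a guess.

```python
import re
from typing import Tuple


def parse_rdkit_version(version: str) -> Tuple[int, ...]:
    """Parse the leading 'YYYY.MM.P' part of an RDKit version string into ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognised RDKit version: {version!r}")
    return tuple(int(part) for part in match.groups())


# Assumption from this thread: 2023.03.1 is the first release showing the mismatch.
FIRST_AFFECTED = (2023, 3, 1)


def predates_mismatch(version: str) -> bool:
    """True if the given RDKit version is older than the first affected release."""
    return parse_rdkit_version(version) < FIRST_AFFECTED


try:
    import rdkit  # only needed to read the installed version

    installed = rdkit.__version__
except ImportError:
    installed = None

if installed is not None:
    status = "predates" if predates_mismatch(installed) else "is at or after"
    print(f"rdkit {installed} {status} 2023.03.1")
```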

Hi José!

Thanks for the quick answer.

After downgrading to rdkit==2022.03.3, I ran into the following issue. In data/descriptors.py, lines 15-16 read like this:

sys.path.append(os.path.join(RDConfig.RDContribDir, "SA_Score"))
import sascorer  # type: ignore

It seems that the path being appended (.../site-packages/rdkit/Contrib) does not exist in this particular version of rdkit?
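One way to confirm whether the Contrib scripts are actually present is sketched below. `has_sascorer` is a hypothetical helper; it assumes the scorer lives at SA_Score/sascorer.py under the Contrib directory, which is the same location the two lines from data/descriptors.py rely on.

```python
import os


def has_sascorer(contrib_dir: str) -> bool:
    """Check whether SA_Score/sascorer.py exists under the given Contrib directory."""
    return os.path.isfile(os.path.join(contrib_dir, "SA_Score", "sascorer.py"))


try:
    from rdkit.Chem import RDConfig  # same module used in data/descriptors.py
except ImportError:
    RDConfig = None

if RDConfig is not None:
    contrib_dir = RDConfig.RDContribDir
    print(f"Contrib dir: {contrib_dir}")
    print(f"sascorer.py found: {has_sascorer(contrib_dir)}")
```

If the second line prints False, the installed package does not include the Contrib folder, and the import of sascorer would be expected to fail with an ImportError.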

Hi @miguelgondu, are you getting an ImportError here? The Contrib folder should be available in that version of rdkit - if that does not work please open an issue in the rdkit tracker.

In the meantime, I'm preparing a PR to support the latest version of rdkit in #12 - should be ready in a few hours/days.

Hi, the upstream conda packages should be fixed now. Please re-install molskill in a fresh environment via conda/mamba with:

mamba install molskill=*=py39* -c msr-ai4science -c conda-forge

if you're using Python 3.9. If you're using Python 3.10, remove the build specification (=*=py39*).