Loading model using DGMR.from_pretrained("openclimatefix/dgmr") fails
Describe the bug
When loading the pretrained model from HuggingFace as per the instructions in README.md, I get a `KeyError: 'config'` exception. The issue is that when following the instructions, `**model_kwargs` will be empty in `NowcastingModelHubMixin._from_pretrained`, but it then attempts to pass `**model_kwargs['config']` to the class constructor in skillful_nowcasting/dgmr/hub.py (line 154 at commit 8c40296).
To Reproduce

```python
>>> from dgmr import DGMR
>>> model = DGMR.from_pretrained("openclimatefix/dgmr")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/me/venv/dgmr/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/me/venv/dgmr/lib/python3.11/site-packages/huggingface_hub/hub_mixin.py", line 420, in from_pretrained
    instance = cls._from_pretrained(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/me/venv/dgmr/git/skillful_nowcasting/dgmr/hub.py", line 154, in _from_pretrained
    model = cls(**model_kwargs["config"])
            ~~~~~~~~~~~~~~^^^^^^^^^^
KeyError: 'config'
```
Expected behavior
I expect to end up with a variable `model` holding the pretrained DGMR model.
Additional context
One can trivially work around the problem by passing an empty `config` argument when loading the pretrained model:

```python
model = DGMR.from_pretrained("openclimatefix/dgmr", config={})
```

so maybe this is just a matter of updating README.md.
Still, I'd argue that the cleaner fix would be to pass the entire `**model_kwargs` to the class constructor, i.e.

```python
model = cls(**model_kwargs)
```

That would also match how HuggingFace themselves do it in https://github.com/huggingface/huggingface_hub/blob/855ee997202e003b80bafd3c02dac65146c89cc4/src/huggingface_hub/hub_mixin.py#L633.
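For concreteness, here is a rough sketch of what that change could look like in `_from_pretrained` (the method structure and the surrounding loading steps are assumed, not copied from hub.py):

```python
# Sketch only: assumed shape of NowcastingModelHubMixin._from_pretrained
# with the proposed one-line change; download and weight loading elided.
class NowcastingModelHubMixin:
    @classmethod
    def _from_pretrained(cls, *, model_id, revision, cache_dir, force_download,
                         proxies, resume_download, local_files_only, token,
                         **model_kwargs):
        ...  # locate/download the checkpoint file as before
        model = cls(**model_kwargs)  # was: model = cls(**model_kwargs["config"])
        ...  # load the saved state dict into `model` as before
        return model
```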
Given that I was baffled that nobody seemed to have reported such a basic problem earlier, I investigated further and, not surprisingly, it is a bit more complicated than I initially thought.

It turns out the issue only surfaced recently, due to changes introduced in `huggingface_hub >= 0.22.0`. So for the moment the best workaround is to downgrade `huggingface_hub` to version 0.21.4:

```
pip3 install huggingface-hub==0.21.4
```
Prior to huggingface/huggingface_hub@45147c5, `config` would be injected into `model_kwargs` if the model class accepted a `**kwargs` argument. From version 0.22.0 onwards, the logic for when `config` is injected has changed. Looking at the code and the docs at https://huggingface.co/docs/huggingface_hub/main/en/guides/integrations#a-more-complex-approach-class-inheritance, I believe that `config` is injected if (a) the model class expects a parameter `config` in its `__init__`, and/or (b) the model class expects a parameter `config` in its `_from_pretrained`.
More relevantly though, if the model class accepts a `**kwargs` argument in its `__init__`, then the contents of the config are injected one by one into `model_kwargs` (as opposed to injecting the config as-is as `model_kwargs['config']`). This setting actually seems to me more applicable to `NowcastingModelHubMixin`: wouldn't it be sufficient to replace `model = cls(**model_kwargs['config'])` with `model = cls(**model_kwargs)` in `_from_pretrained`? And wouldn't that also allow simplifying `DGMR.__init__`?
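To make those two injection modes concrete, here is a toy illustration (hypothetical classes based on my reading of the linked docs, not the actual DGMR code):

```python
import torch
from huggingface_hub import PyTorchModelHubMixin

class TakesConfig(torch.nn.Module, PyTorchModelHubMixin):
    # Case (a): __init__ declares a `config` parameter, so the stored
    # config is passed through whole, as a single dict.
    def __init__(self, config: dict):
        super().__init__()
        self.hidden = config.get("hidden", 8)

class TakesKwargs(torch.nn.Module, PyTorchModelHubMixin):
    # The **kwargs case: the stored config's entries are expanded one by
    # one into keyword arguments, so no `config` dict ever appears.
    def __init__(self, hidden: int = 8, **kwargs):
        super().__init__()
        self.hidden = hidden
```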
Thanks @agijsberts for finding this. Should we pin the requirements file to `huggingface-hub==0.21.4`? Would you be interested in making a PR for this?

Would you be able to expand further on the `model_kwargs` point? Perhaps even with some links to the code where it could be improved?
Pinning `huggingface-hub==0.21.4` is a sensible stopgap in my opinion. I'll prepare a PR right away.

Regarding `model_kwargs`, and without going into too much detail: a significant chunk of `DGMR.__init__` is dedicated to dealing with a potential `config` parameter and, when it is passed, using it to override the normal arguments. If I understood the workings of `ModelHubMixin` correctly, it should no longer be necessary to rely on such a "magic" `config` argument, and removing that logic would make `DGMR.__init__` agnostic to the mixin.

I volunteer to have a look at the best way to resolve this.
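As a rough sketch of the simplification I have in mind (the parameter name is a placeholder, not the real `DGMR.__init__` signature):

```python
# Illustrative only; `forecast_steps` stands in for the real arguments.

class DGMRBefore:
    # Current pattern: a "magic" config dict, injected by the mixin,
    # overrides the regular keyword arguments.
    def __init__(self, forecast_steps: int = 18, **kwargs):
        config = kwargs.get("config", {})
        self.forecast_steps = config.get("forecast_steps", forecast_steps)

class DGMRAfter:
    # After the simplification: plain keyword arguments only; the mixin
    # expands the stored config into them, so no override logic remains.
    def __init__(self, forecast_steps: int = 18, **kwargs):
        self.forecast_steps = forecast_steps
```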
Thanks @agijsberts, yeah, if you were able to have a look at `DGMR.__init__` and the best way to resolve this, that would be super.
I have just pushed the possible `__init__` simplification to my fork: https://github.com/agijsberts/skillful_nowcasting/tree/refactor_huggingface-hub. It already includes tests to make sure that serialization-deserialization returns an identical object for `DGMR`, `Discriminator`, etc. Unfortunately, that introduces a pip dependency on `safetensors`. In any case, after my holidays I want to check that predictions on some test data are still the same before turning this into a PR.
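For context, such a round-trip test could look roughly like this (a sketch only; it assumes `DGMR()` is constructible with default arguments and relies on the `save_pretrained`/`from_pretrained` pair provided by the hub mixins):

```python
import torch
from dgmr import DGMR

def test_save_load_roundtrip(tmp_path):
    # Serialize to a local directory and load it back.
    model = DGMR()
    model.save_pretrained(tmp_path)
    reloaded = DGMR.from_pretrained(tmp_path)
    # The reloaded weights should match the originals exactly.
    for name, tensor in model.state_dict().items():
        assert torch.equal(tensor, reloaded.state_dict()[name])
```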
Perhaps someone can clarify why `DGMR` needs a custom `NowcastingModelHubMixin` instead of using `huggingface-hub`'s `PyTorchModelHubMixin` (as used by `Discriminator`, `Sampler`, etc.). My impression is that both do the same thing (except that `PyTorchModelHubMixin` writes in `safetensors` format), but I'm probably missing something.
Thanks @agijsberts - have a good holiday, and I look forward to seeing the PR.
If I'm honest, I'm not totally sure why `NowcastingModelHubMixin` is used instead of `PyTorchModelHubMixin`, but I'll have to have a look at it.