mit-ll-responsible-ai/hydra-zen

ConfigDicts aren't converted to dataclasses when specified in a list

DirkKuhn opened this issue · 11 comments

The conversion back to dataclasses doesn't seem to work, when they are specified in a list.
Here is a minimal example.

from collections.abc import Sequence
from dataclasses import dataclass
from hydra_zen import store, zen, builds
from hydra_zen.typing import ZenConvert

@dataclass
class B:
    x: int

def app(dc: Sequence[B]) -> None:
    assert all(isinstance(b, B) for b in dc)

Config = builds(
    app,
    populate_full_signature=True,
    dc=[B(x=0)],
    zen_convert=ZenConvert(dataclass=True),
    hydra_convert='all'
)
store(Config, name="config")

if __name__ == "__main__":
    store.add_to_hydra_store()
    zen(app).hydra_main(config_path=None, config_name="config", version_base="1.3")

It fails with an assertion error. The entry in dc is a DictConfig.

Thanks!

rsokl commented

Thanks very much for the nice repro -- very helpful.

hydra-zen is constructing the config appropriately:

from hydra_zen import to_yaml

>>> print(to_yaml(Config))
_target_: __main__.app
_convert_: all
dc:
- _target_: __main__.B
  x: 0

The weirdness is on Hydra's end:

>>> instantiate(Config())  # correct
[B(x=0)]
>>> instantiate(Config().dc) # incorrect
[{'x': 0}]
>>> instantiate(Config().dc, _convert_="object")  # correct
[B(x=0)]

I want zen to get your use case right automatically - people using zen should not be expecting OmegaConf objects IMO.

I think I may have a fix for this.. in place of using instantiate within zen, I can use:

def inst(x):
    x = instantiate(x)
    if isinstance(x, (ListConfig, DictConfig)):
        # cant' use _convert_="object" because it is Hydra 1.3+ only
        return OmegaConf.to_object(x)
    else:
        return x

which gets your example to run without issue. I did a quick prototype and zen's tests pass after I make the expected improvements. This is promising!

I will need to think about how this can be done in a way that lets people opt into the old behavior. Other than that, I am very keen to make this improvement; zen's behavior will be more robust for those who are eager to use dataclasses throughout their configs!

kiyoon commented

I think I have the same problem. I can instantiate a PyTorch optimiser (SGD) when I have list of dict as a param.

image

image

It says I'm getting DictConfig instead of a list.

rsokl commented

@DirkKuhn @kiyoon I just pushed a pre-release of hydra-zen v0.11.0, which should have a fix to the issue reported above. When you have a moment, please install it using pip install hydra-zen --pre and let me know if it resolves your issue.

@kiyoon in the example you posted I don't see hydra_zen.zen being used anywhere. It looks like you are calling instantiate directly, so you may be facing a different issue, and one that is pertaining more directly to Hydra.

kiyoon commented

@rsokl Oh sorry for not being clear. I'm not using zen but I'm using either build or store, and putting the built function into hydra store. And I'm calling the @hydra.main decorator in the main function. Instantiation works for other models and everything, except this one. I forgot what zen is, and maybe that's a better way to do it. I'll try the pre-release

rsokl commented

@kiyoon that is ultimately an issue with how Hydra's instantiate works, not with how builds/store create configs. If you are using Hydra 1.3.0+ you can pass the option _convert_='object' to `instantiate, and I believe that will solve your issue: https://github.com/facebookresearch/hydra/blob/main/NEWS.md#130-2022-12-08

kiyoon commented

@rsokl Thanks a lot. The _convert_ did the trick without having to upgrade to the 0.11 version. I still don't understand how it works exactly under the hood, but good to know that there's an option to resolve this easily.

rsokl commented

I recommend reading this if you are interested in understanding what is going on under the hood.

Glad this resolved the issue!

kiyoon commented

Thanks for sharing the link. I think it makes sense a little better. The problem with me is that I came from no hydra/OmegaConf background, so at first I found it hard to understand which functionality is hydra and which is hydra-zen.

Can I ask one more question if you don't mind? Why would I ever need to remove _convert_ = "object" if it solves most of the conversion problem? It will still instantiate the _target_ if that is defined, so I see why not use this for every instantiation.

rsokl commented

I found it hard to understand which functionality is hydra and which is hydra-zen.

Yeah.. this is a real problem. I would love to add a page to the Explanations section of hydra-zen's docs that give a high-level overview of what omegaconf, hydra, and hydra-zen are responsible for, respectively. Your confusion is totally understandable.

Why would I ever need to remove convert = "object"?

That is a good question... I don't think that you would! You can probably safely use that option everywhere. Unfortunately, I cannot do that by default in hydra-zen, because it would 1) Require that I end support for Hydra < 1.3 and 2) Be a major compatibility-breaking change.

That all being said, I would like to make it much easier for people to be able to opt-in to this behavior uniformly across hydra-zen's API.

kiyoon commented

I see. Just wanted to clarify if I'm not doing something wrong by putting that everywhere. Fair enough that it can't be a default as of now.

Thanks a lot for your work on this project. The documentation is very nice and well done, and I also found the presentation about it on YouTube helpful.

@rsokl It seems to work now! Thanks so much for your effort!
Should I close the issue?