Move away from configs, to just passing args to the class
SeanNaren opened this issue ยท 0 comments
SeanNaren commented
๐ Feature
Currently, the process of creating a module looks like such:
...
dm = TextClassificationDataModule(
cfg=TextClassificationDataConfig(
batch_size=1,
dataset_name="emotion",
max_length=512,
),
tokenizer=tokenizer,
)
This is fine for the "general" cases, however for the specific modules for datasets, it looks cumbersome:
...
dm = WMT16TranslationDataModule( # I have to choose the WMT16 class
cfg=TranslationDataConfig(
dataset_name="wmt16", # I have to pass the name of the dataset as well? why is this not the default?
...
),
tokenizer=tokenizer,
)
A solution would be to introduce a config class like below, however this adds even more lines.
...
@dataclass
class WMT16TranslationDataConfig:
dataset_name: str = "wmt16"
I suggest we opt to remove configs entirely, and just rely on the modules:
dm = TextClassificationDataModule(
batch_size=1,
dataset_name="emotion",
max_length=512,
tokenizer=tokenizer,
)
dm = WMT16TranslationDataModule(tokenizer=tokenizer) # dataset_name="wmt16" is default!
From my understanding, this is supported by hydra, but all cfg
objects will be converted into parameters.