Mod add seed issue
nix-apollo opened this issue · 2 comments
From @danbraunai-apollo:
This is also an issue/bug for ModularArithmetic. We have an even worse setup there with this code in create_modular_arithmetic_dataset:
```python
modulus = cfg["dataset"]["modulus"]
fn_name = cfg["dataset"]["fn_name"]
frac_train = cfg["dataset"]["frac_train"]
seed = cfg["seed"]
modulus = dataset_config.modulus or modulus
fn_name = dataset_config.fn_name or fn_name
frac_train = dataset_config.frac_train if dataset_config.frac_train is not None else frac_train
seed = dataset_config.seed if dataset_config.seed is not None else seed
```
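A hedged sketch of what the corrected lookup might look like, reading the seed from the dataset section instead of the top level (the `cfg` / `dataset_config` values here are made up for illustration; names follow the snippet above):

```python
from types import SimpleNamespace

# Stand-ins for the saved model's config dict and the override object.
cfg = {
    "seed": 0,
    "dataset": {"modulus": 113, "fn_name": "add", "frac_train": 0.3, "seed": 42},
}
dataset_config = SimpleNamespace(modulus=None, fn_name=None, frac_train=None, seed=None)

# Read the seed from the dataset config, not the top-level config...
dataset_seed = cfg["dataset"]["seed"]  # was: seed = cfg["seed"]
# ...then apply the override only when one is explicitly set.
dataset_seed = dataset_config.seed if dataset_config.seed is not None else dataset_seed
print(dataset_seed)  # -> 42
```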
This reads the top-level seed from the saved model's config (rather than the seed in its dataset config), and then overrides it with the seed from the dataset config if one is set.
I'm confused about what the problem is; @danbraunai-apollo should clarify.
I think the code above should replace the line

```diff
- seed = cfg["seed"]
+ dataset_seed = cfg["dataset"]["seed"]
```

and then pass `dataset_seed` instead of `seed` wherever it is used downstream.
Also, since this value can be `None`, we have to handle passing `seed=None` to `train_test_split` in `create_modular_arithmetic_dataset`. The best thing to do IMO is to have a step at the start of all modular arithmetic scripts that sets `config.dataset.seed = config.seed` if `config.dataset.seed is None`.
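A minimal sketch of that defaulting step, assuming a config object with a top-level `seed` and a nested `dataset.seed` (the class names here are placeholders, not the repo's actual types):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DatasetConfig:
    seed: Optional[int] = None

@dataclass
class Config:
    seed: int = 0
    dataset: DatasetConfig = field(default_factory=DatasetConfig)

def default_dataset_seed(config: Config) -> Config:
    # Fall back to the top-level seed when no dataset seed is given,
    # so downstream code only ever reads config.dataset.seed.
    if config.dataset.seed is None:
        config.dataset.seed = config.seed
    return config

config = default_dataset_seed(Config(seed=7))
print(config.dataset.seed)  # -> 7
```

Running this once at script start means `train_test_split` never sees `seed=None`.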
The way the code currently works is: "use `config.dataset.seed` to manage the dataset when training the model, but use `config.seed` when loading a trained model."
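To illustrate why that inconsistency matters, a hypothetical sketch (the config values are invented): if the two seeds differ, the split reconstructed at load time won't match the one used at training time.

```python
# Hypothetical saved config: training read dataset.seed, but loading a
# trained model reads the top-level seed.
train_cfg = {"seed": 0, "dataset": {"seed": 42}}

seed_used_at_training = train_cfg["dataset"]["seed"]  # 42
seed_used_at_loading = train_cfg["seed"]              # 0

# With different seeds, any seeded train/test split would come out
# differently on reload than it did during training.
print(seed_used_at_training == seed_used_at_loading)  # -> False
```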