LarsKue/lightning-trainable

Make toy models deterministic

Closed this issue · 1 comments

It might be a good idea to give users the option to make toy models deterministic.
Unfortunately, I don't quite understand how the distribution datasets work, but I suspect one would have to set a maximum size, sample before training, and create a fixed dataset from those samples.
a) Do you think this is necessary?
b) Do you think this is achievable?

The point of the distribution datasets is that they are infinite, i.e., never repeating. However, what you are describing can be achieved with something along the lines of the following:

from lightning import seed_everything
from torch.utils.data import TensorDataset

from lightning_trainable.datasets import WhateverDistributionDataset

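seed_everything(0)  # seed the Python, NumPy, and PyTorch RNGs
infinite_dataset = WhateverDistributionDataset(...)
sample_size = 10_000  # example value; pick the fixed dataset size you need
# draw all samples once, up front
# you can also sample in batches and save to disk first
samples = infinite_dataset.distribution.sample((sample_size,))
finite_dataset = TensorDataset(samples)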
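
For the "sample in batches" variant mentioned in the comment, here is a minimal self-contained sketch. It does not use lightning_trainable at all: a plain torch.distributions.Normal stands in for infinite_dataset.distribution, and make_finite_dataset, batch_size, and the sizes are hypothetical names and values chosen only for illustration. It also seeds with torch.manual_seed instead of seed_everything, which is enough here because only PyTorch's RNG is involved.

```python
import torch
from torch.distributions import Normal
from torch.utils.data import TensorDataset


def make_finite_dataset(distribution, sample_size, seed=0, batch_size=1000):
    """Draw `sample_size` samples once, in batches, and wrap them in a TensorDataset.

    Hypothetical helper for illustration; not part of lightning_trainable.
    """
    torch.manual_seed(seed)  # reset the global PyTorch RNG so the draws repeat
    batches = []
    remaining = sample_size
    while remaining > 0:
        n = min(batch_size, remaining)
        batches.append(distribution.sample((n,)))
        remaining -= n
    # concatenate along the sample dimension into one fixed tensor
    return TensorDataset(torch.cat(batches))


# Stand-in for `infinite_dataset.distribution`: a 2D standard normal (assumption).
distribution = Normal(torch.zeros(2), torch.ones(2))

first = make_finite_dataset(distribution, sample_size=2500)
second = make_finite_dataset(distribution, sample_size=2500)

# Same seed and same call sequence give identical samples.
print(torch.equal(first.tensors[0], second.tensors[0]))
```

To reuse the same samples across runs without resampling, the tensor in first.tensors[0] could be written out with torch.save and read back with torch.load before wrapping it in a TensorDataset.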