JJGO/shrinkbench

Datasets Arguments

MohammedAljahdali opened this issue · 2 comments

Hey,
In the dataset creation function the path argument is always ignored, and only the train argument is provided when creating a dataset, so in the dataset_path function the path will always be none, so it expect that the DATAPATH env variable exists.

Another issue is when using dataset that you don't have, there is no way to pass the download and root arguments to the dataset constructor.

For now, I did some workaround on the code I have cloned, but I was wondering if this intended behavior or a bug. If it is a bug I could create a pull request.

JJGO commented

Hi, thanks for taking a look at this and offering to PR.

This is not really a bug, but a consequence of how I built this. My workflow was to:

  1. Test a new dataset and download it if necessary interactively using the constructor directly and providing the path argument.
  2. Once the dataset was working, I'd make it sure that DATAPATH include it since I'd create the Experiment objects non interactively using scripts and having a DATAPATH made working with multiple machines much easier.

Since then, I have refined this somewhat using multiple inheritance and a Mixin class that makes use of the DATAPATH resolution.

However, I agree that the main issue of argument forwarding to the dataset (and dataloaders) is still there. Forwarding path is not enough as other datasets might not accept that argument. I believe a cleaner solution is something like this that doesn't assume anything from the dataset kwargs beyond a boolean train parameter.

If you think you can backport that to this project, feel free to submit a PR

JJGO commented

Closing due to inactivity