ViCCo-Group/thingsvision

Making `resize_dim` and `crop_dim` flexible arguments

Alxmrphi opened this issue · 7 comments

At this point, we should probably think about making resize_dim and
crop_dim flexible arguments that a user can change if they want to. Defaults can stay the same. I have never seen this error before but it seems to be necessary for some models. @Alxmrphi, could you open this response as a new issue?

Originally posted by @LukasMut in #126 (comment)


Issue: add in flexible arguments the user can specify that overrides the default transformations in order to accommodate image size inputs that are outside the standard dimensions.

I take it we'd be taking advantage of the already-present preprocess argument (that I can't see is actually used practically in the code, though after skimming the code). That's what you're thinking, right?

Yes, either by leveraging the existing preprocess argument or by allowing resize_dim and crop_dim to be passed as argument to the base extractor class, and, hence, to every class that inherits from base. Then, they'd be arguments of the class rather than the transformation function (self.resize_dim and self.crop_dim rather than resize_dim and crop_dim respectively). It think the latter would be the cleaner option.

torchvision is not playing nice at all on my new computer, so I'm unable to run any tests on thingsvision at the moment until I figure this out. However, I did have a look at the program flow and wondered something. Isn't it possible / much easier to manually specify resize_dim and crop_dim in the call to extractor.get_transformations(), which is passed to ImageDataset?

For example:

dataset = ImageDataset(
  root=root,
  out_path='path/to/features',
  backend=extractor.get_backend(),
  # transforms=extractor.get_transformations(),
  transforms=extractor.get_transformations(resize_dim=226, crop_dim=224)

These params are already in place and feed directly to the derived PyTorchExtractor and TensorFlowExtractor and are called via the get_default_transformations call. They only need to be alternatively specified in the extractor.get_transformations() for user-desired values instead. Following their path through the repo, this would lead to these values being used without any changes to the codebase.

No?

Yeh, right. That should work. Let's suggest to try this in the other thread and see whether it fixes the issue.

@Alxmrphi, do we know why we set resize_dim = crop_dim in the get_transformations method?

I don't know that, sorry. Given that we wanted to add it in, I thought it'd be the route to take to bring the user-defined values in and then when I looked at the accepted inputs to the function, those values were already present and made their way into the default transformation. Perhaps you had previously expected this behaviour and thought ahead by including them, should the user ever want to override the values?

It seems as if I did but forgot about it lol. This is especially important for custom models!