hassony2/kinetics_i3d_pytorch

Preprocessing of input frames differs from the original implementation.

Closed · 1 comment

Hi, @hassony2
Thank you for sharing the pytorch-i3d implementation.
I find that the preprocessing of input frames differs from the original implementation.
In the TensorFlow version of I3D, they state that "For RGB, ... Pixel values are then rescaled between -1 and 1."
However, in i3d_pt_profiling.py, I only see processing like:

dataset = datasets.ImageFolder(dataset_path, transforms.Compose([
    transforms.CenterCrop(args.im_size),
    transforms.ToTensor(),
    normalize,
]))

But another implementation (https://github.com/piergiaj/pytorch-i3d) rescales the input with:

img = (img/255.)*2 - 1
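To make the difference concrete, here is a small sketch comparing the two schemes on the extreme pixel values. The mean/std used for scheme A are the standard torchvision ImageNet values for the red channel; they are an assumption here, since the repo's exact `normalize` is not shown above.

```python
import numpy as np

pixels = np.array([0.0, 255.0])  # darkest and brightest possible values

# Scheme A: ImageNet-style normalization (mean/std are the common
# torchvision ImageNet red-channel values -- an assumption, not the
# repo's confirmed settings).
mean, std = 0.485, 0.229
a = (pixels / 255.0 - mean) / std   # roughly [-2.12, 2.25]

# Scheme B: the original I3D rescaling, exactly [-1, 1].
b = (pixels / 255.0) * 2 - 1

print(a)
print(b)
```

The two schemes produce inputs in visibly different ranges, which is why feeding ImageNet-normalized frames into a network trained with [-1, 1] inputs can hurt accuracy.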

I am wondering whether this inconsistency will lead to a drop in performance.
Thanks

Hi @hzhang57,

The code in i3d_pt_profiling is only meant for profiling purposes :), so I feed in dummy data and don't worry about the output values.
For a proper demo, look at i3d_pt_demo, where you can verify that the input is indeed scaled to [-1, 1].
For correct results you should indeed rescale the inputs.
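For reference, a minimal NumPy sketch of the rescaling step described above. This is an illustration of the [-1, 1] mapping, not the exact code from i3d_pt_demo; the function name is hypothetical.

```python
import numpy as np

def rescale_to_unit_range(frames: np.ndarray) -> np.ndarray:
    """Map uint8 frames in [0, 255] to float32 in [-1, 1],
    as in the original DeepMind I3D preprocessing."""
    return (frames.astype(np.float32) / 255.0) * 2.0 - 1.0

# A dummy 16-frame RGB clip at 224x224, as an example input shape.
clip = np.random.randint(0, 256, size=(16, 224, 224, 3), dtype=np.uint8)
scaled = rescale_to_unit_range(clip)
print(scaled.min() >= -1.0, scaled.max() <= 1.0)  # True True
```

The rescaling is applied per pixel, so it works identically whether the frames are stacked as a clip or processed one image at a time.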

Hope this helps!

All the best,

Yana