image_features.py output shape?
bpiv400 opened this issue · 3 comments
Hi! This is probably a gap in my understanding of PyTorch or h5py, but I wanted to bring it to your attention just in case it’s not.
The output of image_features.py is a batch_size*(num images in split) x 1024 x 14 x 14 numpy array. You assign the features associated with each image to batch_size contiguous indices in a slice along the first axis. I don’t understand why it’s necessary to store batch_size copies of each image’s features.
Later, when you load the data from the h5 file in the CLEVR dataloader’s getitem method in dataset.py, you index the array as if img[i] gives the features of the ith image. But based on how you initialized the h5 file, these would actually be stored at [batch_size*i:batch_size*(i+1)], not i.
What am I missing here?
image_features.py extracts features batch-wise: it runs a batch of images through the network and writes the resulting batch of features into the HDF5 file in one slice. So the file does not hold batch_size copies of each image’s features.
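To make the indexing concrete, here is a minimal sketch of batch-wise writing. It is not the code from image_features.py: a plain Python list stands in for the HDF5 dataset's first axis, the strings stand in for the 1024 x 14 x 14 feature maps, and `batch_size`, `num_images` and `rows` are hypothetical names and values. It also assumes the first dimension is allocated as batch_size times the number of batches, which is one reading consistent with the answer above.

```python
import math

batch_size = 4    # hypothetical value
num_images = 10   # hypothetical value
num_batches = math.ceil(num_images / batch_size)

# Stand-in for the dataset's first axis: batch_size * num_batches rows,
# so a short final batch leaves a few unused rows at the end.
rows = [None] * (batch_size * num_batches)

for b in range(num_batches):
    start = b * batch_size
    image_ids = range(start, min(start + batch_size, num_images))
    # Stand-in for the network output: one entry per image in this batch.
    batch_features = [f"features(image {i})" for i in image_ids]
    # One slice assignment per batch; each image still gets exactly one row.
    rows[start:start + len(batch_features)] = batch_features

print(rows[5])  # row i holds the features of image i
```

Under these assumptions the slice `[b*batch_size:(b+1)*batch_size]` is indexed by batch number `b`, not by image number, which is why indexing with `img[i]` at load time returns the features of the ith image.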
Got it. Sorry I missed that. Why do you start word embedding indices at 1, instead of at 0?
So that index 0 is free to be used for zero-padding question sequences. In this case I used packed sequences, though.
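A minimal sketch of why reserving index 0 helps, using made-up questions and hypothetical names (`word_to_index`, `PAD`, `padded`, `lengths`); it is an illustration of the idea, not the repository's preprocessing code.

```python
# Two made-up questions of different lengths.
questions = [
    ["what", "color", "is", "the", "cube"],
    ["is", "it", "red"],
]

# Assign word indices starting at 1, so 0 never refers to a real word.
word_to_index = {}
for question in questions:
    for word in question:
        if word not in word_to_index:
            word_to_index[word] = len(word_to_index) + 1

PAD = 0  # reserved padding index
max_len = max(len(q) for q in questions)

# Right-pad every question with PAD up to the longest question in the batch.
padded = [
    [word_to_index[w] for w in q] + [PAD] * (max_len - len(q))
    for q in questions
]
# True lengths, which a packed-sequence API needs to ignore the padding.
lengths = [len(q) for q in questions]

print(padded)
print(lengths)
```

If indices started at 0, the first word in the vocabulary would be indistinguishable from padding. With packed sequences (for example `torch.nn.utils.rnn.pack_padded_sequence`, which takes the padded batch together with the lengths), the recurrent network skips the padded positions, so the reserved index matters less, but keeping it is harmless.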