Questions regarding Double Argsorting.

Question

LeoXinhaoLee opened this issue 3 years ago · 2 comments

Hi, thank you for the awesome code release! However, I'm a little confused about the usage of argsort() in the EpisodicDataset class.

When sampling a task in __getitem__(self, idx), you perform the following operations:

ordered_argindices = np.argsort(indices)
ordered_indices = np.sort(indices)
_images = self.sample_images(ordered_indices)
images = torch.stack([self.transforms(_images[i]) for i in np.argsort(ordered_argindices)])
targets = np.zeros([nclasses * k], dtype=int)
targets[ordered_argindices] = self.labels[ordered_indices, ...].ravel()

It seems to me that you are effectively indexing images and labels with indices[indices.argsort()][indices.argsort().argsort()]. However, I think this sequence is identical to indices, which is already in the correct order for one task. I'm therefore wondering what the reason for this operation is.
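The identity claimed above can be checked with a minimal sketch. The function name sort_then_unsort and the sample values are made up for illustration and are not part of the repository; the sketch only assumes NumPy.

```python
import numpy as np

def sort_then_unsort(indices):
    """Sort `indices`, then undo the sort with a second argsort."""
    indices = np.asarray(indices)
    order = np.argsort(indices)        # permutation that sorts `indices`
    sorted_indices = indices[order]    # same values as np.sort(indices)
    inverse = np.argsort(order)        # inverse permutation of `order`
    return sorted_indices[inverse]     # back in the original order

# Hypothetical sample indices, deliberately unsorted.
indices = np.array([7, 2, 9, 4, 0])
restored = sort_then_unsort(indices)
print(restored)
```

Because argsort of a permutation returns its inverse, sorted_indices[inverse][i] equals indices[order[inverse[i]]], which is indices[i]. So, as far as I can tell, the round trip leaves the sequence unchanged.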

Thank you very much for your time and help!

Answer 1 · 2021-12-13T14:04:12.000Z

Hi! Yes, I sort, read, and then unsort. This dates from a time when I was reading from HDF5, which only accepted reading data in sorted order. Right now it has virtually no effect, and it might slow down the dataloader slightly. Would you mind fixing it and opening a pull request? Otherwise I'll do it in the following weeks.
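For context, the sort, read, unsort pattern can be sketched roughly as follows. This is not the repository's code: read_sorted is a hypothetical stand-in for a storage backend that rejects unsorted index lists, storage is a plain NumPy array, and the requested indices are assumed to be unique.

```python
import numpy as np

def read_sorted(storage, sorted_idx):
    # Hypothetical backend: only accepts strictly increasing indices.
    if np.any(np.diff(sorted_idx) <= 0):
        raise ValueError("indices must be strictly increasing")
    return storage[sorted_idx]

def read_in_request_order(storage, indices):
    """Read with sorted indices, then restore the requested order."""
    indices = np.asarray(indices)
    order = np.argsort(indices)                   # permutation that sorts
    chunk = read_sorted(storage, indices[order])  # in-order read
    out = np.empty_like(chunk)
    out[order] = chunk  # scatter each item back to its requested slot
    return out

# Hypothetical data: item i holds the value 10 * i.
storage = np.arange(10) * 10
indices = np.array([7, 2, 9, 4, 0])
result = read_in_request_order(storage, indices)
print(result)
```

The scatter assignment out[order] = chunk plays the same role as indexing with a second argsort. When the backend accepts arbitrary index order, storage[indices] gives the same result directly, which is why the extra sorting can be dropped.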

Thanks for noticing by the way :)

Answer 2 · 2021-12-14T03:44:38.000Z

Thanks for clearing that up! Sure, I will do it.