Data augmentation is not fully utilized in class `PartitionedCIFAR10`.
liyipeng00 opened this issue · 2 comments
Subject: Data augmentation (e.g., RandomCrop(), RandomHorizontalFlip()) is not fully utilized in class PartitionedCIFAR10.
Details: We note that FedLab constructs the local datasets with BaseDataset(), and samples are then taken out of those local datasets without applying the transforms (data augmentation) at access time (see code 1 and code 2), even though the transforms are applied once when the local datasets are prepared. Think of the randomly transformed versions of a single image as multiple distinct images. In FedLab there are only 60000 augmented images in total (e.g., for MNIST), because each image is transformed exactly once at preparation time. In contrast, if the transform is executed every time a sample is used for training (the common way), the number of distinct augmented images is far more than 60000. As a result, the test accuracy of FedLab will be lower than that of the common way, while FedLab will train faster. See the comparison results in construct-local-datasets, where we construct local datasets following the common way (Way 1) and the FedLab way (Way 2); a sketch of the common way is also given after code 2 below. Note that if your transform is only transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean, std)]), FedLab (still faster) and the common way achieve the same test accuracy.
# code 1 (FedLab: the stored samples were transformed only once, during preprocessing)
for id, indices in self.data_indices.items():
    data, label = [], []
    for idx in indices:
        x, y = samples[idx], labels[idx]  # already-transformed samples
        data.append(x)
        label.append(y)
    dataset = BaseDataset(data, label)
    torch.save(
        dataset,
        os.path.join(self.path, "train", "data{}.pkl".format(id)))
# code 2 (FedLab: __getitem__ returns the stored sample as-is, no transform applied)
class BaseDataset(Dataset):
    def __getitem__(self, index):
        return self.x[index], self.y[index]
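For reference, here is a minimal sketch of the common way (this is not FedLab code; the class name AugmentedBaseDataset and the mean/std values are only illustrative): keep the raw samples and run the random transform inside __getitem__, so every access re-draws the augmentation.

# code 3 (sketch of the common way, not FedLab code)
from torch.utils.data import Dataset
from torchvision import transforms

class AugmentedBaseDataset(Dataset):
    """Keeps raw images and applies the (random) transform on every access."""
    def __init__(self, x, y, transform=None):
        self.x = x                    # raw images, not pre-transformed tensors
        self.y = y
        self.transform = transform

    def __len__(self):
        return len(self.x)

    def __getitem__(self, index):
        x, y = self.x[index], self.y[index]
        if self.transform is not None:
            x = self.transform(x)     # random crop/flip re-sampled on every call
        return x, y

# Example transform that is re-drawn each epoch (CIFAR-10-style values).
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])

With such a variant, code 1 would save the raw samples together with the transform instead of pre-transformed tensors, which presumably trades some training speed for the higher test accuracy of Way 1 reported in construct-local-datasets.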
Sorry for the late reply. PartitionedCIFAR10 is a simple example of a federated dataset that I provided. For the general case, I believe PartitionedCifar (provided by @AgentDS) works the way you suggested.
Thanks for your kind reply. Yes, that example is what I suggest. I am sorry I had not noticed it before raising this issue. Fortunately, I think this issue can serve as a reference to help us understand FedLab better. Thanks again.