msamogh/nonechucks

Failed to shuffle data across epochs

chadHGY opened this issue · 1 comment

Hi, I found this project quite helpful and have integrated it into my own project. However, something seems to be wrong with the loader's shuffling. Here is my test code:

import torch
import torch.utils.data as Data
import nonechucks as nc

torch.manual_seed(1)    # reproducible
BATCH_SIZE = 5
SHUFFLE = True
NUMWORKER = 2

x = torch.linspace(1, 10, 10)       # x data (torch tensor)
y = torch.linspace(10, 1, 10)       # y data (torch tensor)

torch_dataset = Data.TensorDataset(x, y)
loader = Data.DataLoader(
    dataset=torch_dataset,      # torch TensorDataset format
    batch_size=BATCH_SIZE,      # mini batch size
    shuffle=SHUFFLE,              
    num_workers=NUMWORKER,
    )

loader_safe = nc.SafeDataLoader(
    nc.SafeDataset(torch_dataset),  # wrap dataset so bad samples are skipped
    batch_size=BATCH_SIZE,
    shuffle=SHUFFLE,
    num_workers=NUMWORKER,
    )

print('\nNormal dataloader')
for epoch in range(3):
    for step, (batch_x, batch_y) in enumerate(loader):
        print('Epoch: ', epoch, '| Step: ', step, '| batch x: ',
              batch_x.numpy(), '| batch y: ', batch_y.numpy())
        
print('\nSafe dataloader')
for epoch in range(3):
    for step, (batch_x, batch_y) in enumerate(loader_safe):
        print('Epoch: ', epoch, '| Step: ', step, '| batch x: ',
              batch_x.numpy(), '| batch y: ', batch_y.numpy())

And the results are like this:

Normal dataloader
Epoch:  0 | Step:  0 | batch x:  [10.  3.  1.  6.  5.] | batch y:  [ 1.  8. 10.  5.  6.]
Epoch:  0 | Step:  1 | batch x:  [8. 4. 2. 9. 7.] | batch y:  [3. 7. 9. 2. 4.]
Epoch:  1 | Step:  0 | batch x:  [7. 9. 6. 2. 4.] | batch y:  [4. 2. 5. 9. 7.]
Epoch:  1 | Step:  1 | batch x:  [ 3.  8.  5.  1. 10.] | batch y:  [ 8.  3.  6. 10.  1.]
Epoch:  2 | Step:  0 | batch x:  [4. 9. 1. 8. 5.] | batch y:  [ 7.  2. 10.  3.  6.]
Epoch:  2 | Step:  1 | batch x:  [ 3.  2.  6. 10.  7.] | batch y:  [8. 9. 5. 1. 4.]

Safe dataloader
Epoch:  0 | Step:  0 | batch x:  [6. 7. 2. 3. 1.] | batch y:  [ 5.  4.  9.  8. 10.]
Epoch:  0 | Step:  1 | batch x:  [ 9. 10.  4.  8.  5.] | batch y:  [2. 1. 7. 3. 6.]
Epoch:  1 | Step:  0 | batch x:  [6. 7. 2. 3. 1.] | batch y:  [ 5.  4.  9.  8. 10.]
Epoch:  1 | Step:  1 | batch x:  [ 9. 10.  4.  8.  5.] | batch y:  [2. 1. 7. 3. 6.]
Epoch:  2 | Step:  0 | batch x:  [6. 7. 2. 3. 1.] | batch y:  [ 5.  4.  9.  8. 10.]
Epoch:  2 | Step:  1 | batch x:  [ 9. 10.  4.  8.  5.] | batch y:  [2. 1. 7. 3. 6.]

It seems that the SafeDataLoader doesn't reshuffle across epochs: every epoch yields exactly the same batches in the same order. Am I somehow misusing the package?

By the way, the PyTorch version I used is 1.0.1.post2.
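
For now I can work around it by rebuilding the SafeDataLoader at the start of every epoch, which appears to draw a fresh permutation each time. This is just a sketch reusing the names defined in the snippet above, and it assumes the repeated order comes from the shuffle order being fixed once when the loader is constructed:

# Workaround sketch: construct a new SafeDataLoader per epoch so a new
# random order is drawn every time. Assumes torch_dataset, BATCH_SIZE,
# SHUFFLE, and NUMWORKER are defined as in the snippet above.
safe_dataset = nc.SafeDataset(torch_dataset)

for epoch in range(3):
    loader_safe = nc.SafeDataLoader(
        safe_dataset,
        batch_size=BATCH_SIZE,
        shuffle=SHUFFLE,
        num_workers=NUMWORKER,
        )
    for step, (batch_x, batch_y) in enumerate(loader_safe):
        print('Epoch: ', epoch, '| Step: ', step, '| batch x: ',
              batch_x.numpy(), '| batch y: ', batch_y.numpy())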

Fixed in 35ed275