SMILELab-FL/FedLab

VisionPartitioner subclasses such as CIFAR10Partitioner and MNISTPartitioner get stuck when partitioning non-IID data with a small dir_alpha (e.g. 0.1 or 0.3)

williamium3000 opened this issue · 3 comments

Describe the bug
CIFAR10Partitioner gets stuck when partitioning non-IID data with a small dir_alpha (e.g. 0.1 or 0.3).

Environment
Environment you use when bug appears:

  1. Python version: 3.8.5

  2. PyTorch version: 1.9.0+cu111

  3. FedLab version: 1.1.4

  4. code you run
    CIFAR10Partitioner(targets=targets, num_clients=100,
    balance=True, partition="dirichlet",
    unbalance_sgm=0,
    num_shards=None,
    dir_alpha=0.1)

  5. the detailed error
    It gets stuck while there is still data left to assign. A typical log looks like:
    ....
    Remaining Data: 1340
    Remaining Data: 1339
    Remaining Data: 1338
    Remaining Data: 1337
    Remaining Data: 1336
    Remaining Data: 1336
    Remaining Data: 1335
    Remaining Data: 1334
    Remaining Data: 1333
    Remaining Data: 1332
    Remaining Data: 1331
    Remaining Data: 1330
    Remaining Data: 1329
    Remaining Data: 1329
    Remaining Data: 1328
    Remaining Data: 1328
    Remaining Data: 1327
    Remaining Data: 1327
    Remaining Data: 1326
    Remaining Data: 1325
    Remaining Data: 1324
    Remaining Data: 1324
    Remaining Data: 1324
    Remaining Data: 1323
    Remaining Data: 1323
    Remaining Data: 1323
    Remaining Data: 1322
    Remaining Data: 1322
    Remaining Data: 1321
    Remaining Data: 1320
    Remaining Data: 1320
    Remaining Data: 1319
    Remaining Data: 1318
    Remaining Data: 1318
    Remaining Data: 1318
    Remaining Data: 1317
    Remaining Data: 1316
    Remaining Data: 1316
    Remaining Data: 1315
    Remaining Data: 1314
    Remaining Data: 1314
    Remaining Data: 1314
    Remaining Data: 1314
    Remaining Data: 1314
    Remaining Data: 1313
    Remaining Data: 1313

And it gets stuck here.
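For context on why small dir_alpha values are the hard case, here is a minimal standard-library sketch of how a symmetric Dirichlet prior behaves as alpha shrinks. It is not FedLab's code, and the helper names `sample_dirichlet` and `mean_largest_share` are made up for illustration. It only shows that with a small alpha, each client's class prior concentrates on very few classes. Whether that concentration is what stalls the partitioner is not confirmed in this thread.

```python
import random


def sample_dirichlet(alpha, num_classes, rng):
    """Draw one class-prior vector from a symmetric Dirichlet(alpha).

    Uses the standard construction: normalize independent Gamma(alpha, 1) draws.
    """
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_classes)]
    total = sum(draws)
    return [d / total for d in draws]


def mean_largest_share(alpha, num_classes=10, num_clients=100, seed=0):
    """Average, over clients, of the probability given to each client's top class."""
    rng = random.Random(seed)
    priors = [sample_dirichlet(alpha, num_classes, rng) for _ in range(num_clients)]
    return sum(max(p) for p in priors) / num_clients


if __name__ == "__main__":
    # Smaller alpha -> each client's prior is closer to one-hot.
    for alpha in (0.1, 0.3, 1.0, 10.0):
        share = mean_largest_share(alpha)
        print(f"dir_alpha={alpha}: mean largest class share = {share:.2f}")
```

The values printed depend on the seed, but the trend should hold: the smaller the alpha, the larger the share taken by a client's dominant class.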

I'm encountering the same issue. Unfortunately, the answer has already been given in issue #230.
If it helps, I was able to generate the partition fairly quickly with balance=True, alpha=0.3, and a seed of 119 or 120.
If you find seeds that work for other values of alpha, I would be grateful if you posted them. Thanks :)

Thanks for your attention.
As @GiuseppeGalilei has mentioned, the answer has been given in issue #230. You may encounter this issue with certain seeds. The solution is to change your random seed.
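Trying seeds by hand can be automated. The sketch below is one possible approach, not something FedLab provides. It runs a user-supplied `build(seed)` function in a child process, gives up on a seed after a timeout, and returns the first seed that finishes. The helper name `first_seed_that_finishes` is hypothetical, and the "fork" start method is assumed, so it is Unix-only.

```python
import multiprocessing as mp


def first_seed_that_finishes(build, seeds, timeout_s):
    """Return the first seed for which build(seed) completes within timeout_s.

    Each attempt runs in its own process so a stuck attempt can be terminated.
    Returns None if no seed finishes in time.
    """
    # "fork" lets `build` be any callable without pickling, but is Unix-only.
    ctx = mp.get_context("fork")
    for seed in seeds:
        proc = ctx.Process(target=build, args=(seed,))
        proc.start()
        proc.join(timeout_s)
        if proc.is_alive():
            # Attempt is still running after the timeout: stop it and move on.
            proc.terminate()
            proc.join()
            continue
        if proc.exitcode == 0:
            return seed
    return None
```

With FedLab, `build` would be a small function that constructs the partitioner with the candidate seed, and the partition would then be rebuilt in the main process with the seed that is returned. This assumes the partitioner constructor accepts a `seed` argument, which the thread implies but does not show.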

Thanks for the comment and reply. I have succeeded in partitioning with balance=True and alpha=0.1 and 0.3, using a seed of 260.