LabeliaLabs/distributed-learning-contributivity

Make the library label-agnostic

arthurPignet opened this issue · 2 comments

Currently, most of the functions assume that the labels are one-hot encoded vectors.

Besides being restrictive, this is a problem because we sometimes need to work with label indices (first label is indexed 1, second label 2, and so on).
One solution would be to automatically add, at dataset generation, a dict of labels whose keys are integers and whose values are, for instance, the corresponding vectors.

An example with MNIST:

dataset.dic_label = {
    0: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    1: [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    2: [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    3: [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    4: [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    5: [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    6: [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    7: [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
    8: [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    9: [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
}
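
Rather than writing such a dict by hand, it could be generated from the number of classes. The following is only a rough sketch of the idea: `build_label_dict`, `one_hot_to_index` and `index_to_one_hot` are hypothetical helper names, not existing functions of the library, and plain Python lists are assumed for the vectors.

```python
def build_label_dict(num_classes):
    """Map each integer label index to its one-hot vector (a plain list)."""
    return {
        index: [1 if position == index else 0 for position in range(num_classes)]
        for index in range(num_classes)
    }


def one_hot_to_index(vector):
    """Return the integer index encoded by a one-hot vector."""
    return list(vector).index(1)


def index_to_one_hot(index, dic_label):
    """Look up the one-hot vector associated with an integer label index."""
    return dic_label[index]


# Hypothetical usage for a 10-class dataset such as MNIST.
dic_label = build_label_dict(10)
```

With this in place, code that needs indices can call `one_hot_to_index`, and code that needs vectors can look them up in `dic_label`, so neither side has to assume a particular encoding.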

bowni commented

Is this still relevant, @arthurPignet?

Yes, it is. The split of the data between partners is label-agnostic, but that is not the case for the shuffling/corruption.
Basically, the only type of labels accepted is one-hot.
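
To illustrate what a label-agnostic shuffle might look like, here is a minimal sketch that only moves labels between positions and never inspects their content, so it works the same for integers, strings or one-hot vectors. `shuffle_labels` and its `proportion` and `seed` parameters are assumptions made for this example, not the library's current implementation.

```python
import random


def shuffle_labels(labels, proportion=1.0, seed=None):
    """Return a copy of `labels` in which a random subset has been permuted.

    Labels are only moved between positions, never decoded, so any
    label type (int, str, one-hot list, ...) is accepted.
    """
    rng = random.Random(seed)
    n_selected = int(round(proportion * len(labels)))

    # Positions whose labels will be exchanged with one another.
    selected = rng.sample(range(len(labels)), n_selected)
    permuted = selected[:]
    rng.shuffle(permuted)

    shuffled = list(labels)
    for target, source in zip(selected, permuted):
        shuffled[target] = labels[source]
    return shuffled


# The same function handles integer labels and one-hot labels.
int_labels = [0, 1, 2, 3, 4, 5]
one_hot_labels = [[1, 0], [0, 1], [1, 0], [0, 1]]
shuffled_ints = shuffle_labels(int_labels, proportion=0.5, seed=0)
shuffled_one_hot = shuffle_labels(one_hot_labels, seed=0)
```

Because the permutation acts on positions, the set of labels is preserved and only their assignment to samples changes.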