BasBuller/PySNN

Incorrect NMNIST Labels

BKHMSI opened this issue · 0 comments

You are assigning the labels incorrectly.

In this function:

def _concat_dir_content(content):
    ims = []
    labels = []
    names = []
    for idx, (name, data) in enumerate(content.items()):
        if not isinstance(data, (list, tuple)):
            data = [data]
        ims += data
        labels += [idx for _ in range(len(data))]
        names += [name for _ in range(len(data))]
    df = pd.DataFrame({"sample": ims, "label": labels})
    return df, name

For N-MNIST, the labels you pass to the DataFrame should be list(map(int, names)), not the index of the current directory, because the order in which the directories are iterated might differ from the digit order. It shouldn't make a difference in training, but it's semantically incorrect.
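
A minimal sketch of the suggested change, assuming every directory name is the digit string itself ("0" through "9"). The sample file names and the `content` dict below are made up for illustration, and returning `names` instead of `name` is my assumption about the intent, not something confirmed by the repository:

```python
import pandas as pd


def _concat_dir_content(content):
    """Flatten {directory name: sample(s)} into a DataFrame with digit labels."""
    ims = []
    names = []
    for name, data in content.items():
        if not isinstance(data, (list, tuple)):
            data = [data]
        ims += data
        names += [name] * len(data)
    # Assumption: directory names are digit strings, so int(name) is the true label.
    labels = list(map(int, names))
    df = pd.DataFrame({"sample": ims, "label": labels})
    # Assumption: the per-sample list of names is what the caller wants back.
    return df, names


# Hypothetical content whose iteration order differs from the digit order.
content = {"3": ["a.bin", "b.bin"], "0": "c.bin"}
df, names = _concat_dir_content(content)
```

With the index-based version, the samples under "3" would get label 0 and the sample under "0" would get label 1. Here the label follows the directory name regardless of iteration order.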