microsoft/promptbench

Numerous datasets exhibit a label of -1

camilochs opened this issue · 1 comment

Hello,

While loading several datasets, I noticed that in some of them every element is labeled -1.

For example:

import promptbench as pb
from tqdm import tqdm

dataset = pb.DatasetLoader.load_dataset("sst2")

# Keep only the examples whose label is not the -1 placeholder.
print([e for e in tqdm(dataset) if e["label"] != -1])

Results: []
I'm using promptbench version 0.0.2.

Thanks!

Hi,

Thank you for bringing this issue to our attention!

This happens because we switched the loader from the validation set to the test set, and the test set does not ship with ground-truth labels, so every example in it carries the placeholder label -1.

It may take some time before the fix reaches the pip release. As a temporary workaround, you can change line 241 of promptbench/dataload/dataset.py to data = load_dataset("glue", task)["validation"].
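If you would rather not patch the installed package, a check like the one below can catch the problem before running an evaluation. It is only a rough sketch and not part of promptbench: `has_labels` and `pick_labeled_split` are hypothetical helper names, and the code assumes each split is an iterable of dicts with a "label" key and that -1 marks a missing label.

```python
# Hypothetical helpers (not part of promptbench): choose the first split
# that actually carries ground-truth labels.

UNLABELED = -1  # assumption: -1 is the placeholder for a missing label


def has_labels(examples, label_key="label"):
    """Return True if at least one example has a real (non -1) label."""
    return any(example[label_key] != UNLABELED for example in examples)


def pick_labeled_split(splits, preferred=("test", "validation", "train")):
    """Return (name, split) for the first preferred split that has labels.

    `splits` is any mapping from split name to an iterable of examples.
    """
    for name in preferred:
        if name in splits and has_labels(splits[name]):
            return name, splits[name]
    raise ValueError("no split with ground-truth labels found")


# Small in-memory stand-in for a dataset whose test split is unlabeled.
fake_splits = {
    "test": [{"content": "a", "label": -1}, {"content": "b", "label": -1}],
    "validation": [{"content": "c", "label": 1}, {"content": "d", "label": 0}],
}

split_name, split_data = pick_labeled_split(fake_splits)
print(split_name, len(split_data))
```

The same function should also work on the object returned by load_dataset("glue", "sst2") from the Hugging Face datasets library, since that object maps split names to iterables of examples; in that case it would skip the unlabeled test split and fall back to validation.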

Thank you again!