microsoft/promptbench

For the few shot setting, how many are used exactly?

zhimin-z opened this issue · 5 comments

It seems there is a discrepancy between the paper and the code in terms of shot configuration.

Hi zhimin, in our experiments we randomly select three examples from the training set of a task and append them to the prompt. See Sec. 2.1 of the paper for details.

Thanks for your quick reply. That is 3-shot; however, what I found in your script seems to be 10-shot:

# Please generate 10 similar prompts. the prompt is used for MMLU (Measuring Massive Multitask Language Understanding) dataset.
and

Any idea?

The link you provided points to 10 candidate prompts for each dataset. As you can see, there are no 'examples' in these prompts :)

You can find 3-shot examples in this file.

Yeah, I found this file as well, but I cannot find anything related to MMLU in it. Does the MMLU evaluation actually use this file?

Thank you for highlighting this issue.

There are two ways:

  1. You can directly download the training set for MMLU from Hugging Face and select three examples for your use.
  2. In the earlier version (before Oct. 1), the few-shot examples were stored in data/MMLU_fewshot. You can download it here. The code to fetch the examples is:
    import json

    # Fragment of the dataset class. The file is loaded inside a method where
    # `self` is available; the enclosing class definition is not shown here.
    with open("data/MMLU_few_shot.json", "r") as file:
        self.few_shot_data = json.load(file)

    def get_few_shot_examples(self, task):
        # Format up to three stored examples for `task` as a few-shot prefix.
        content = "Here are three examples.\n"
        for example in self.few_shot_data[task][:3]:
            content += (
                "Input: " + example["input"] + "\n"
                + "A : " + example["A"] + "\n"
                + "B : " + example["B"] + "\n"
                + "C : " + example["C"] + "\n"
                + "D : " + example["D"] + "\n\n"
                + "Answer : " + example["target"] + "\n"
            )

        return content
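
For the first option, picking three random training examples yourself could look roughly like the sketch below. It assumes each training example is already a dict with the same keys as the JSON file above (`input`, `A` to `D`, `target`); the function name `sample_few_shot_block`, the `k` and `seed` parameters, and the toy data are hypothetical and not part of promptbench.

```python
import random


def sample_few_shot_block(train_examples, k=3, seed=0):
    """Randomly pick up to k training examples and format them as a few-shot prefix.

    Assumes each example is a dict with the keys "input", "A", "B", "C", "D"
    and "target" (the schema used in the thread's JSON file).
    """
    rng = random.Random(seed)  # fixed seed so the same examples are reused per run
    chosen = rng.sample(train_examples, min(k, len(train_examples)))

    lines = [f"Here are {len(chosen)} examples."]
    for ex in chosen:
        lines.append(f"Input: {ex['input']}")
        for letter in "ABCD":
            lines.append(f"{letter} : {ex[letter]}")
        lines.append(f"Answer : {ex['target']}")
        lines.append("")  # blank line between examples
    return "\n".join(lines)


# Toy placeholder data (not real MMLU questions), only to exercise the function.
toy_train = [
    {"input": f"Toy question {i}?", "A": "w", "B": "x", "C": "y", "D": "z", "target": "A"}
    for i in range(5)
]

few_shot_block = sample_few_shot_block(toy_train, k=3, seed=0)
# The block is appended to one of the candidate prompts before the test question.
full_prompt = "Answer the following multiple-choice question.\n" + few_shot_block
print(full_prompt)
```

Fixing the seed keeps the three examples the same across prompts, which makes results comparable when only the prompt wording changes.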