In-context Learning

How to Install

  1. Run pip install -r requirements.txt to install the additional libraries.

  2. Install deepspeed

How to Run (example)

See the conf directory for details about the config options.

deepspeed --include localhost:0,1,2,3 --no_local_rank distributed_main.py ds_configs=zero3 experiments=sst2 models=gpt-j seed=100

Notice!

  • The few-shot data file determines the order and label balance of the demonstrations

    • Order and balance are handled at the label sampling stage
    • See the sample few-shot data for details
  • The experiment config manages templates, verbalizers, methods, etc.

    • Templates include instructions and prompts
    • Methods include the inference method (direct or channel)
  • The DeepSpeed config manages the experiment environment, including dtype, visible GPUs, ZeRO stage, etc.
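
As a rough illustration of how a template, a verbalizer, and the inference method interact, the sketch below builds the (context, continuation) pairs a language model would score for each label. The prompt wording, the label words, and the function names are hypothetical and are not taken from this repository's configs. It assumes the usual meaning of the two methods: direct scores the label word given the input, while channel scores the input given the label word.

```python
# Hypothetical verbalizer for a binary sentiment task (not the repo's config).
VERBALIZER = {0: "negative", 1: "positive"}


def direct_candidates(sentence1):
    """Direct: condition on the input and score each label word."""
    context = f"Review: {sentence1}\nSentiment:"
    return [(context, " " + VERBALIZER[label]) for label in sorted(VERBALIZER)]


def channel_candidates(sentence1):
    """Channel: condition on each label word and score the input."""
    return [
        (f"Sentiment: {VERBALIZER[label]}\nReview:", " " + sentence1)
        for label in sorted(VERBALIZER)
    ]


direct = direct_candidates("great film")
channel = channel_candidates("great film")
```

In both cases the predicted label would be the one whose continuation receives the highest log-probability given its context; the scoring itself is left out here because it depends on the model.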

Sampling Random Dataset

Run generated_fewshot.py to randomly sample datasets. It generates train.jsonl for the training set and test.jsonl for the test set. (The test set is a copy of the original dataset; only the formatting changes.) Each sample is saved as a JSON object with the following fields:

  • label : label of the sample
  • sentence1 : first input of the sample.
  • sentence2 : second input of the sample. For single-sentence tasks, sentence2 is not given.
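
A minimal sketch of what records with these fields could look like when written to and read from a .jsonl file (one JSON object per line). The example sentences, the integer label values, and the helper names to_jsonl and from_jsonl are assumptions for illustration; the actual label type depends on the task.

```python
import json

# Hypothetical records; only the field names come from the list above.
single = {"label": 1, "sentence1": "a charming and often affecting journey"}
pair = {
    "label": 0,
    "sentence1": "A man is playing a guitar.",
    "sentence2": "A person plays an instrument.",
}


def to_jsonl(records):
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records) + "\n"


def from_jsonl(text):
    """Parse JSONL text back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


roundtrip = from_jsonl(to_jsonl([single, pair]))
```

Note that the single-sentence record simply has no sentence2 key, so readers of the file should use something like record.get("sentence2") rather than indexing directly.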

Parameters

  1. task_name
  2. benchmark_name : select from glue, super_glue, tweet_eval, huggingface
    • huggingface is for tasks without any specific benchmark (e.g., trec, ag_news)
  3. output_dir
  4. seed : random seed
  5. n_samples : number of samples (= k)
  6. balance : if given, an equal number of samples is drawn per class
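
The effect of seed, n_samples, and balance can be sketched as follows. This is not the code in generated_fewshot.py; the function name sample_few_shot is hypothetical, and the sketch assumes that n_samples is the total number of demonstrations (so each class receives n_samples divided by the number of classes).

```python
import random
from collections import defaultdict


def sample_few_shot(records, n_samples, seed, balance=False):
    """Draw n_samples records; with balance=True, draw equally from each label."""
    rng = random.Random(seed)  # local RNG so the result depends only on seed
    if not balance:
        return rng.sample(records, n_samples)

    # Group records by label.
    by_label = defaultdict(list)
    for record in records:
        by_label[record["label"]].append(record)

    labels = sorted(by_label)
    per_class, remainder = divmod(n_samples, len(labels))
    if remainder:
        raise ValueError("n_samples must be divisible by the number of classes")

    picked = []
    for label in labels:
        picked.extend(rng.sample(by_label[label], per_class))
    rng.shuffle(picked)  # avoid grouping demonstrations by label
    return picked


# Toy dataset with two classes, ten records each.
records = [{"label": i % 2, "sentence1": f"s{i}"} for i in range(20)]
demos = sample_few_shot(records, n_samples=8, seed=100, balance=True)
```

The final shuffle matters because the order of demonstrations is fixed at this stage and carried through the few-shot data file.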

To run the sample script: sh sample_scripts/generate_dataset.sh