
Curricular Learning

A fork of nanoGPT for experimenting with curricular learning.


Install

pip install torch numpy transformers datasets tiktoken wandb tqdm

Dependencies:

  • pytorch <3
  • numpy <3
  • transformers for huggingface transformers <3
  • datasets for huggingface datasets <3 (if you want to download + preprocess the Wikipedia dataset)
  • tiktoken for OpenAI's fast BPE code <3
  • wandb for optional logging <3
  • tqdm for progress bars <3

Prepare

$ python data/sorted_wikipedia_dataset/prepare.py

This creates 5 training shards and a val.bin in that data directory.
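The shards and val.bin follow the nanoGPT on-disk format: a flat binary file of uint16 token ids with no header, which training code can memory-map. The sketch below illustrates that format only; the token ids and file name are made up for the example (the real prepare.py tokenizes Wikipedia text, with tiktoken's GPT-2 BPE in upstream nanoGPT).

```python
import os
import tempfile
import numpy as np

tmpdir = tempfile.mkdtemp()
shard_path = os.path.join(tmpdir, "shard_0.bin")  # illustrative name

# Pretend these token ids came from a BPE tokenizer; GPT-2's vocab
# (50257 tokens) fits in uint16, which is why nanoGPT uses that dtype.
token_ids = np.array([464, 2068, 7586, 21831, 50256], dtype=np.uint16)
token_ids.tofile(shard_path)  # one flat array of uint16 ids, no header

# Training code can then memory-map the shard instead of loading it all:
data = np.memmap(shard_path, dtype=np.uint16, mode="r")
print(len(data))  # 5
```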


Train

Now rename the shard you want to train on to train.bin, and remember to change num_iters in the config accordingly, or training will not run.

$ python train.py config/train_wikipedia_shards.py
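The staging step above can be sketched as follows. The shard file name is an assumption (check data/sorted_wikipedia_dataset/ for the names prepare.py actually writes), and the sketch uses a temporary directory so it is self-contained.

```python
import os
import shutil
import tempfile

# Stand-in for data/sorted_wikipedia_dataset in this self-contained sketch.
data_dir = tempfile.mkdtemp()

# Create a dummy shard; in the repo this would already exist from prepare.py.
with open(os.path.join(data_dir, "shard_0.bin"), "wb") as f:
    f.write(b"\x00\x01")

# Copy (rather than rename) the current shard to train.bin, so the shard
# file is still around if you want to revisit it in a later curriculum pass.
shutil.copyfile(os.path.join(data_dir, "shard_0.bin"),
                os.path.join(data_dir, "train.bin"))

print(sorted(os.listdir(data_dir)))  # ['shard_0.bin', 'train.bin']
```

After staging a shard this way (and adjusting num_iters in the config), run the train.py command above.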