Let's reproduce GPT-2 (124M)
Primary Language: Jupyter Notebook
torchrun --standalone --nproc_per_node=$RUNPOD_GPU_COUNT train_gpt2.py