Fine-tuning large language models for the Kaggle competition "LLM - Detect AI Generated Text"
- Models
- BERT: Single & multi-GPU data parallel
- Llama-2-7b: multi-GPU model parallel
- BERT
- Single GPU
python codes/bert/single.py $num_epochs $save_every
- Multi-GPU
torchrun --standalone --nproc_per_node=gpu codes/bert/multigpu.py $num_epochs $save_every
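A minimal sketch of the data-parallel setup that the torchrun command above drives. The model, dataset, and hyperparameters below are placeholders standing in for the repo's actual BERT fine-tuning code; `$num_epochs` and `$save_every` correspond to the loop bounds and checkpoint cadence noted in the comments.

```python
# Sketch of multi-GPU data-parallel training as launched by torchrun.
# A tiny linear model and random data stand in for BERT; assumptions,
# not the repo's actual training script.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

def main(num_epochs: int = 2, save_every: int = 1) -> None:
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    dist.init_process_group(backend="gloo")  # use "nccl" on GPUs
    rank = dist.get_rank()

    model = DDP(torch.nn.Linear(8, 2))  # stand-in for the BERT classifier

    data = TensorDataset(torch.randn(32, 8), torch.randint(0, 2, (32,)))
    # DistributedSampler gives each rank a disjoint shard of the data.
    sampler = DistributedSampler(data)
    loader = DataLoader(data, batch_size=4, sampler=sampler)

    opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.CrossEntropyLoss()
    for epoch in range(num_epochs):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()  # DDP all-reduces gradients here
            opt.step()
        if rank == 0 and (epoch + 1) % save_every == 0:
            # only rank 0 writes checkpoints to avoid clobbering
            torch.save(model.module.state_dict(), "checkpoint.pt")
    dist.destroy_process_group()

# Only run when launched under torchrun, which provides the rank env vars.
if __name__ == "__main__" and "RANK" in os.environ:
    main()
```

The `--nproc_per_node=gpu` flag spawns one such process per visible GPU; `DistributedSampler` plus the gradient all-reduce inside `DDP` is what makes the per-process loops equivalent to one large-batch run.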
- LLAMA 2
- Train only the final layer weights, since I only have access to a Titan X
python codes/llama2/finetune_classifier.py
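A sketch of the last-layer-only fine-tuning idea used for Llama-2-7b above. A small stand-in network replaces the 7B model so the snippet runs anywhere; the helper name and the module names `backbone`/`head` are illustrative assumptions, not the repo's code.

```python
# Last-layer-only fine-tuning: freeze every parameter except the
# classification head, then give the optimizer only the trainable ones.
# The tiny Sequential model is a stand-in for Llama-2-7b.
import torch
import torch.nn as nn

def freeze_all_but_head(model: nn.Module, head_name: str) -> None:
    """Disable gradients everywhere except the named classification head."""
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(head_name)

# Stand-in: a frozen "backbone" plus a trainable 2-way classifier head.
model = nn.Sequential()
model.add_module("backbone", nn.Linear(16, 16))
model.add_module("head", nn.Linear(16, 2))
freeze_all_but_head(model, "head")

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
# Only the head's weight and bias remain trainable, so optimizer state
# and gradient memory cover a tiny fraction of the parameters.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
print(trainable)  # → ['head.weight', 'head.bias']
```

This is what makes a 7B model trainable on a single Titan X: frozen parameters need no gradients or optimizer state, so only the small head contributes to training memory.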
- Requirements
  - torch: tested on v2.1.0
  - transformers: tested on v4.35.2
  - datasets: tested on v2.15.0
- TODO: add Phi-2 and Mistral
- Code partially adapted from PyTorch Examples and the Hugging Face NLP course