/min-max-gpt

Minimal (400 LOC) implementation of Maximum (multi-node, FSDP) GPT training

Primary Language: Python
