KellerJordan/modded-nanogpt

Add optimizer and training steps

Closed · 1 comment

Hi, not everyone has a massive GPU. Could you tell me how to modify the code, or modify it yourself, so that it saves the model state dict and the optimizer state at every checkpoint and can resume training from the last one? I only have a 12 GB GPU, so I'd like to train this overnight or so, possibly across several runs. Could you guide me if possible?
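For reference, here is a minimal sketch of the kind of save-and-resume logic being asked about, written in plain PyTorch. It is not taken from this repository's training script: the helper names `save_checkpoint`, `load_checkpoint`, and `train`, the checkpoint path, and the toy `nn.Linear` model with SGD are all hypothetical stand-ins. If the real script uses more than one optimizer, a learning-rate schedule, or a streaming data loader, each of those would presumably need its own entry in the checkpoint too.

```python
import os

import torch
import torch.nn as nn


def save_checkpoint(path, model, optimizer, step):
    """Write model and optimizer state plus the step counter to `path`."""
    tmp_path = path + ".tmp"
    torch.save(
        {
            "step": step,
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
        },
        tmp_path,
    )
    # Rename only after the write finished, so an interrupted save
    # does not leave a truncated checkpoint behind.
    os.replace(tmp_path, path)


def load_checkpoint(path, model, optimizer):
    """Restore state in place; return the step to resume from (0 if no file)."""
    if not os.path.exists(path):
        return 0
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["step"]


def train(path, total_steps, save_every):
    """Toy training loop (hypothetical model and data) that resumes from `path`."""
    torch.manual_seed(0)
    model = nn.Linear(4, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    x, y = torch.randn(8, 4), torch.randn(8, 1)

    start_step = load_checkpoint(path, model, optimizer)
    for step in range(start_step, total_steps):
        loss = ((model(x) - y) ** 2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if (step + 1) % save_every == 0:
            save_checkpoint(path, model, optimizer, step + 1)
    return model
```

With something like this in place, calling `train("ckpt.pt", total_steps, save_every)` again after an interruption would pick up from the last saved step instead of starting over. Saving the optimizer state alongside the weights matters because momentum buffers would otherwise be reset on resume.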