karpathy/build-nanogpt

Text generation can use raw_model instead of model

sapphire008 opened this issue · 0 comments

The current script skips the text-generation step when the model is compiled. However, if we change `model(...)` to `raw_model(...)`, we can still generate text even when the model is compiled.
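A minimal sketch of the pattern being proposed: keep a handle to the plain `nn.Module` alongside the `torch.compile` wrapper, train through the compiled handle, and call the raw handle for generation. `Toy` is a hypothetical stand-in module, not the repository's GPT class.

```python
import torch

class Toy(torch.nn.Module):
    """Hypothetical stand-in for the GPT model in train_gpt2.py."""
    def __init__(self):
        super().__init__()
        self.lin = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.lin(x)

raw_model = Toy()                 # plain module handle, used for generation
model = torch.compile(raw_model)  # compiled handle, used for the training loop

x = torch.randn(2, 4)
with torch.no_grad():
    # Generation goes through the uncompiled handle, so it works
    # regardless of whether torch.compile was applied.
    logits = raw_model(x)
```

This mirrors the one-line change the issue suggests in the generation block of `train_gpt2.py`: calling `raw_model(xgen)` instead of `model(xgen)`.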

build-nanogpt/train_gpt2.py

Lines 459 to 461 in 6104ab1

```python
with torch.no_grad():
    with torch.autocast(device_type=device_type, dtype=torch.bfloat16):
        logits, loss = model(xgen) # (B, T, vocab_size)
```