Repo for the out-of-the-box FP8 training notebook.
This provides a demo of using the unit_scaling library, showing how a model can be easily adapted to train in FP8.
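For intuition about why scaling matters in FP8, here is a minimal sketch using only the Python standard library. It does not use the unit_scaling API, and the helper name `dot_outputs` is hypothetical. It assumes unit-variance Gaussian inputs and weights, and shows that an unscaled dot product has a standard deviation of roughly sqrt(fan_in), while multiplying by 1/sqrt(fan_in) brings it back to roughly 1. Keeping values near unit scale is what helps them fit within the narrow range of FP8 formats.

```python
import math
import random
import statistics


def dot_outputs(fan_in, n_samples, scale, seed=0):
    """Sample dot products of two unit-variance Gaussian vectors.

    Hypothetical helper for illustration only; not part of unit_scaling.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(fan_in)]
        w = [rng.gauss(0.0, 1.0) for _ in range(fan_in)]
        # Each product x_i * w_i has variance 1, so the sum has variance fan_in.
        outputs.append(scale * sum(a * b for a, b in zip(x, w)))
    return outputs


fan_in = 256

# Without scaling, the output standard deviation grows like sqrt(fan_in).
unscaled_std = statistics.pstdev(dot_outputs(fan_in, 1000, scale=1.0))

# Scaling by 1/sqrt(fan_in) restores roughly unit standard deviation.
scaled_std = statistics.pstdev(
    dot_outputs(fan_in, 1000, scale=1.0 / math.sqrt(fan_in))
)

print(f"unscaled std: {unscaled_std:.2f}")
print(f"scaled std:   {scaled_std:.2f}")
```

The notebook itself shows how to apply this to a real model with the library; the sketch above only illustrates the underlying idea.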
📝 View the notebook: out_of_the_box_fp8_training.ipynb
💻 Try the library: graphcore-research.github.io/unit-scaling
📖 Read the paper: arxiv.org/abs/2303.11257
Note that the notebook and library work cross-platform, so they can also be run on a CPU or GPU.
Copyright (c) 2023 Graphcore Ltd. Licensed under the MIT License.
This repository includes modified code from nanoGPT, Copyright (c) 2022 Andrej Karpathy (train.py). It also includes nanoGPT as a submodule.