Out-of-the-box FP8 Training

This repository contains the out-of-the-box FP8 training notebook: a demo of the unit_scaling library that shows how a model can be easily adapted to train in FP8.

Using unit scaling

📝 View the notebook: out_of_the_box_fp8_training.ipynb

💻 Try the library: graphcore-research.github.io/unit-scaling

📖 Read the paper: arxiv.org/abs/2303.11257
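
For readers new to the idea, the snippet below is a minimal sketch of the principle the paper describes: scaling an operation so that its output has roughly unit variance at initialization, which keeps values inside the narrow range that FP8 can represent. It uses only the Python standard library and does not call unit_scaling. The helper names (`matmul`, `std`) and the sizes are illustrative assumptions, not part of the library or the notebook.

```python
import math
import random
import statistics


def matmul(x, w, scale=1.0):
    """Multiply x (batch x fan_in) by w (fan_in x fan_out), then apply a fixed scale."""
    fan_out = len(w[0])
    return [
        [scale * sum(xi * w_row[j] for xi, w_row in zip(row, w)) for j in range(fan_out)]
        for row in x
    ]


def std(matrix):
    """Population standard deviation over all entries of a nested list."""
    return statistics.pstdev(v for row in matrix for v in row)


random.seed(0)
batch, fan_in, fan_out = 64, 256, 32

# Inputs and weights both start with unit variance.
x = [[random.gauss(0, 1) for _ in range(fan_in)] for _ in range(batch)]
w = [[random.gauss(0, 1) for _ in range(fan_out)] for _ in range(fan_in)]

# Without scaling, the output standard deviation grows like sqrt(fan_in).
unscaled = matmul(x, w)

# A fixed 1/sqrt(fan_in) factor brings the output back to roughly unit scale.
unit_scaled = matmul(x, w, scale=1 / math.sqrt(fan_in))

print(f"unscaled output std:    {std(unscaled):.2f}")
print(f"unit-scaled output std: {std(unit_scaled):.2f}")
```

The first figure should land near sqrt(256) = 16 and the second near 1. The notebook linked above shows how the library applies this kind of scaling to a real model.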

Running the notebook

The notebook and library are cross-platform, so they can also be run on CPU or GPU.

License

Copyright (c) 2023 Graphcore Ltd. Licensed under the MIT License.

This repository includes modified code from nanoGPT, Copyright (c) 2022 Andrej Karpathy (train.py). It also includes nanoGPT as a submodule.