Our old fine-tuning code based on ColossalAI.

colossalai-training-code

This is the training code we used for the first prototype models.

Notably, it's based on an old version of HuggingFace's run_clm.py example, which was then adapted by the ColossalAI developers to make use of some of their optimizations. It was then slightly improved to be usable in real-world scenarios (Tensorboard support, proper checkpointing, etc.).

Usage

This is being committed for archiving purposes, but if you'd like to use it, it probably works. The TL;DR version is:

  • Get all the dependencies installed.
    • I have not documented this properly, but installing transformers and colossalai should probably cover it.
  • Put your data in a file called ./data/train.json.
  • Each line should be a standalone JSON object with a text field holding the raw text that will be tokenized and fed to the model during the training loop.
  • Adjust any relevant config parameters in finetune.bash and run it. If you're lucky, the training loop will eventually start!
    • Metrics should be logged to a runs folder inside the OUTPUT_DIR you've specified, so you can host a Tensorboard server there to watch them.
  • When it's done, you'll probably want to get a proper HF model out of the training checkpoints. You can do that using the provided convert_to_hf.py utility script.
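To illustrate the expected data format from step two, here is a minimal sketch of how you might produce ./data/train.json. The sample strings are placeholders; the only requirement stated above is one JSON object per line with a text field:

```python
import json
import os

# Placeholder documents; in practice these would be your training corpus.
samples = [
    {"text": "First training document goes here."},
    {"text": "Second training document goes here."},
]

os.makedirs("data", exist_ok=True)

# Write JSON Lines: one object per line, each with a "text" field,
# which is what the training loop tokenizes.
with open("data/train.json", "w", encoding="utf-8") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")
```

Despite the .json extension, the file is read line by line (JSON Lines style), so don't wrap the objects in a top-level array.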