llm_low_mem_exps

Random experiments for LLM training/inference with low GPU usage.

  • An experimental repository for training and inference of LLMs with low GPU memory usage.

Requirement Notes

  • Flash Attention 2 must be installed for low-memory training, whether or not Unsloth is used. Flash Attention 2 requires the CUDA Toolkit to be installed globally, but cuDNN does not need to be installed globally. A minimal example of enabling it is sketched below.
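
As a minimal sketch (not taken from the repo's notebooks), this is how Flash Attention 2 is typically enabled when loading a model with Hugging Face Transformers once the `flash-attn` package is installed; the model name and dtype here are assumptions for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.2-1B"  # hypothetical model choice

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,                # half precision to keep memory low
    attn_implementation="flash_attention_2",   # requires the flash-attn package
    device_map="auto",
)
```

With Unsloth, the attention backend is handled by its own model loader, so the explicit `attn_implementation` argument above is only needed for the plain Transformers path.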