- An experimental repository for training and running inference with LLMs under low GPU memory usage.
- Flash Attention 2 must be installed for low-memory training, whether or not Unsloth is used. Flash Attention 2 requires the CUDA Toolkit to be installed globally, but cuDNN does not need to be installed globally. See the sketch below for a quick check that the setup works.
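
A minimal sketch (not part of this repository's code) of verifying that Flash Attention 2 is usable and loading a model with it through Hugging Face Transformers. The model name is only a placeholder; substitute whichever model you are training or serving.

```python
import torch
import flash_attn  # building/importing this requires the CUDA Toolkit installed globally
from transformers import AutoModelForCausalLM

# Quick sanity checks before training/inference.
print("flash-attn version:", flash_attn.__version__)
print("CUDA available:", torch.cuda.is_available())

# Load the model in bf16 with Flash Attention 2 to reduce GPU memory usage.
# "meta-llama/Llama-2-7b-hf" is just an example checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
).to("cuda")
```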