Welcome to the LLM-Lora-PEFT_accumulate repository!
This repository contains implementations and experiments related to Large Language Models (LLMs) using PEFT (Parameter-Efficient Fine-Tuning), LoRA (Low-Rank Adaptation of Large Language Models), and QLoRA (Quantized LLMs with Low-Rank Adapters).
You can easily add adapters to a frozen 8-bit model, reducing the memory required for optimizer states by training only a small fraction of the parameters.
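The low-rank idea behind these adapters can be sketched in a few lines of NumPy. This is illustrative only, not the 🤗 PEFT library's implementation, and the dimensions and rank below are example values: the pretrained weight `W` stays frozen, while only the small factors `A` and `B` would be trained.

```python
import numpy as np

# Minimal sketch of a LoRA-style adapter (illustrative; real use goes
# through the PEFT library). W is frozen; only A and B are trainable.
in_dim, out_dim, r = 512, 512, 4     # example dimensions and rank
rng = np.random.default_rng(0)

W = rng.standard_normal((out_dim, in_dim))       # frozen pretrained weight
A = rng.standard_normal((r, in_dim)) * 0.01      # small random init
B = np.zeros((out_dim, r))                       # zero init -> delta starts at 0
alpha = 8.0                                      # scaling hyperparameter

def lora_forward(x):
    # y = x W^T + (alpha / r) * x (B A)^T
    return x @ W.T + (alpha / r) * (x @ (B @ A).T)

x = rng.standard_normal((1, in_dim))
# With B = 0 the adapter is a no-op, so the output matches the frozen model.
assert np.allclose(lora_forward(x), x @ W.T)

# Parameter counts: full fine-tuning vs. low-rank adapter.
print(W.size, A.size + B.size)   # 262144 frozen vs 4096 trainable
```

At rank 4 the adapter trains roughly 1.5% of the layer's parameters; combining this with an 8-bit frozen base model is what shrinks the optimizer-state memory.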
- HF-BitsandBytes-Integration
- 🤗 PEFT: Parameter-Efficient Fine-Tuning of Billion-Scale Models on Low-Resource Hardware
- LLM.int8() and Emergent Features
- Tensorfloat-32-precision-format
- RLHF-LLM
- Finetuning Falcon LLMs More Efficiently With LoRA and Adapters by Sebastian Raschka
- Boost Fine-Tuning Performance of LLM: Optimal Architecture w/ PEFT LoRA Adapter-Tuning on Your GPU
- How to finetune your own Alpaca 7B
- PEFT: Parameter Efficient Fine Tuning
- LoRA: Low-Rank Adaptation of Large Language Models
- QLoRA: Quantized LLMs with Low-Rank Adapters
- LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
- SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
See LLM Analysis with SWOT for further clarification.