Implementation of LoRA: Low-Rank Adaptation of Large Language Models.
This project is part of the TF06 course from ProtonX. We use the LoRA technique to make training a Large Language Model more efficient.
We fine-tune Bloomz-1b1 on English and Vietnamese datasets.
Give us a star if this repo is helpful to you.
Slides explaining LoRA (by Nguyen Bui Ngoc Han):
Just download the .ipynb file and run it on Google Colab or in your Jupyter Notebook.
Live demo (Click icon below to run in Colab):
Step 1: Load your model.
For example, suppose you load a model like this:
from transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

modelName = "bigscience/bloomz-1b1"  # Or whatever you want in HuggingFace
model = AutoModelForCausalLM.from_pretrained(modelName).to(device)
tokenizer = AutoTokenizer.from_pretrained(modelName)
Here device is the hardware the model runs on (GPU or CPU). Define it before loading the model; you can set it automatically with this code:
import torch

device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
Step 2: Prepare dataset for Training.
For example, if you want to build a text-generation model for a question-answering task, you will need a dataset containing a list of questions and answers. You can try this dataset for practice:
Get dataset from source:
Load dataset as HuggingFace Dataset type:
from datasets import load_dataset
from datasets import Dataset

data = load_dataset('json', data_files='Ecommerce_FAQ_Chatbot_dataset.json')
ds = Dataset.from_list(data["train"]["questions"][0])
Merge question and answer columns into one called prediction:
def merge_columns(example):
    example["prediction"] = example["question"] + " ->: " + str(example["answer"])
    return example

# Map merge_columns function to dataset
ds = ds.map(merge_columns)
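To see what the merged column looks like, here is a minimal sketch that applies the same function to a single hand-written record. The sample question and answer are illustrative only and are not taken from the dataset.

```python
def merge_columns(example):
    # Join question and answer with the " ->: " separator used above.
    example["prediction"] = example["question"] + " ->: " + str(example["answer"])
    return example

# Hypothetical record, for illustration only.
sample = {"question": "How can I reset my password?", "answer": "Use the reset link."}
merged = merge_columns(dict(sample))
print(merged["prediction"])
```

The same " ->: " separator is what the prompt must end with at generation time, so that the model sees the pattern it was trained on.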
Tokenize the prediction column:
# Tokenize/vectorize the text (convert text to numbers for training)
def tokeni(example):
    example["prediction_token"] = tokenizer(example["prediction"], return_tensors='pt', padding=True)['input_ids']
    return example

# Map tokeni function to dataset
ds = ds.map(tokeni, batched=True)
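With padding=True, the tokenizer pads every sequence in a batch to the length of the longest one so they can be stacked into a single tensor. The sketch below imitates that behavior in plain Python. The helper pad_batch, the pad ID 0 and the token IDs are all hypothetical; a real tokenizer uses its own pad token ID and may pad on the left rather than the right.

```python
def pad_batch(sequences, pad_id=0):
    # Pad every sequence on the right to the length of the longest one.
    longest = max(len(seq) for seq in sequences)
    return [seq + [pad_id] * (longest - len(seq)) for seq in sequences]

# Hypothetical token IDs, for illustration only.
batch = [[101, 7, 42], [101, 9], [101, 3, 5, 8]]
padded = pad_batch(batch)
for row in padded:
    print(row)
```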
Step 3: Add LoraConfig Adapter to model
# Set config for LoRA
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=16,                   # rank of the LoRA update matrices
    lora_alpha=16,          # alpha scaling
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",  # set this for CLM or Seq2Seq
)

# Set peft adapter to model
model_lora = get_peft_model(model, config)
The arguments used in this code are:
- r: LoRA attention dimension (int), i.e. the rank of the update matrices.
- lora_alpha: The alpha parameter for LoRA scaling.
- lora_dropout: The dropout probability for LoRA layers.
- bias: Bias type for LoRA. Can be 'none', 'all' or 'lora_only'.
- task_type: The task you want to run.
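To make r and lora_alpha more concrete: instead of updating a full weight matrix, LoRA learns two small matrices whose product has rank r, and that product is scaled by lora_alpha / r before being added to the frozen weight. The sketch below counts trainable parameters for one layer. The 1024 x 1024 layer size and both helper functions are assumptions chosen for illustration, not values read from Bloomz-1b1.

```python
def lora_trainable_params(d_out, d_in, r):
    # LoRA learns B (d_out x r) and A (r x d_in) instead of the full matrix.
    return r * (d_out + d_in)

def lora_scaling(lora_alpha, r):
    # The low-rank update B @ A is multiplied by lora_alpha / r.
    return lora_alpha / r

d_out, d_in = 1024, 1024  # hypothetical layer size
r, lora_alpha = 16, 16    # values from the config above

full = d_out * d_in
lora = lora_trainable_params(d_out, d_in, r)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4f}")
print(f"scaling: {lora_scaling(lora_alpha, r)}")
```

A larger r gives the adapter more capacity at the cost of more trainable parameters; with r equal to lora_alpha, the update is applied with a scale of 1.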
Step 4: Training model
# Training model
import transformers
from transformers import Trainer, EarlyStoppingCallback

class CustomTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        outputs = model(**inputs)
        # Use perplexity (exp of the cross-entropy loss) as the training objective
        perplexity = torch.exp(outputs.loss)
        return (perplexity, outputs) if return_outputs else perplexity

# ds_tt is not defined above; it is assumed to be a train/test split of ds
trainer = CustomTrainer(
    model=model_lora,  # train the model with the LoRA adapter attached
    train_dataset=ds_tt["train"]["prediction_token"],  # token IDs from Step 2
    eval_dataset=ds_tt["test"]["prediction_token"],
    args=transformers.TrainingArguments(
        per_device_train_batch_size=3,  # batch size
        num_train_epochs=1,  # epochs
        gradient_accumulation_steps=1,
        warmup_steps=100,
        save_total_limit=5,
        learning_rate=2e-4,
        fp16=True,
        output_dir='outputs',
        logging_steps=500,
        evaluation_strategy="steps",
        load_best_model_at_end=True
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=4)]
)
model.config.use_cache = False  # silence the warnings. Please re-enable for inference!
trainer.train()
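The custom trainer optimizes torch.exp(outputs.loss), which is the perplexity of the batch. As a minimal sketch of that relationship on plain numbers: perplexity is the exponential of the mean per-token cross-entropy loss. The helper name and the loss values below are hypothetical and serve only to illustrate the formula.

```python
import math

def perplexity(token_losses):
    # Perplexity is exp of the mean per-token cross-entropy loss.
    mean_loss = sum(token_losses) / len(token_losses)
    return math.exp(mean_loss)

# Hypothetical per-token losses, for illustration only.
losses = [2.0, 2.5, 1.5]
ppl = perplexity(losses)
print(f"mean loss = {sum(losses) / len(losses):.2f}, perplexity = {ppl:.2f}")
```

Because exp is monotonic, minimizing perplexity and minimizing cross-entropy push the model in the same direction, but the gradients are scaled differently, so the logged loss values are not directly comparable with a plain cross-entropy run.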
When training finishes, you can plot the training and validation loss curves:
from matplotlib import pyplot as plt

# Training log entries carry "loss"; evaluation entries carry "eval_loss"
trainingEpoch_loss_adam = [e["loss"] for e in trainer.state.log_history if "loss" in e]
validationEpoch_loss_adam = [e["eval_loss"] for e in trainer.state.log_history if "eval_loss" in e]

plt.plot(trainingEpoch_loss_adam, label='train_loss')
plt.plot(validationEpoch_loss_adam, label='val_loss')
plt.legend()
plt.show()
Example result:
You can generate text from the model like this:
question = "How can I create an account?"
prompt = question+" ->: "
inputs = tokenizer(prompt, return_tensors="pt")
with torch.autocast(device.type):
    outputs = model.generate(input_ids=inputs["input_ids"].to(device), max_new_tokens=100)
print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0])
Example Result:
How can I create an account? ->: Click the "Create an account" button. Enter your email address and password. Click the "Continue" button.
In this project we use datasets from 4 sources:
- Kaggle Ecommerce FAQ Chatbot Dataset (English)
- Kaggle Ecommerce FAQ Chatbot Dataset (Vietnamese)
- NLLB_translations_Vietnamese_51k
Model result:
- NLLB + viquad Dataset (Vietnamese): (training_loss=2.1773)
- Ecommerce FAQ Chatbot Dataset (English): (training_loss=2.3110)
- Ecommerce FAQ Chatbot Dataset (Vietnamese): (training_loss=2.0299)
Training time comparison:
- Model bloomz-1b1 train data NLLB, 1 epoch (Using LoRA) (Train on V100 Colab)
- Model bloomz-1b1 train data NLLB, 1 epoch (without LoRA) (Train on V100 Colab)
Comparison table:
| | LoRA | Without LoRA |
| Training time | ~157m | ~202m |
So with the LoRA technique, we reduce training time by roughly 22.2% on the NLLB-57k dataset with the bloomz-1b1 model.
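The reduction follows directly from the two training times. The short sketch below redoes the arithmetic with the approximate values from the table; since those values are rounded, the computed figure can differ slightly in the last digit from the one quoted above. The helper name is hypothetical.

```python
def percent_reduction(baseline, improved):
    # Relative saving of `improved` compared with `baseline`, in percent.
    return (baseline - improved) / baseline * 100

without_lora_min = 202  # approximate, from the table above
with_lora_min = 157     # approximate, from the table above

saving = percent_reduction(without_lora_min, with_lora_min)
print(f"{saving:.1f}% less training time")
```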
Nguyen Thanh Phat (phatjk)
- Github:
- Linkedin:
- Email:
Nguyen Bui Ngoc Han (Nguyễn Hân)
- Github:
- Linkedin:
- Email:
Nguyen Thanh Chung (Edward Nguyen)
- Github:
- Linkedin:
- Email:
Pham Quynh Trang (Trang Pham)
Nguyen Ba Ngoc
- Github:
- Linkedin:
- Email: