
LoRA Experiment

Implementation of LoRA: Low-Rank Adaptation of Large Language Models.

This project is part of the TF06 course from ProtonX. We use the LoRA technique to train large language models more efficiently.

We fine-tune Bloomz-1b1 on English-Vietnamese datasets.

Give us a star if this repo is helpful to you.

Slides explaining LoRA (by Nguyen Bui Ngoc Han):

I. How to run our pretrained model?

Just download the .ipynb file and run it on Google Colab or in your own Jupyter Notebook.


Live demo (click the icon below to run it in Colab):

II. How to add LoRA when fine-tuning your own model?

  • Step 1: Load your model.

    For example, suppose you load a model like this:

    from transformers import AutoModelForCausalLM
    from transformers import AutoTokenizer
    modelName = "bigscience/bloomz-1b1" # or any other causal LM on the Hugging Face Hub
    model = AutoModelForCausalLM.from_pretrained(modelName).to(device)
    tokenizer = AutoTokenizer.from_pretrained(modelName)

    Here device is the hardware you run on (GPU or CPU). You can set it automatically with this code:

    import torch
    device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
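
    Optionally, you can check how large the base model is before adding LoRA (bloomz-1b1 has roughly 1.1B parameters):

    # Count the parameters of the base model, for comparison with the LoRA adapter added later
    n_params = sum(p.numel() for p in model.parameters())
    print(f"Base model parameters: {n_params:,}")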
  • Step 2: Prepare the dataset for training.

    For example, if you want to build a text-generation model for a question-answering task, you will need a dataset containing a list of questions and answers. You can try this dataset for practice:

    Get dataset from source:

      !wget https://raw.githubusercontent.com/phatjkk/data/main/LLM/Ecommerce_FAQ_Chatbot_dataset.json
    

    Load dataset as HuggingFace Dataset type:

      from datasets import load_dataset
      from datasets import Dataset
      # The JSON file keeps all QA pairs under a single "questions" key,
      # so unpack that list into a flat Dataset of question/answer rows
      data = load_dataset('json', data_files='Ecommerce_FAQ_Chatbot_dataset.json')
      ds = Dataset.from_list(data["train"]["questions"][0])
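
    To sanity-check what was loaded, you can inspect the dataset object and its first example (the question and answer fields printed here are the ones used in the next step):

      # Quick look at the dataset: row count, column names, and the first QA pair
      print(ds)
      print(ds[0])  # expected to contain "question" and "answer" keys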

    Merge the question and answer columns into a single column called prediction:

      def merge_columns(example):
          example["prediction"] = example["question"] + " ->: " + str(example["answer"])
          return example
      # Map merge_columns function to dataset
      ds = ds.map(merge_columns)
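
    You can spot-check the merged column; each row should now be the question and the answer joined by the " ->: " separator:

      # Inspect the first merged example: "<question> ->: <answer>"
      print(ds[0]["prediction"])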

    Tokenize the prediction column:

      # Tokenize / vectorize the text (convert the text into numbers for training)
      def tokeni(example):
          example["prediction_token"] = tokenizer(example["prediction"], return_tensors='pt', padding=True)['input_ids']
          return example
      # Map tokeni function to dataset
      ds = ds.map(tokeni,batched=True)
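
    If you later want to feed the Trainer a dataset with standard input_ids and attention_mask columns instead of the raw token column above, a minimal alternative tokenization could look like this (tokenize_fn, ds_tokenized, and the max_length value are illustrative, not part of the original notebook):

      # Alternative: tokenize into the columns Hugging Face data collators expect
      def tokenize_fn(examples):
          return tokenizer(examples["prediction"], truncation=True, max_length=256)
      ds_tokenized = ds.map(tokenize_fn, batched=True, remove_columns=ds.column_names)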
  • Step 3: Add a LoRA adapter (LoraConfig) to the model

      # Set the LoRA config
      from peft import LoraConfig, get_peft_model
      config = LoraConfig(
          r=16, # LoRA rank (the dimension of the low-rank update matrices)
          lora_alpha=16, # alpha scaling factor
          lora_dropout=0.05,
          bias="none",
          task_type="CAUSAL_LM" # set this for CLM or Seq2Seq
      )
      # Attach the PEFT (LoRA) adapter to the model
      model_lora = get_peft_model(model, config)

    Here is a brief explanation of the arguments (a quick parameter-count check follows the list):

    • r: The LoRA attention dimension (rank) (int).
    • lora_alpha: The alpha parameter for LoRA scaling.
    • lora_dropout: The dropout probability for the LoRA layers.
    • bias: Bias type for LoRA. Can be 'none', 'all' or 'lora_only'.
    • task_type: The task you want to run (here CAUSAL_LM).
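
    To see how few weights the adapter actually trains compared with the full model, you can print the trainable parameter count (print_trainable_parameters is a helper on the PEFT-wrapped model; the exact numbers depend on the base model):

      # Show trainable vs. total parameters after attaching the adapter
      model_lora.print_trainable_parameters()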
  • Step 4: Train the model

      # Training model
      import transformers
      from transformers import Trainer, EarlyStoppingCallback

      # Split the dataset into train and test sets (here 10% is held out for evaluation)
      ds_tt = ds.train_test_split(test_size=0.1)

      class CustomTrainer(Trainer):
          def compute_loss(self, model, inputs, return_outputs=False):
              outputs = model(**inputs)
              # Use perplexity (the exponential of the language-modeling loss) as the training objective
              perplexity = torch.exp(outputs.loss)
              return (perplexity, outputs) if return_outputs else perplexity

      trainer = CustomTrainer(
          model=model_lora, # train the LoRA-wrapped model from Step 3
          train_dataset=ds_tt["train"]["prediction_token"],
          eval_dataset=ds_tt["test"]["prediction_token"],
          args=transformers.TrainingArguments(
              per_device_train_batch_size=3, # batch size
              num_train_epochs=1, # epochs
              gradient_accumulation_steps=1,
              warmup_steps=100,
              save_total_limit=5,
              learning_rate=2e-4,
              fp16=True,
              output_dir='outputs',
              logging_steps=500,
              evaluation_strategy="steps",
              load_best_model_at_end=True
          ),
          data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
          callbacks=[EarlyStoppingCallback(early_stopping_patience=4)]
      )
      model.config.use_cache = False  # silence the warnings; re-enable for inference!
      trainer.train()
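
    After training, you can also save just the LoRA adapter weights (typically only a few megabytes) and re-attach them to the base model later; a minimal sketch, where "outputs/lora-adapter" is an illustrative path:

      # Save only the adapter weights produced by LoRA
      model_lora.save_pretrained("outputs/lora-adapter")

      # Later: reload the base model and attach the saved adapter for inference
      from peft import PeftModel
      base_model = AutoModelForCausalLM.from_pretrained(modelName).to(device)
      model_lora = PeftModel.from_pretrained(base_model, "outputs/lora-adapter")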

    When the training task finishes, you can plot the training and validation loss curves:

      # Separate the training and validation losses from the trainer's log history
      trainingEpoch_loss_adam, validationEpoch_loss_adam = [], []
      for entry in trainer.state.log_history[:-1]:
          if "loss" in entry:
              trainingEpoch_loss_adam.append(entry["loss"])
          elif "eval_loss" in entry:
              validationEpoch_loss_adam.append(entry["eval_loss"])

      from matplotlib import pyplot as plt
      plt.plot(trainingEpoch_loss_adam, label='train_loss')
      plt.plot(validationEpoch_loss_adam, label='val_loss')
      plt.legend()
      plt.show()

    Example result:

  • Step 5: Test the generation task

    You can generate text from the model like this:

      question = "How can I create an account?"
      prompt = question + " ->: "
      inputs = tokenizer(prompt, return_tensors="pt")
      with torch.autocast(device.type):
          outputs = model_lora.generate(input_ids=inputs["input_ids"].to(device), max_new_tokens=100)
          print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0])

    Example result:

      How can I create an account? ->:  Click the "Create an account" button. Enter your email address and password. Click the "Continue" button.
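
    If you want to run inference without the PEFT wrapper, you can optionally merge the adapter into the base weights (merge_and_unload is a PeftModel helper; the output path below is just illustrative):

      # Fold the LoRA weights into the base model so it behaves like a plain transformers model
      merged_model = model_lora.merge_and_unload()
      merged_model.save_pretrained("outputs/bloomz-1b1-merged")
      tokenizer.save_pretrained("outputs/bloomz-1b1-merged")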

III. About datasets

In this project, we use datasets from three sources:

IV. Results and Comparison

Model results:

- NLLB + viquad Dataset (Vietnamese): (training_loss=2.1773)
- Ecommerce FAQ Chatbot Dataset (English): (training_loss=2.3110)
- Ecommerce FAQ Chatbot Dataset (Vietnamese): (training_loss=2.0299)

Training time comparison:

  • Model bloomz-1b1 trained on the NLLB data for 1 epoch (with LoRA), on a Colab V100

  • Model bloomz-1b1 trained on the NLLB data for 1 epoch (without LoRA), on a Colab V100

Comparison table:

                  LoRA      Without LoRA
  Training time   ~157m     ~202m

So with the LoRA technique, we reduced training time by about 22.2% on the NLLB-57k dataset with the bloomz-1b1 model.

Authors:

Nguyen Thanh Phat (phatjk)

Nguyen Bui Ngoc Han (Nguyễn Hân)

Nguyen Thanh Chung (Edward Nguyen)

Pham Quynh Trang (Trang Pham)

Advisors:

Nguyen Ba Ngoc