Custom LLM Finetuning Handbook

Welcome to the Custom LLM Finetuning Handbook, a worked example of fine-tuning a Large Language Model (LLM) from data preparation to deployment. This handbook covers data preparation, supervised fine-tuning (SFT), direct preference optimization (DPO), evaluation on common benchmarks, and deployment using vLLM with an intuitive user interface.

We have uploaded the trained checkpoint:

Data Preparation

Data preparation is the most crucial part of the pipeline. It involves the following steps:

  • Instruct Mining: This involves extracting and refining instructions from various sources to create a structured dataset for training.
  • OpenAI API: Utilize the OpenAI API for additional data collection and preprocessing tasks.
  • Hust Clean: A process to clean and standardize custom-domain data.
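
As a rough illustration of the cleaning and standardizing step, the sketch below normalizes whitespace and drops empty or duplicate instruction records. The field names `instruction` and `output` and the helpers `normalize_text` and `clean_records` are assumptions made for this example; they are not the schema or code used by this handbook.

```python
import re
from typing import Dict, Iterable, List


def normalize_text(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def clean_records(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Normalize records, then drop empty and duplicate ones (order preserved)."""
    seen = set()
    cleaned = []
    for record in records:
        # Hypothetical field names; adapt them to your own dataset schema.
        instruction = normalize_text(record.get("instruction", ""))
        output = normalize_text(record.get("output", ""))
        if not instruction or not output:
            continue  # skip incomplete instruction/response pairs
        key = (instruction.lower(), output.lower())
        if key in seen:
            continue  # skip case-insensitive duplicates
        seen.add(key)
        cleaned.append({"instruction": instruction, "output": output})
    return cleaned


raw = [
    {"instruction": "  What is   LoRA? ", "output": "A low-rank adaptation method."},
    {"instruction": "what is lora?", "output": "A low-rank adaptation method."},
    {"instruction": "", "output": "orphan answer"},
]
print(clean_records(raw))
```

Only the first record survives here: the second is a case-insensitive duplicate and the third has no instruction.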

The collected data is available at https://huggingface.co/datasets/lu-vae/cciip-gpt-dataset

Training the Language Model

SFT

For SFT, you can use either LoRA (Low-Rank Adaptation) or full model fine-tuning. The primary repository for this process is https://github.com/LZY-the-boys/axolotl. Clone it and customize the YAML configuration file.
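
To make the trade-off between LoRA and full fine-tuning concrete: LoRA freezes a weight matrix of shape d_out × d_in and trains only two low-rank factors, B (d_out × r) and A (r × d_in). The sketch below compares trainable parameter counts for one such matrix. The dimension 4096 and rank 8 are illustrative assumptions, not settings taken from this handbook.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for LoRA factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)


# Illustrative values only (assumed, not from the handbook's configuration).
d_out = d_in = 4096
rank = 8

full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

Under these assumed sizes, LoRA trains well under one percent of the parameters of the matrix it adapts, which is why it fits on much smaller hardware than full fine-tuning.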

For multi-modal training, see: https://github.com/OpenAccess-AI-Collective/axolotl/tree/llava-train

For a LoRA example, see: https://github.com/Facico/Chinese-Vicuna

Dependencies:

  • transformers 4.36
  • peft 0.6.0
  • trl 0.7.2
  • bitsandbytes 0.41.2
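
One way to confirm that an environment matches the versions listed above is a small check built on the standard library's `importlib.metadata`. This is a sketch, not part of the handbook's tooling; the helper names `matches_pin` and `check_environment` are hypothetical, and a pin such as `4.36` is treated here as a prefix match (any `4.36.x`), which is an assumption about how the list is meant to be read.

```python
from importlib import metadata

# Versions copied from the dependency list above.
PINNED = {
    "transformers": "4.36",
    "peft": "0.6.0",
    "trl": "0.7.2",
    "bitsandbytes": "0.41.2",
}


def matches_pin(installed: str, pinned: str) -> bool:
    """True if the installed version starts with the pinned components."""
    pinned_parts = pinned.split(".")
    return installed.split(".")[: len(pinned_parts)] == pinned_parts


def check_environment(pins=PINNED) -> dict:
    """Return a status string per package: 'ok', 'missing', or a mismatch note."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if matches_pin(installed, pinned):
            report[name] = "ok"
        else:
            report[name] = f"found {installed}, expected {pinned}"
    return report


for package, status in check_environment().items():
    print(f"{package}: {status}")
```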

DPO

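DPO (direct preference optimization) fine-tunes the model directly on pairs of preferred and rejected responses, so it aligns the model with human preferences without training a separate reward model. The repositories involved are: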
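
For reference, the DPO objective for a single preference pair can be sketched in a few lines. The loss is -log σ(β · [(log π(y_w) − log π_ref(y_w)) − (log π(y_l) − log π_ref(y_l))]), where y_w is the preferred response and y_l the rejected one. The function below is a minimal standalone illustration, not the training code of any repository used here; the argument names and the default β = 0.1 are assumptions.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; tune it for your own setup
) -> float:
    """DPO loss for one (chosen, rejected) pair of sequence log-probabilities."""
    # Log-ratio of policy to reference model for each response.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), computed stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the margin is zero and the loss is log(2).
baseline = dpo_loss(-5.0, -7.0, -5.0, -7.0)
# Policy now favors the chosen response more than the reference does.
improved = dpo_loss(-4.0, -8.0, -5.0, -7.0)
print(baseline, improved)
```

The loss falls below log(2) as soon as the policy raises the preferred response's likelihood relative to the reference more than it raises the rejected one's.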

Evaluation

Leaderboard:

Eval vision model

Leaderboard: TODO

Merging the Experts

  • Merging-Kit: https://github.com/arcee-ai/mergekit
  • Twin-Merging: https://github.com/LZY-the-boys/Twin-Merging
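
To convey the basic idea behind merging, the sketch below shows the simplest possible scheme: a weighted average of matching parameters across expert checkpoints. This is plain linear averaging, shown only for intuition; it is not the Twin-Merging algorithm, and it does not use the mergekit API. Parameters are represented as plain lists of floats, and the name `linear_merge` is hypothetical.

```python
from typing import Dict, List, Sequence

StateDict = Dict[str, List[float]]


def linear_merge(models: Sequence[StateDict], weights: Sequence[float]) -> StateDict:
    """Weighted average of parameters that share the same name across models."""
    if not models or len(models) != len(weights):
        raise ValueError("need one weight per model and at least one model")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    names = models[0].keys()
    for model in models:
        if model.keys() != names:
            raise ValueError("all models must have the same parameter names")

    merged: StateDict = {}
    for name in names:
        # zip(*...) walks the parameter element by element across all models.
        columns = zip(*(model[name] for model in models))
        merged[name] = [
            sum(w * v for w, v in zip(weights, column)) / total
            for column in columns
        ]
    return merged


expert_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
expert_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
merged = linear_merge([expert_a, expert_b], weights=[0.5, 0.5])
print(merged)
```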

Deploy UI

The final step is to serve your fine-tuned LLM with vLLM and expose it through a user-friendly interface, so that the model is ready for use in real-world applications.

vLLM deployment: https://github.com/LZY-the-boys/vllm
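
Once a server is running, a client can query it over HTTP. The sketch below assumes the server exposes the OpenAI-compatible `/v1/completions` endpoint that upstream vLLM provides; whether the fork linked above behaves identically is an assumption. The base URL `http://localhost:8000` and the model name `my-finetuned-model` are placeholders, and the helper names are hypothetical.

```python
import json
import urllib.request


def build_completion_request(
    base_url: str,
    model: str,
    prompt: str,
    max_tokens: int = 128,
    temperature: float = 0.7,
) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the first generated completion text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["text"]


# Placeholder URL and model name; replace both with your own deployment's values.
request = build_completion_request(
    "http://localhost:8000", "my-finetuned-model", "Hello, who are you?"
)
print(request.full_url)
# With a server running, call: print(complete(request))
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made.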

UI: