This repository contains source code for experiments exploring the feasibility of a multi-modal approach to generating personalized product recommendations with LLMs. The model is trained on a dataset of customer purchase histories, with detailed metadata and corresponding images for the purchased items.
pip install -r requirements.txt
Our project uses the H&M Personalized Fashion Recommendations dataset, released as part of a Kaggle competition by H&M. The dataset consists of customer purchase history over time, detailed metadata of purchased items, product images, and customer demographics. To read more, visit H&M Personalized Fashion Recommendations.
Download the dataset as follows:
Requirements:
- 100 GB free disk space
- 8 CPUs with 32 GB RAM
Before downloading, set up your Kaggle API credentials. Navigate to the Account page on Kaggle, go to the "API" section, and select "Create New API Token". This downloads a kaggle.json file containing your API credentials. Place the file at ~/.kaggle/kaggle.json.
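If you prefer to script this step, here is a minimal Python sketch. The Downloads source path is an assumption; adjust it to wherever your browser saved the file.

```python
import shutil
from pathlib import Path

# Hypothetical source path -- adjust to wherever your browser saved kaggle.json.
src = Path.home() / "Downloads" / "kaggle.json"
dst = Path.home() / ".kaggle" / "kaggle.json"

dst.parent.mkdir(parents=True, exist_ok=True)
if src.exists():
    shutil.copy(src, dst)
    dst.chmod(0o600)  # credentials should be readable by the owner only
```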
# Download dataset
kaggle competitions download -c h-and-m-personalized-fashion-recommendations
# Create a directory to unzip data
mkdir ./data
# Unzip data
unzip h-and-m-personalized-fashion-recommendations.zip -d ./data
The downloaded dataset requires some preprocessing before we can use it: creating train/val/test splits, removing purchased items that have no corresponding images, and so on.
Run the following commands to process the data. Processing will take about 10 minutes.
cd src
# --data_root: directory containing articles.csv, customers.csv, images, etc.
python process_data.py --data_root ../data
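The image-filtering step in the preprocessing can be sketched as follows. This is an illustration, not the script's actual implementation; it assumes the Kaggle layout images/&lt;first 3 digits of id&gt;/&lt;id&gt;.jpg, so adjust the path pattern if your copy differs.

```python
from pathlib import Path

def filter_articles_with_images(article_ids, image_root):
    """Keep only article ids that have a corresponding product image.

    Assumes images are stored as image_root/<first 3 digits>/<id>.jpg,
    with article ids zero-padded to 10 digits (illustrative assumption).
    """
    image_root = Path(image_root)
    kept = []
    for article_id in article_ids:
        aid = str(article_id).zfill(10)
        if (image_root / aid[:3] / f"{aid}.jpg").exists():
            kept.append(article_id)
    return kept
```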
Now run the following commands to prepare the processed data for training. Preprocessing will take about 30 minutes.
cd src
python prepare_data_for_training.py --data_root ../data
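One common way to build the train/val/test splits for purchase data is chronological, holding out the most recent weeks. The sketch below illustrates the idea with hypothetical cutoff dates; the actual cutoffs used by the scripts may differ.

```python
from datetime import date

def chronological_split(transactions, val_start, test_start):
    """Split (t_dat, customer_id, article_id) rows by purchase date.

    Rows before val_start go to train, rows in [val_start, test_start)
    to validation, and the rest to test. Cutoffs are illustrative.
    """
    train, val, test = [], [], []
    for row in transactions:
        t_dat = row[0]
        if t_dat < val_start:
            train.append(row)
        elif t_dat < test_start:
            val.append(row)
        else:
            test.append(row)
    return train, val, test
```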
Download the VQGAN checkpoint pre-trained on ImageNet. See https://heibox.uni-heidelberg.de/d/a7530b09fed84f80a887/ for more details.
Download the pre-trained LPIPS model. See https://heibox.uni-heidelberg.de/f/607503859c864bc1b30b for more details.
Preprocess the downloaded pre-trained model:
python process_pretrained_vqgan.py
python -m lightning.pytorch.utilities.upgrade_checkpoint ./pretrained/vqgan.ckpt
Prepare data for training:
python prepare_vqgan_data.py
Start fine-tuning:
python trainer_vqgan.py
To evaluate the VQGAN model, run the notebook notebooks/evaluate_vqgan.ipynb.
Now we encode the product images using the fine-tuned VQGAN. This will take about an hour.
python encode_data.py
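Conceptually, this step maps each image to a short sequence of discrete codebook indices. The toy sketch below illustrates the quantization idea on plain vectors; the real VQGAN quantizes learned convolutional features against a learned codebook.

```python
def nearest_code(vector, codebook):
    """Index of the codebook entry closest (squared L2) to vector."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sqdist(vector, codebook[i]))

def encode_image(patch_vectors, codebook):
    """Map each patch embedding to a discrete token id, as a VQ quantizer does."""
    return [nearest_code(v, codebook) for v in patch_vectors]
```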
To start training:
cd src
python trainer_bart.py --data_dir ../data --output_dir ../logs
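The exact input format is defined in trainer_bart.py; as a purely hypothetical illustration, purchase histories of VQGAN image tokens might be serialized into source/target id sequences for seq-to-seq training like this (the separator id and vocabulary offset below are made up):

```python
def build_example(history_tokens, next_item_tokens, sep_id, img_offset):
    """Build a (source, target) id pair for seq-to-seq training.

    history_tokens: per-item lists of VQGAN code ids for past purchases.
    Image codes are shifted by img_offset so they do not collide with the
    text vocabulary; sep_id separates items. All ids are illustrative.
    """
    source = []
    for item in history_tokens:
        source.extend(code + img_offset for code in item)
        source.append(sep_id)
    target = [code + img_offset for code in next_item_tokens]
    return source, target
```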
To evaluate the BART model, run the notebook notebooks/evaluate_bart.ipynb.