# VIP5: Towards Multimodal Foundation Models for Recommendation

## Dependencies
- PyTorch 1.12
- transformers
- tqdm
- numpy
- sentencepiece
- pyyaml
- CLIP: `pip install git+https://github.com/openai/CLIP.git`
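
In an existing Python environment, the list above can be installed roughly as follows (a minimal sketch; only the PyTorch 1.12 version is pinned by this README, the rest are left unpinned):

```bash
# PyTorch 1.12; pick the build matching your CUDA setup (see pytorch.org)
pip install torch==1.12.0

# Remaining Python dependencies (versions unpinned, as in the list above)
pip install transformers tqdm numpy sentencepiece pyyaml

# CLIP, installed directly from the official repository
pip install git+https://github.com/openai/CLIP.git
```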

## Usage

1. Clone this repo.
2. Download the preprocessed data and image features from this Google Drive link, then unzip them into the `data` and `features` folders.
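
   The README does not name the downloaded archives, so the commands below are a hypothetical sketch; only `data` and `features` as target folders come from this step:

   ```bash
   # Archive names are placeholders; substitute the actual files from Google Drive
   unzip data.zip      # expected to produce the data/ folder
   unzip features.zip  # expected to produce the features/ folder
   ```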
3. Create `snap` and `log` folders to store VIP5 checkpoints and training logs:

   ```bash
   mkdir snap log
   ```
4. Conduct parameter-efficient tuning with the scripts in the `scripts` folder, for example:

   ```bash
   CUDA_VISIBLE_DEVICES=0,1,2,3 bash scripts/train_VIP5.sh 4 toys 13579 vitb32 2 8 20
   ```
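
   The positional arguments are not documented in this section; the annotated version below is a best guess (the first four plausibly map to GPU count, dataset, distributed-training port, and CLIP backbone), so verify against `scripts/train_VIP5.sh` itself:

   ```bash
   # GPUs visible to the run
   export CUDA_VISIBLE_DEVICES=0,1,2,3

   # Positional arguments (annotations are assumptions, not documented here):
   #   4       number of GPUs, matching CUDA_VISIBLE_DEVICES above
   #   toys    dataset name
   #   13579   port for distributed training
   #   vitb32  CLIP visual backbone (ViT-B/32)
   #   2 8 20  remaining hyperparameters; see scripts/train_VIP5.sh for their meaning
   bash scripts/train_VIP5.sh 4 toys 13579 vitb32 2 8 20
   ```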

## Citation

Please cite the following paper corresponding to this repository:

```bibtex
@inproceedings{geng2023vip5,
  title={VIP5: Towards Multimodal Foundation Models for Recommendation},
  author={Geng, Shijie and Tan, Juntao and Liu, Shuchang and Fu, Zuohui and Zhang, Yongfeng},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
  year={2023}
}
```