
VIP5

VIP5: Towards Multimodal Foundation Models for Recommendation


Dependencies:
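
Assuming the repository ships a requirements.txt (as its companion P5 codebase does; this file name is an assumption, not stated in this README), the required packages can be installed with:

    pip install -r requirements.txt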

Usage

  1. Clone this repo
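
     For example, assuming the repository is hosted at github.com/jeykigung/VIP5 (the URL is inferred from the authors' GitHub namespace, not stated here):

    git clone https://github.com/jeykigung/VIP5.git
    cd VIP5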

  2. Download the preprocessed data and image features from the provided Google Drive link, then unzip them into the data and features folders
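
     For example, assuming each downloaded archive unzips into its matching folder (the archive names below are assumptions):

    unzip data.zip
    unzip features.zip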

  3. Create snap and log folders to store VIP5 checkpoints and training logs:

    mkdir snap log
    
  4. Run parameter-efficient tuning with the scripts in the scripts folder, for example:

    CUDA_VISIBLE_DEVICES=0,1,2,3 bash scripts/train_VIP5.sh 4 toys 13579 vitb32 2 8 20
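
     Reading this invocation, the positional arguments plausibly correspond to the GPU count (4, matching CUDA_VISIBLE_DEVICES), the dataset split (toys), a distributed-training port (13579), the visual feature type (vitb32, i.e. CLIP ViT-B/32), and three training hyperparameters (2 8 20); the script itself is the authoritative reference. Under that reading, a single-GPU run on the same split would be:

    CUDA_VISIBLE_DEVICES=0 bash scripts/train_VIP5.sh 1 toys 13579 vitb32 2 8 20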
    

Citation

If you find this repository useful, please cite the corresponding paper:

@inproceedings{geng2023vip5,
  title={VIP5: Towards Multimodal Foundation Models for Recommendation},
  author={Geng, Shijie and Tan, Juntao and Liu, Shuchang and Fu, Zuohui and Zhang, Yongfeng},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
  year={2023}
}

Acknowledgements

This codebase builds on P5, VL-T5, PETER, and S3-Rec; we thank their authors for the valuable open-source code.