OpenP5: An Open-Source Platform for Developing, Training, and Evaluating LLM-based Recommender Systems

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Introduction

This repository presents OpenP5, an open-source platform for developing, fine-tuning, and evaluating LLM-based recommender systems.

Paper: OpenP5: Benchmarking Foundation Models for Recommendation
Paper link: https://arxiv.org/pdf/2306.11134.pdf

A related repository on how to create item IDs for recommendation foundation models is available here:

Paper: How to Index Item IDs for Recommendation Foundation Models
Paper link: https://arxiv.org/pdf/2305.06569.pdf
GitHub link: https://github.com/Wenyueh/LLM-RecSys-ID

News

- [2023.9.16] OpenP5 now supports both T5 and LLaMA-2 backbone LLMs.

- [2023.6.10] OpenP5 now supports 10 datasets and 3 item ID indexing methods for both sequential recommendation and straightforward recommendation tasks.

Environment

The environment requirements are listed in ./environment.txt.

Data Statistics

The statistics of the selected ten datasets can be found below:

| Datasets | ML-1M | Yelp | LastFM | Beauty | ML-100K |
| --- | --- | --- | --- | --- | --- |
| #Users | 6,040 | 277,631 | 1,090 | 22,363 | 943 |
| #Items | 3,416 | 112,394 | 3,646 | 12,101 | 1,349 |
| #Interactions | 999,611 | 4,250,483 | 52,551 | 198,502 | 99,287 |
| Sparsity | 95.16% | 99.99% | 98.68% | 99.93% | 92.20% |

| Datasets | Clothing | CDs | Movies | Taobao | Electronics |
| --- | --- | --- | --- | --- | --- |
| #Users | 39,387 | 75,258 | 123,960 | 6,104 | 192,403 |
| #Items | 23,033 | 64,443 | 50,052 | 4,192 | 63,001 |
| #Interactions | 278,677 | 1,697,533 | 1,697,533 | 46,337 | 1,689,188 |
| Sparsity | 99.97% | 99.96% | 99.97% | 99.82% | 99.99% |
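
The sparsity values above are consistent with the usual definition, sparsity = 1 - #Interactions / (#Users × #Items), that is, the share of empty cells in the user-item interaction matrix. The following is a minimal sketch under that assumption; the function name `sparsity` is illustrative and is not part of the OpenP5 code base.

```python
def sparsity(num_users: int, num_items: int, num_interactions: int) -> float:
    """Return the percentage of empty cells in the user-item matrix.

    Assumes the common definition 1 - interactions / (users * items);
    this is a sketch, not code taken from OpenP5.
    """
    return 100.0 * (1.0 - num_interactions / (num_users * num_items))


# (users, items, interactions) copied from the statistics table
stats = {
    "ML-1M": (6040, 3416, 999611),
    "LastFM": (1090, 3646, 52551),
    "ML-100K": (943, 1349, 99287),
}

for name, (users, items, interactions) in stats.items():
    print(f"{name}: {sparsity(users, items, interactions):.2f}%")
```

The same function can be applied to the remaining datasets to sanity-check statistics after regenerating the data.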

Usage

Download the data from the Google Drive link and put it into the ./data folder.

Run the following command to generate all data:

sh generate_dataset.sh

The training commands can be found in the ./command folder. For example, run:

cd command
sh ML1M_t5_sequential.sh

Checkpoint

The evaluation commands can be found in the ./test_command folder. For example, run:

cd ./test_command
sh ML1M_t5_sequential.sh

Citation

@article{xu2023openp5,
  title={OpenP5: Benchmarking Foundation Models for Recommendation},
  author={Xu, Shuyuan and Hua, Wenyue and Zhang, Yongfeng},
  journal={arXiv:2306.11134},
  year={2023}
}
@article{hua2023index,
  title={How to Index Item IDs for Recommendation Foundation Models},
  author={Hua, Wenyue and Xu, Shuyuan and Ge, Yingqiang and Zhang, Yongfeng},
  journal={SIGIR-AP},
  year={2023}
}