An index of large language models (LLMs) for recommendation systems.
🎉 News: Our LLM4Rec survey, "A Survey on Large Language Models for Recommendation", has been released.
Related work and projects will be updated continuously.
If our work has helped you, please feel free to cite our survey. Thank you.
@article{llm4recsurvey,
author = {Likang Wu and
Zhi Zheng and
Zhaopeng Qiu and
Hao Wang and
Hongchao Gu and
Tingjia Shen and
Chuan Qin and
Chen Zhu and
Hengshu Zhu and
Qi Liu and
Hui Xiong and
Enhong Chen},
title = {A Survey on Large Language Models for Recommendation},
journal = {CoRR},
volume = {abs/2305.19860},
year = {2023}
}
- The papers and related projects
- Generative language models that support Chinese corpora and can be debugged on a single card (RTX 3090)
Note: The tuning here only indicates whether the LLM itself has been tuned.
Name | Scene | Tasks | Information | URL |
---|---|---|---|---|
Amazon Review | Commerce | Seq Rec/CF Rec | This is a large crawl of product reviews from Amazon. Ratings: 82.83 million, Users: 20.98 million, Items: 9.35 million, Timespan: May 1996 - July 2014 | link |
Amazon-M2 | Commerce | Seq Rec/CF Rec | A large dataset of anonymized user sessions with their interacted products collected from multiple language sources at Amazon. It includes 3,606,249 train sessions, 361,659 test sessions, and 1,410,675 products. | link |
Steam | Game | Seq Rec/CF Rec | Reviews represent a great opportunity to break down the satisfaction and dissatisfaction factors around games. Reviews: 7,793,069, Users: 2,567,538, Items: 15,474, Bundles: 615 | link |
MovieLens | Movie | General | The dataset consists of 4 sub-datasets, which describe users' ratings of movies and free-text tagging activities from MovieLens, a movie recommendation service. | link |
Yelp | Commerce | General | There are 6,990,280 reviews, 150,346 businesses, 200,100 pictures, 11 metropolitan areas, 908,915 tips by 1,987,897 users. Over 1.2 million business attributes like hours, parking, availability, etc. | link |
Douban | Movie, Music, Book | Seq Rec/CF Rec | This dataset includes three domains, i.e., movie, music, and book, and different kinds of raw information, i.e., ratings, reviews, item details, user profiles, tags (labels), and date. | link |
MIND | News | General | MIND contains about 160k English news articles and more than 15 million impression logs generated by 1 million users. Each news article contains textual content including title, abstract, body, category, and entities. | link |
U-NEED | Commerce | Conversation Rec | U-NEED consists of 7,698 fine-grained annotated pre-sales dialogues, 333,879 user behaviors, and 332,148 product knowledge tuples. | link |
Some effective open-source projects can be adapted to recommendation systems based on Chinese textual data, which is especially useful for individual researchers!
Project | Year |
---|---|
baichuan-7B | 2023 |
YuLan-chat | 2023 |
Chinese-LLaMA-Alpaca | 2023 |
THUDM/ChatGLM-6B | 2023 |
FreedomIntelligence/LLMZoo Phoenix | 2023 |
bloomz-7b1 | 2023 |
LianjiaTech/BELLE | 2023 |
We hope this summary can help your work.