This code is for the paper "Diversity-Enhanced Recommendation with Knowledge-Aware Devoted and Diverse Interest Learning", which is accepted by IJCNN 2023.
All experiments are run on a machine containing 256GB of RAM, and 4 NVIDIA 3090 or 4 NVIDIA A40 graphics cards in Ubuntu 20.04.
Due to the limitation of upload file size, this zip only contains the preprocessed dataset and training weights of MovieLens.
- Python 3.8
- CUDA 11.4
- pytorch 1.11
- numpy 1.20.3
- pandas 1.1.4
- torch-scatter 2.0.9
- sklearn 0.24.2
- networkx 2.6.3
-
Install all the requirements.
-
Train and evaluate the DERK using the Python script main.py.
To demonstrate the reproducibility of the best performance reported in our paper and faciliate researchers to track whether the model status is consistent with ours, we provide the best parameter settings (see utils/parser.py) and the log for our trainings.
For efficiency, we use multiple GPUs for training via DistributedDataParallel. To train our method with multiple GPUs on MovieLens-1M dataset, you can run
python main.py --dataset ml-1m --devices 0,1,2,3
If you want to specify only one GPU for training, you can run
python main.py --dataset ml-1m --devices 0
To fast reproduce the results on MovieLens-1M in our paper, you can run
python main.py --dataset ml-1m --test
We provide three processed datasets: MovieLens-1M, Alibaba-iFashion, and Amazon-book.
- You can find the full version of recommendation datasets via MovieLens, Alibaba-iFashion, and Amazon.
- We use the preprocessed Alibaba-iFashion and Amazon-Book datasets released by KGIN.
- On MovieLens-1M dataset, We follow KB4Rec to map items into Freebase entities via title matching if there is a mapping available. Then we can obtain
relation_list.txt
,entity_list.txt
andkg_final.txt
with remap ID (see preposess.py). - We counted the categories of items in each dataset and assigned IDs to them (see example of MovieLens-1M category_map.ipynb), and then corresponded the items to the categories to get
item_cate.txt
(see example of MovieLens-1M item_cate.ipynb).
MovieLens-1M | Alibaba-iFashion | Amazon-book | ||
---|---|---|---|---|
User-Item Interaction | #Users | 6,040 | 114,737 | 70,679 |
#Items | 3,706 | 30,040 | 24,915 | |
#Interactions | 734,564 | 1,380,510 | 652,514 | |
#Categories | 18 | 75 | 598 | |
Knowledge Graph | #Entities | 42,097 | 89,196 | 113,487 |
#Relations | 52 | 51 | 39 | |
#Triplets | 603,188 | 279,155 | 2,557,746 |
train.txt
- Train file.
- Each line is a user with her/his positive interactions with items: (
userID
anda list of itemID
).
test.txt
- Test file (positive instances).
- Each line is a user with her/his positive interactions with items: (
userID
anda list of itemID
). - Note that here we treat all unobserved interactions as the negative instances when reporting performance.
user_list.txt
- User file.
- Each line is a triplet (
org_id
,remap_id
) for one user, whereorg_id
andremap_id
represent the ID of such user in the original and our datasets, respectively.
item_list.txt
- Item file.
- Each line is a triplet (
org_id
,remap_id
,freebase_id
) for one item, whereorg_id
,remap_id
, andfreebase_id
represent the ID of such item in the original, our datasets, and freebase, respectively.
entity_list.txt
- Entity file.
- Each line is a triplet (
freebase_id
,remap_id
) for one entity in knowledge graph, wherefreebase_id
andremap_id
represent the ID of such entity in freebase and our datasets, respectively.
relation_list.txt
- Relation file.
- Each line is a triplet (
freebase_id
,remap_id
) for one relation in knowledge graph, wherefreebase_id
andremap_id
represent the ID of such relation in freebase and our datasets, respectively.
kg_final.txt
- KG triplets file.
- Each line is a triplet (
Entity remap_id
,Relation remap_id
,Entity remap_id
) for one head entity from knowledge graph.
category_list
- Categoty file.
- Each line is a triplet (
category
,cate_count
,cate_id
) for one category in dataset, wherecategory
,cate_count
andcate_id
is the original category, the number of this category in dataset and the ID of this category, respectively.
item_cate.txt
- Item's category file
- Each line is a triplet (
item_id
,cate_id
) for one item, whereitem_id
andcate_id
represent the ID of such item in our datasets and category, respectively.