LightGCN+

This is a tensorflow implementation for my Master of Science Thesis:

Master of Science Thesis Paper:

Eczaneler İçin Kullanıcı Tabanlı Müstahzar Öneri Sistemi (User-Based Preparation/Drug Recommendation System for Pharmacies) - 2022

Introduction

In this study, a hybrid recommendation system has been proposed that will increase the efficiency of the systems in pharmacies. A new system which called LightGCN+ aims to improve the LightGCN by adding item-item relations next to user-item relations. In addition three datasets have been proposed about drug/preparation.

Environment Requirement

The code has been tested running under Python 3.6.5. The required packages are as follows:

tensorflow == 1.11.0
numpy == 1.14.3
scipy == 1.1.0
sklearn == 0.19.1
cython == 0.29.15

Examples to run a 3-layer LightGCN

The instruction of commands has been clearly stated in the codes (see the parser function in LightGCN/utility/parser.py).

Drug-Relation-90

Command

python LightGCN.py --dataset Drug-Relation-90 --regs [1e-4] --embed_size 64 --layer_size [64,64,64] --lr 0.001 --batch_size 2048 --epoch 1000

Example Output log (Not Real):

eval_score_matrix_foldout with cpp
n_users=12765, n_items=8415
n_interactions=33217
n_train=23000, n_test=10217, sparsity=0.00031
      ...
Epoch 1 [20.3s]: train==[0.46925=0.46911 + 0.00014]
Epoch 2 [25.1s]: train==[0.21866=0.21817 + 0.00048]
      ...
Epoch 879 [81.6s + 31.3s]: test==[0.13271=0.12645 + 0.00626 + 0.00000], recall=[0.18201], precision=[0.05601], ndcg=[0.15555]
Early stopping is trigger at step: 5 log:0.18201370537281036
Best Iter=[38]@[32829.6]	recall=[0.40890], precision=[0.02151], ndcg=[0.20539]

Drug-Relation-180

Command

python LightGCN.py --dataset Drug-Relation-180 --regs [1e-4] --embed_size 64 --layer_size [64,64,64] --lr 0.001 --batch_size 2048 --epoch 1000

Example Output log (Not Real):

eval_score_matrix_foldout with cpp
n_users=19210, n_items=9793
n_interactions=57763
n_train=27000, n_test=23763, sparsity=0.00031
      ...
Epoch 1 [20.3s]: train==[0.46925=0.46911 + 0.00014]
Epoch 2 [25.1s]: train==[0.21866=0.21817 + 0.00048]
      ...
Epoch 879 [81.6s + 31.3s]: test==[0.13271=0.12645 + 0.00626 + 0.00000], recall=[0.18201], precision=[0.05601], ndcg=[0.15555]
Early stopping is trigger at step: 5 log:0.18201370537281036
Best Iter=[38]@[32829.6]    recall=[0.48135], precision=[0.02727], ndcg=[0.21539]

Drug-Relation-270

Command

python LightGCN.py --dataset Drug-Relation-270 --regs [1e-4] --embed_size 64 --layer_size [64,64,64] --lr 0.001 --batch_size 2048 --epoch 1000

Example Output log (Not Real):

eval_score_matrix_foldout with cpp
n_users=22701, n_items=10282
n_interactions=75480
n_train=40000, n_test=35480, sparsity=0.00032
      ...
Epoch 1 [20.3s]: train==[0.46925=0.46911 + 0.00014]
Epoch 2 [25.1s]: train==[0.21866=0.21817 + 0.00048]
      ...
Epoch 879 [81.6s + 31.3s]: test==[0.13271=0.12645 + 0.00626 + 0.00000], recall=[0.18201], precision=[0.05601], ndcg=[0.15555]
Early stopping is trigger at step: 5 log:0.18201370537281036
Best Iter=[38]@[32829.6]    recall=[0.51632], precision=[0.02945], ndcg=[0.22551]

NOTE : the duration of training and testing depends on the running environment.

Dataset

We provide three processed datasets: Drug-Relation-90, Drug-Relation-180 and Drug-Relation-270.

train.txt
- Train file.
- Each line is a user with her/his positive interactions with items: userID\t a list of itemID\n.
test.txt
- Test file (positive instances).
- Each line is a user with her/his positive interactions with items: userID\t a list of itemID\n.
- Note that here we treat all unobserved interactions as the negative instances when reporting performance.
item_item.txt
- Item-Item relation file.
- Each line is a item with its positive interactions with other items: itemID\t a list of itemID\n.
user_list.txt
- User file.
- Each line is a triplet (org_id, remap_id) for one user, where org_id and remap_id represent the ID of the user in the original and our datasets, respectively.
item_list.txt
- Item file.
- Each line is a triplet (org_id, remap_id) for one item, where org_id and remap_id represent the ID of the item in the original and our datasets, respectively.

mcakar-dev/LightGCN-plus