STSP

This repository contains the implementation of STSP.

STSP is the model proposed in 'Point-of-Interest Recommendation for Users-Businesses with Uncertain Check-ins'. It is a framework equipped with category- and location-aware encoders, designed to predict a user's next category and next POI from uncertain check-ins by fusing rich context features.

File Descriptions

data/
- CAL, CHA, PHO: the files for the three cities share the same structure, so CAL is used as the example below.
  - all_POI_CAL.csv: all POIs of Calgary;
  - all_collective_POI_info_CAL.csv: raw collective POIs information of Calgary;
  - check-ins_CAL.csv: reindexed check-ins information of Calgary;
category result/
- CAL
  - user_rep_CAL: folder of user embeddings produced by the category encoder module, stored as many .npy files;
  - L2_id_mapping_CAL.csv: category id mapping file;
  - reindex_data_CAL.csv: reindexed and filtered check-in file;
  - result_CAL.txt: the original category recommendation result of the category encoder module;
  - train_CAL.txt: the original training data of the category encoder;
main.py: main file;
data_preprocess.py: data preprocess file;
category_encoder.py: category encoder module;
POI_encoder.py: POI encoder module.

More Experimental Settings

Environment
- Our proposed STSP and the deep-learning-based baseline MCARNN are implemented using TensorFlow 2.0.0 with Python 3.6.9 from Anaconda 4.7.12. The conventional baselines (MostPop, CateMF, and LBPR) are implemented with Python 3.6.9. For HCT, we directly use the source code provided by its authors. All experiments are carried out on a machine with Windows 10, an Intel Core i7-8565U CPU, and 16 GB of RAM. The following packages are needed (along with their dependencies):
  - tensorflow==2.0.0
  - numpy==1.17.3
  - pandas==0.25.3
  - keras==2.3.1
Data Preprocessing
- Following state-of-the-art methods, we filter out users and POIs with fewer than 10 check-in records. For each user, we split the check-in records into sequences by day: the earliest 80% of the sequences form the training set, the latest 10% form the test set, and the remaining 10% in the middle serve as the validation set for tuning the hyper-parameters.
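
The filtering and per-user chronological split described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the code in data_preprocess.py: the record keys `user`, `poi`, and `time`, the function names, the single-pass filtering, and the rounding of the 80%/10% boundaries are all hypothetical choices.

```python
from collections import Counter, defaultdict
from datetime import datetime


def filter_sparse(checkins, min_count=10):
    """Drop check-ins whose user or POI has fewer than min_count records.

    Assumption: a single filtering pass; the repository may filter differently.
    """
    user_counts = Counter(c["user"] for c in checkins)
    poi_counts = Counter(c["poi"] for c in checkins)
    return [
        c for c in checkins
        if user_counts[c["user"]] >= min_count and poi_counts[c["poi"]] >= min_count
    ]


def split_by_day(checkins, train_frac=0.8, valid_frac=0.1):
    """Group each user's check-ins into daily sequences, then split them in time order.

    Earliest sequences go to train, the middle ones to validation,
    and the latest ones to test.
    """
    per_user = defaultdict(lambda: defaultdict(list))
    for c in sorted(checkins, key=lambda c: c["time"]):
        per_user[c["user"]][c["time"].date()].append(c["poi"])

    train, valid, test = {}, {}, {}
    for user, days in per_user.items():
        seqs = [days[d] for d in sorted(days)]  # one POI sequence per day
        n_train = int(len(seqs) * train_frac)   # assumption: round down
        n_valid = int(len(seqs) * valid_frac)
        train[user] = seqs[:n_train]
        valid[user] = seqs[n_train:n_train + n_valid]
        test[user] = seqs[n_train + n_valid:]
    return train, valid, test
```

A caller would pass a list of dictionaries such as `{"user": 3, "poi": 17, "time": datetime(2020, 1, 5, 12, 30)}`, run `filter_sparse` first, and then `split_by_day` on the result.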

Hyper-parameter Settings

Tables 1-5 summarize the optimal hyper-parameter settings for all methods.

Table 1: Hyper-parameter settings for our STSP.

Hyper-parameters	CHA	PHO	CAL
learning rate for category prediction task η	0.0001	0.0001	0.0001
learning rate for POI prediction task η	0.0001	0.0001	0.001
regularization term λ	0.0025	0.0025	0.0025
number of recurrent layers	3	3	3
embedding size D	120	100	120
category importance α	0.4	0.5	0.5
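
For scripting experiments, the Table 1 values could be kept in a per-city dictionary like the one below. This is a hypothetical container: the key names and the `get_hparams` helper are illustrative and are not taken from main.py, whose actual optional arguments can be listed with -h.

```python
# Hypothetical container for the Table 1 settings; key names are illustrative.
STSP_HPARAMS = {
    "CHA": {"lr_category": 0.0001, "lr_poi": 0.0001, "reg_lambda": 0.0025,
            "num_recurrent_layers": 3, "embedding_size": 120, "category_alpha": 0.4},
    "PHO": {"lr_category": 0.0001, "lr_poi": 0.0001, "reg_lambda": 0.0025,
            "num_recurrent_layers": 3, "embedding_size": 100, "category_alpha": 0.5},
    "CAL": {"lr_category": 0.0001, "lr_poi": 0.001, "reg_lambda": 0.0025,
            "num_recurrent_layers": 3, "embedding_size": 120, "category_alpha": 0.5},
}


def get_hparams(city):
    """Return a copy of the Table 1 settings for one city code (CHA, PHO, or CAL)."""
    try:
        return dict(STSP_HPARAMS[city.upper()])
    except KeyError:
        raise ValueError("unknown city code: %r" % city)
```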

Table 2: Hyper-parameter settings for CateMF.

Hyper-parameters	CHA	PHO	CAL
learning rate η	0.001	0.001	0.001
embedding size D	100	100	120
regularization term λ	0.1	0.1	0.1

Table 3: Hyper-parameter settings for LBPR.

Hyper-parameters	CHA	PHO	CAL
learning rate η	0.001	0.001	0.001
embedding size D	100	100	120
regularization term λ	0.01	0.01	0.01
list size α	2	2	2

Table 4: Hyper-parameter settings for MCARNN.

Hyper-parameters	CHA	PHO	CAL
learning rate η	0.01	0.01	0.01
embedding size D	200	200	200
regularization term λ	0.01	0.01	0.01
number of recurrent layers	2	2	2
weighting factor (λ1, λ2, λ3)	(1, 0.5, 0.05)	(1, 0.5, 0.05)	(1, 0.5, 0.05)

Table 5: Hyper-parameter settings for HCT.

Hyper-parameters	CHA	PHO	CAL
learning rate η	0.001	0.001	0.001
embedding size D	200	200	200
window size ω	2	2	2
weights of categories at layers 1 and 2 (α1, α2)	(0.2, 0.8)	(0.2, 0.8)	(0.2, 0.8)

Training/Testing Time
- The training and testing times of our STSP on the three real-world datasets are listed in Table 6.
Table 6: Training and testing time (seconds) of STSP.

	Training Time	Testing Time
CHA	509.05	433.60
PHO	523.57	798.34
CAL	111.61	79.10

How To Run

$ python main.py (note: use -h to check optional arguments)
