A deep Markov model for clickstream analytics in online shopping

ClickstreamDMM: A deep Markov model for clickstream analytics in online shopping

This repository contains the original implementation of the paper. To cite the paper, use the following BibTeX entry:

@inproceedings{ozyurt2022deep,
  title={A deep Markov model for clickstream analytics in online shopping},
  author={Ozyurt, Yilmazcan and Hatt, Tobias and Zhang, Ce and Feuerriegel, Stefan},
  booktitle={Proceedings of the ACM Web Conference 2022},
  pages={3071--3081},
  year={2022}
}

Dependencies

For straightforward use of ClickstreamDMM, install the required libraries from requirements.txt: pip install -r requirements.txt

Dataset

We ran our experiments on a clickstream dataset provided by our partner company. Here, we release the fully anonymized, pre-processed dataset for benchmarking, which you can find in the data folder.

To be compatible with ClickstreamDMM, each split directory (e.g. data/splits0) must contain the following files:

  1. Time-series of pages:
    1. timeseries_train.npy
    2. timeseries_val.npy
    3. timeseries_test.npy
  2. Time-series of log TSP (time spent on each page):
    1. delta_time_train_log.npy
    2. delta_time_val_log.npy
    3. delta_time_test_log.npy
  3. Time-series of cumulative time up to the page:
    1. cum_time_train.npy
    2. cum_time_val.npy
    3. cum_time_test.npy
  4. Static features:
    1. static_train.npy
    2. static_val.npy
    3. static_test.npy
  5. Purchase labels:
    1. y_train.npy
    2. y_val.npy
    3. y_test.npy
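
The file layout above can be checked before training with a small script. This is a minimal sketch and not part of the repository: the helper names expected_files and missing_files are hypothetical, and only the file names are taken from the list above.

```python
import os

# File name patterns taken from the list above; "{split}" is train, val, or test.
FILE_TEMPLATES = (
    "timeseries_{split}.npy",       # time-series of pages
    "delta_time_{split}_log.npy",   # log TSP (time spent on each page)
    "cum_time_{split}.npy",         # cumulative time up to the page
    "static_{split}.npy",           # static features
    "y_{split}.npy",                # purchase labels
)
SPLITS = ("train", "val", "test")


def expected_files():
    """Return the 15 file names a split directory should contain (hypothetical helper)."""
    return [t.format(split=s) for t in FILE_TEMPLATES for s in SPLITS]


def missing_files(split_dir):
    """Return the expected file names that are absent from split_dir (hypothetical helper)."""
    return [
        name for name in expected_files()
        if not os.path.isfile(os.path.join(split_dir, name))
    ]


if __name__ == "__main__":
    missing = missing_files("./data/splits0")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

Once the check passes, each file can be read with numpy.load. The array shapes and dtypes are not described here, so inspect them after loading.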

Example Usage

For training: python main.py --cuda --experiments_main_folder experiments --experiment_folder default --log clickstreamdmm.log --save_model model --save_opt opt --checkpoint_freq 10 --eval_freq 10 --data_folder ./data/splits0

All log files and model checkpoints are saved under current_dir/experiments_main_folder/experiment_folder/

For testing: python main.py --cuda --experiments_main_folder experiments --experiment_folder default --log clickstreamdmm_eval.log --load_model model_best --load_opt opt_best --eval_mode --data_folder ./data/splits0

Note that experiments_main_folder and experiment_folder must match the values used for training so that the correct model is loaded. After testing is done, the prediction outputs are written to current_dir/experiments_main_folder/experiment_folder/purchase_predictions_test.csv
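
One way to keep the two folder arguments consistent between training and testing is to build both commands from the same values. The sketch below is illustrative only: build_commands is a hypothetical helper, not part of main.py, and it uses only the flags and default values shown in the commands above.

```python
import os
import shlex


def build_commands(main_folder="experiments", experiment_folder="default",
                   data_folder="./data/splits0"):
    """Return (train_cmd, test_cmd, predictions_path) that share the same folders.

    Hypothetical helper: it only assembles the argument lists shown in this
    README and does not run anything.
    """
    shared = [
        "python", "main.py", "--cuda",
        "--experiments_main_folder", main_folder,
        "--experiment_folder", experiment_folder,
    ]
    train_cmd = shared + [
        "--log", "clickstreamdmm.log",
        "--save_model", "model", "--save_opt", "opt",
        "--checkpoint_freq", "10", "--eval_freq", "10",
        "--data_folder", data_folder,
    ]
    test_cmd = shared + [
        "--log", "clickstreamdmm_eval.log",
        "--load_model", "model_best", "--load_opt", "opt_best",
        "--eval_mode",
        "--data_folder", data_folder,
    ]
    # Location of the prediction file, relative to the current directory.
    predictions_path = os.path.join(
        main_folder, experiment_folder, "purchase_predictions_test.csv"
    )
    return train_cmd, test_cmd, predictions_path


if __name__ == "__main__":
    train_cmd, test_cmd, predictions_path = build_commands()
    print(shlex.join(train_cmd))
    print(shlex.join(test_cmd))
    print("Predictions expected at:", predictions_path)
```

The argument lists can be passed to subprocess.run as they are, for example subprocess.run(train_cmd, check=True), run from the repository root.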

For the full set of arguments, see main.py.