This repository contains the implementation of Deconfounded Causal Collaborative Filtering (DCCF).
Paper: Deconfounded Causal Collaborative Filtering
Paper Link: https://dl.acm.org/doi/full/10.1145/3606035
Environment requirements can be found in ./requirement.txt
- Electronics and CDs and Vinyl: The original dataset can be found here.
- Yelp: The original dataset can be found here.
- The data processing code can be found in ./src/data_preprocessing/
The feature_embedding is generated by a pre-trained sentence embedding model. We used the pre-trained paraphrase-distilroberta-base-v1 model from a public sentence-transformers implementation: https://github.com/UKPLab/sentence-transformers
Specifically, we take the average of the embeddings of the 'title', 'description', and 'feature' fields as feature_embedding. We used the pre-trained model to encode each field separately and computed the average manually.
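The averaging step described above can be sketched roughly as follows. This is a minimal illustration, not the repository's preprocessing code: the function names, the dict-based item layout, and the choice to encode a missing field as an empty string are all assumptions. The sentence-transformers import is done lazily so the averaging logic can be tried with any encoder callable.

```python
import numpy as np

# Metadata fields named in this README; the dict-based item layout is an assumption.
FIELDS = ("title", "description", "feature")


def average_field_embedding(item, encode, fields=FIELDS):
    """Encode each field separately and return the element-wise mean vector.

    `encode` is any callable mapping a string to a 1-D vector.
    Missing fields are encoded as an empty string (an assumption of this sketch).
    """
    vectors = [
        np.asarray(encode(str(item.get(field, ""))), dtype=np.float32)
        for field in fields
    ]
    return np.mean(np.stack(vectors), axis=0)


def load_sentence_encoder(model_name="paraphrase-distilroberta-base-v1"):
    """Return an encode(text) callable backed by sentence-transformers."""
    # Imported lazily so the averaging helper works without the package installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return lambda text: model.encode(text)
```

With the package installed, `average_field_embedding(item, load_sentence_encoder())` would return one vector per item, which could then be stacked into the feature_embedding array.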
To generate the exposure probability in ips_expo_prob.npy, we train an IPSBiasedMF model and save the full predicted user-item matrix as the exposure probability.
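As a rough sketch of that export step, the snippet below builds a full user-item prediction matrix from biased matrix factorization parameters and writes it to ips_expo_prob.npy. The parameter names and shapes are hypothetical, and squashing the scores through a sigmoid to obtain values in (0, 1) is an assumption of this sketch; the actual IPSBiasedMF model in ./src/ may differ.

```python
import numpy as np


def predict_full_matrix(user_emb, item_emb, user_bias, item_bias,
                        global_bias=0.0, squash=True):
    """Score every (user, item) pair with a biased matrix factorization model.

    user_emb: (n_users, d), item_emb: (n_items, d),
    user_bias: (n_users,), item_bias: (n_items,).
    """
    scores = user_emb @ item_emb.T            # (n_users, n_items) dot products
    scores = scores + user_bias[:, None] + item_bias[None, :] + global_bias
    if squash:
        # Assumption: map raw scores into (0, 1) so they read as probabilities.
        scores = 1.0 / (1.0 + np.exp(-scores))
    return scores


def save_exposure_prob(matrix, path="ips_expo_prob.npy"):
    """Persist the dense matrix in NumPy's .npy format."""
    np.save(path, matrix)
```

Note that the dense matrix has n_users x n_items entries, so for large datasets its memory footprint may need to be checked before saving.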
After generating feature_embedding and exposure probability, we run the code to train DCCF.
For example:
# DCCF on Electronics dataset
> cd ./src/
> python ./main.py --rank 1 --model_name DCCF --optimizer Adam --lr 0.001 --dataset Electronics --metric ndcg@5,recall@5,precision@5 --gpu 0 --epoch 100 --test_neg_n 1000
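For readers unfamiliar with the metric string, the following sketch shows one common way ndcg@5, recall@5, and precision@5 are computed when a single held-out positive item is ranked against sampled negatives. We assume here that --test_neg_n sets the number of sampled negatives per test case and that each test case has one positive; neither is confirmed by this README, and the evaluation code in ./src/ may be implemented differently.

```python
import math


def metrics_at_k(pos_score, neg_scores, k=5):
    """Rank one positive among sampled negatives.

    Returns (ndcg@k, recall@k, precision@k) for a single test case.
    """
    # 1-based rank of the positive; ties are resolved in its favor.
    rank = 1 + sum(1 for score in neg_scores if score > pos_score)
    hit = rank <= k
    # With one relevant item the ideal DCG is 1, so NDCG reduces to the DCG term.
    ndcg = 1.0 / math.log2(rank + 1) if hit else 0.0
    recall = 1.0 if hit else 0.0
    precision = 1.0 / k if hit else 0.0
    return ndcg, recall, precision
```

Averaging these per-case values over all test cases would give the reported metrics under the assumptions stated above.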
@article{xu2023deconfounded,
title={Deconfounded causal collaborative filtering},
author={Xu, Shuyuan and Tan, Juntao and Heinecke, Shelby and Li, Vena Jia and Zhang, Yongfeng},
journal={ACM Transactions on Recommender Systems},
volume={1},
number={4},
pages={1--25},
year={2023},
publisher={ACM New York, NY}
}