  1. Li, Shiwei, et al. "Embedding Compression in Recommender Systems: A Survey." ACM Computing Surveys (2023). link
  2. Zheng, R., Qu, L., Cui, B., Shi, Y., & Yin, H. (2023). Automl for deep recommender systems: A survey. ACM Transactions on Information Systems, 41(4), 1-38. link

Before 2023

  1. D. Kumar Bokde, S. Girase, and D. Mukhopadhyay, “Role of matrix factorization model in collaborative filtering algorithm: A survey,” IJAFRC, 2015. link
  2. M. H. Abdi, G. Okeyo, and R. W. Mwangi, “Matrix factorization techniques for context-aware collaborative filtering recommender systems: A survey,” 2018. link
  3. W.-S. Chen, Q. Zeng, and B. Pan, “A survey of deep nonnegative matrix factorization,” Neurocomputing, vol. 491, pp. 305–320, 2022. link
  4. A. K. Qin, V. L. Huang, and P. N. Suganthan, “Differential evolution algorithm with strategy adaptation for global numerical optimization,” IEEE transactions on Evolutionary Computation, vol. 13, no. 2, pp. 398–417, 2008.
  5. B. Chen, X. Zhao, Y. Wang, W. Fan, H. Guo, and R. Tang, “Automated machine learning for deep recommender systems: A survey,” ArXiv, vol. abs/2204.01390, 2022. link
  6. J. Yu, H. Yin, X. Xia, T. Chen, J. Li, and Z. Huang, “Self-supervised learning for recommender systems: A survey,” TKDE 2022. link
  7. N. M. Nasrabadi and R. A. King, “Image coding using vector quantization: A review,” IEEE Transactions on communications, vol. 36, no. 8, pp. 957–971, 1988. link
  8. S. T. Ali and M. Englis, “Quantization methods: a guide for physicists and analysts,” Reviews in Mathematical Physics, vol. 17, no. 04, pp. 391–490, 2005. link
  9. T.-C. Lu and C.-C. Chang, “A survey of vq codebook generation.” J. Inf. Hiding Multim. Signal Process., vol. 1, no. 3, pp. 190–203, 2010. link
  10. A. Ramanan and M. Niranjan, “A review of codebook models in patch-based visual object recognition,” Journal of Signal Processing Systems, vol. 68, no. 3, pp. 333–352, 2012. link
  11. N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, and A. Galstyan, “A survey on bias and fairness in machine learning,” ACM Computing Surveys (CSUR), vol. 54, no. 6, pp. 1–35, 2021. link


Name Paper code
NSVD/KDDW “Improving regularized singular value decomposition for collaborative filtering,” in Proceedings of KDD cup and workshop, vol. 2007. py
SVD++/KDD “Factorization meets the neighborhood: a multifaceted collaborative filtering model,” in Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, 2008 py
SVDfeature/JMLR “Svdfeature: a toolkit for feature-based collaborative filtering,” The Journal of Machine Learning Research, 2012 toolkit
DELF/IJCAI “Delf: A dual-embedding based deep latent factor model for recommendation.” in IJCAI, -
SLIM/ICDM “Slim: Sparse linear methods for top-n recommender systems,” in 2011 IEEE 11th international conference on data mining. IEEE, 2011, pp. 497–506. py
FISM/KDD “Fism: factored item similarity models for top-n recommender systems,” in Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, 2013, pp. 659–667. tf
NAIS/TKDE “Nais: Neural attentive item similarity model for recommendation,” IEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 12, pp. 2354–2366, 2018 official(tf)
BiasSVD “Matrix factorization techniques for recommender systems,” Computer, vol. 42, no. 8, pp. 30–37, 2009. py
TimeSVD/KDD “Collaborative filtering with temporal dynamics,” in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, 2009. -
SGNS/Neurips “Neural word embedding as implicit matrix factorization,” Advances in neural information processing systems, vol. 27, 2014. py
ConvMF/Recsys “Convolutional matrix factorization for document context-aware recommendation,” in Proceedings of the 10th ACM conference on recommender systems, 2016, pp. 233–240. torch
FM/ICDM “Factorization machines,” in 2010 IEEE International conference on data mining. IEEE, 2010, pp. 995–1000. -
FFM/Recsys “Field-aware factorization machines for ctr prediction,” in Proceedings of the 10th ACM conference on recommender systems, 2016, pp. 43–50. tf
FNN/ECIR “Deep learning over multi-field categorical data,” in European conference on information retrieval. Springer, 2016, pp. 45–57. tf
PNN/ICDM “Product-based neural networks for user response prediction,” in 2016 IEEE 16th International Conference on Data Mining (ICDM). IEEE, 2016, pp. 1149–1154. tf
NFM/SIGIR “Neural factorization machines for sparse predictive analytics,” in Proceedings of the 40th International ACM SIGIR conference on Research and Development in Information Retrieval, 2017, pp. 355–364. torch
AFM “Attentional factorization machines: Learning the weight of feature interactions via attention networks” official
Wide & deep/DLRS “Wide & deep learning for recommender systems,” in Proceedings of the 1st workshop on deep learning for recommender systems, 2016, pp. 7–10 official, torch
DeepFM “DeepFM: a factorization-machine based neural network for ctr prediction,” 2017 torch, tf
DCN “Deep & cross network for ad click predictions,” in Proceedings of the ADKDD’17, 2017, pp. 1–7. torch, tf
xDeepFM/KDD “xdeepfm: Combining explicit and implicit feature interactions for recommender systems,” in Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, 2018. official
Autoint/CIKM “Autoint: Automatic feature interaction learning via self-attentive neural networks,” in Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019 official


Name Paper code
Hashing trick “Feature hashing for large scale multitask learning,” in Proceedings of the 26th annual international conference on machine learning, 2009, pp. 1113–1120. C++
Bloom embedding/Recsys “Getting deep recommenders fit: Bloom embeddings for sparse binary input/output networks,” in Proceedings of the Eleventh ACM Conference on Recommender Systems, 2017, pp. 279–287 -
Hash embeddings/Neurips “Hash embeddings for efficient word representations,” Advances in neural information processing systems, vol. 30, 2017. tf
Hybrid hashing /Recsys “Model size reduction using frequency based double hashing for recommender systems,” in Fourteenth ACM Conference on Recommender Systems, 2020, pp. 521–526. -
Q-R trick/KDD “Compositional embeddings using complementary partitions for memory-efficient recommendation systems,” in Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2020, pp. 165–175. official(torch)
BH/CIKM “Binary code based hash embedding for web-scale applications,” in Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2021, pp. 3563–3567. -
DHE/KDD “Learning to embed categorical features without embedding tables for recommendation,” arXiv preprint arXiv:2010.10784, 2020. -
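
As a rough illustration of the compositional-embedding idea behind the Q-R trick entry above, the sketch below splits a category index into a quotient and a remainder, looks each up in its own small table, and combines the two rows. It is a toy under stated assumptions, not the official implementation: the class name `QREmbedding`, the table sizes, the random initialization, and the choice of element-wise multiplication as the combining operation are all picked here for illustration.

```python
import math
import random


class QREmbedding:
    """Toy quotient-remainder embedding (illustrative, hypothetical names)."""

    def __init__(self, num_categories, num_buckets, dim, seed=0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        num_quotients = math.ceil(num_categories / num_buckets)
        # Two small tables replace one table with num_categories rows.
        self.remainder_table = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_buckets)
        ]
        self.quotient_table = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_quotients)
        ]

    def lookup(self, index):
        # Each index maps to a unique (quotient, remainder) pair.
        q, r = divmod(index, self.num_buckets)
        # Assumption: combine the two rows by element-wise multiplication.
        return [a * b for a, b in zip(self.quotient_table[q], self.remainder_table[r])]

    def num_rows(self):
        return len(self.quotient_table) + len(self.remainder_table)


emb = QREmbedding(num_categories=1_000_000, num_buckets=1000, dim=8)
vec = emb.lookup(123_456)
```

With these example sizes, the two tables hold 2,000 rows in total instead of 1,000,000, while every category still receives its own combination of rows.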


Name Paper Remarks Year Proceeding code
NIS/KDD Neural input search for large scale recommendation models The first work of embedding dimension search (EDS) by reinforcement learning, where the controller directly selects the embedding size. 2020 KDD -
ESAPN “Automated embedding size search in deep recommender systems” Similar to NIS, except that the controller decides whether to increase or maintain the embedding size. 2020 SIGIR torch
DARTS “Darts: Differentiable architecture search” The foundation of the gradient-based embedding size searching methods. 2018 ICLR torch
DNIS “Differentiable neural input search for recommender systems,” To optimize EDS with gradient information, it introduces a soft selection layer that computes a weighted sum over the candidate embedding sizes. 2020 -
AutoSrh “AutoSrh: An Embedding Dimensionality Search Framework for Tabular Data Prediction” It follows the pipeline of DNIS to make tabular data prediction. 2021 CIKM -
AutoEMB “Autoemb: Automated embedding dimensionality search in streaming recommendations,” To improve the hard selection in ESAPN, it also designed the soft selection layer as a weighted sum operation. 2021 ICDM -
AutoDim "Autodim: Field-aware embedding dimension search in recommender systems," It extends the input to various feature fields and employs the Gumbel-softmax trick as a soft selection layer. 2021 WWW -
AMTL "Learning Effective and Efficient Embedding via an Adaptively-Masked Twins-based Layer" It aims to learn a mask matrix to tailor the embedding size. 2021 CIKM -
SSEDS “Single-shot embedding dimension search in a recommender system,” Considering the high training costs of AMTL, it learns the mask matrix by computing a saliency score. 2022 SIGIR -
RULE "Learning elastic embeddings for customizing on-device recommenders," It introduces evolutionary algorithms to conduct EDS. 2021 KDD -
PEP “Learnable embedding sizes for recommender systems,” Different from AMTL, PEP directly prunes the embedding matrix by learning a threshold automatically. 2021 ICLR torch
ANT “Anchor & transform: Learning sparse embeddings for large vocabularies,” It combines learnable anchor embedding matrices to form an optimal embedding matrix. 2020 ICLR -
AutoDis “An embedding learning framework for numerical features in ctr prediction,” Different from the hard selection of anchor embedding in ANT, it designs a differentiable automatic discretization network to execute a soft selection of meta-embeddings. 2021 KDD official
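
The soft selection layer mentioned in the DNIS and AutoEMB rows above can be sketched as a softmax-weighted sum over candidate embeddings of different sizes. This is only a minimal sketch: the function names are hypothetical, and zero-padding shorter candidates to the largest size is an assumption made here to keep the example small (the papers may align dimensions differently, for example with learned projections).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_select(candidates, logits):
    """Weighted sum of candidate embeddings with different sizes.

    Assumption: shorter candidates are implicitly zero-padded to the
    largest candidate size before summation.
    """
    max_dim = max(len(c) for c in candidates)
    weights = softmax(logits)
    out = [0.0] * max_dim
    for w, cand in zip(weights, candidates):
        for j, value in enumerate(cand):
            out[j] += w * value
    return out


# Two candidate sizes (2 and 4) with equal selection logits.
candidates = [[1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
mixed = soft_select(candidates, logits=[0.0, 0.0])
```

Because the weights come from a softmax over learnable logits, the choice of size stays differentiable, which is what lets gradient-based search replace the hard selection of a reinforcement-learning controller.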


Name Paper Remark Year Proceedings code
SimCLR “A simple framework for contrastive learning of visual representations,” in International conference on machine learning. PMLR, 2020, pp. 1597–1607. it adopts an InfoNCE-style loss to maximize the agreement between positive samples and minimize the agreement between negative samples. 2020 ICML torch
SGL “Self-supervised graph learning for recommendation,” in Proceedings of the 44th international ACM SIGIR conference on research and development in information retrieval, 2021, pp. 726–735. it is the first work utilizing contrastive learning for recommendation by designing an auxiliary task to supplement the recommendation model. 2021 SIGIR torch
HHGR “Double-scale self-supervised hypergraph learning for group recommendation,” in Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2021, pp. 2557–2567. it tackles the issue of distorted local structures due to random node dropout by proposing a dual-scale node-dropping strategy. 2021 CIKM official(torch)
CCDR/KDD “Contrastive cross-domain recommendation in matching,” in Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022, pp. 4226–4236. As a cross-domain recommendation method, it transfers valuable node embeddings from a well-equipped source domain to a less-equipped target domain. 2022 KDD official(tf)
PCRec “Pre-training graph neural network for cross domain recommendation,” in 2021 IEEE Third International Conference on Cognitive Machine Intelligence (CogMI). IEEE, 2021, pp. 140–145. it addresses cross-domain tasks by first pre-training on the source domain and then fine-tuning on the target domain. 2021 CogMI -
DCL “Contrastive learning for recommender system,” arXiv preprint arXiv:2101.01317, 2021. it challenges the assumption of interest in only sampled negative items by conducting subgraph sampling. 2021 -
CL4SRec “Contrastive learning for sequential recommendation,” in 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 2022, pp. 1259–1273. it deals with the sequential recommendation task under the contrastive learning paradigm. 2022 ICDE torch
CoSeRec Z. Liu, Y. Chen, J. Li, P. S. Yu, J. McAuley, and C. Xiong, “Contrastive self-supervised sequential recommendation with robust augmentation,” arXiv preprint arXiv:2108.06479, 2021. Similar to CL4SRec, CoSeRec uses item substitution and insertion for data augmentation. 2021 torch
Bert4rec “Bert4rec: Sequential recommendation with bidirectional encoder representations from transformer,” in Proceedings of the 28th ACM international conference on information and knowledge management, 2019, pp. 1441–1450. it is the first to apply BERT to sequential recommendation, masking items in the input sequence and training the model to predict the masked items. 2019 CIKM tf
Unbert “Unbert: User-news matching bert for news recommendation.” in IJCAI, 2021, pp. 3356–3362. it extends the BERT model for news recommendation. 2021 IJCAI -
UPRec “Uprec: User-aware pre-training for recommender systems,” it explores extending the BERT4Rec model to handle heterogeneous information such as user attributes and social networks. 2021 -
PeterRec “Parameter-efficient transfer from sequential behaviors for user modeling and recommendation,” it applies the learning-to-learn idea to the cross-domain recommendation task. 2020 SIGIR official(tf)
ShopperBERT “One4all user representation for recommender systems in ecommerce,” it takes advantage of the rich user behaviors to pre-train the BERT model based on nine auxiliary tasks for the general embedding. 2021 -
G-BERT/IJCAI “Pre-training of graph augmented transformers for medication recommendation,” it obtains sequences from graph data and then employs generative methods using the sequence data. 2019 official(torch)
PMGT/ACM MM “Pre-training graph transformer with multimodal side information for recommendation,” it focuses on multimodal recommendations, handling nodes with diverse information in a multimodal graph. 2021 ACMMM -
PT-GNN/WSDM “Pre-training graph neural networks for cold-start users and items representation,” it addresses cold-start nodes with few neighbors using a meta-learning approach. 2021 WSDM official(tf)
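
Several entries in the table above rely on an InfoNCE-style contrastive objective. The sketch below shows one common form of that loss for a single anchor, assuming cosine similarity and a temperature of 0.5; the function names and the toy vectors are chosen here for illustration and do not come from any of the listed repositories.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE loss for one anchor, one positive and a list of negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    # Low loss means the positive dominates the softmax over all candidates.
    return -math.log(pos / (pos + neg))


anchor = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]
loss_aligned = info_nce(anchor, [1.0, 0.0], negatives)
loss_misaligned = info_nce(anchor, [-1.0, 0.0], negatives)
```

The loss is smaller when the positive view points in the same direction as the anchor than when it points away, which is the agreement-maximizing behavior the remarks describe.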


Name Paper Remark Year Proceeding Code
DPR Discrete personalized ranking for fast collaborative filtering from implicit feedback. DPR maps embeddings to binary codes by approximately optimizing AUC through a least-squares surrogate loss function. 2017 AAAI -
DDL Discrete deep learning for fast content-aware recommendation. DDL uses a bag-of-words model to learn the embedding of the text of an item and minimizes the gap between it and the corresponding binary code. 2018 WSDM MATLAB
PQ Product quantization for nearest neighbor search. PQ decomposes the high-dimensional feature space $\mathbb{R}^{D}$ into a Cartesian product $\mathcal{C}=\mathcal{C}^{1} \times \cdots \times \mathcal{C}^{M} \in \mathbb{R}^{M \times K}$ of lower-dimensional subspaces. 2010 TPAMI py
OPQ Optimized product quantization. Optimized Product Quantization (OPQ) uses a rotation matrix $R \in \mathbb{R}^{D \times D}$ to optimize the decomposition into subspaces and reduce the correlation between them. 2013 TPAMI py
AQ Additive quantization for extreme vector compression. Additive Quantization (AQ) adopts a different strategy from PQ: it directly assigns $M$ codebooks to the whole high-dimensional space, and the $M$ compressed vectors are summed to obtain the quantization embedding. 2014 CVPR -
DPQ Differentiable product quantization for end-to-end embedding compression. DPQ proposes softmax-based and centroid-based methods to minimize reconstruction loss approximately between raw embeddings and quantization embeddings. This approach leads to higher-quality quantization embeddings compared to unsupervised PQ algorithms. 2020 ICML official(tf)
Lightrec Lightrec: A memory and search-efficient recommender system. LightRec enhances reconstruction loss with two functions, minimizing differences in user-item ratings pre- and post-quantization, as well as alterations in recommendation ranking. 2020 WWW official(tf)
PQCF Product quantized collaborative filtering. Product Quantized Collaborative Filtering (PQCF) challenges separate user and item quantization using PQ. Misaligned coordinates make this suboptimal. PQCF minimizes rating prediction loss, moving from Euclidean to the inner product space. 2020 TKDE -
Distill-vq Distill-vq: Learning retrieval oriented vector quantization by distilling knowledge from dense embeddings. Distill-VQ, inspired by knowledge distillation, sets its objective as a similarity function measuring differences in relevance score distributions between teacher and student models (e.g., KL divergence). 2022 SIGIR torch
MOPQ Matching-oriented product quantization for ad-hoc retrieval. Matching-oriented Product Quantization (MoPQ) shows that better quantization reconstruction doesn't always mean better downstream performance. MoPQ improves accuracy through contrastive learning, modeling query-quantization matching via a multinoulli process. 2021 EMNLP torch
xLightFM xlightfm: Extremely memory-efficient factorization machine. xLightFM assigns different codebooks to the features of different fields based on DQN. 2021 SIGIR official(torch)
Online PQ Online product quantization. Online PQ maintains a sliding window for data processing, continuously applying K-means clustering for new codewords. 2018 TKDE -
Online OPQ Online optimized product quantization. Online OPQ extends to streaming data by solving the orthogonal procrustes problem to ensure subspace orthogonality during codebook updates. 2020 ICDM -
Online AQ Online additive quantization. Online AQ extends AQ while keeping the objective function consistent. To adapt to streaming data, Online AQ derives codebook update strategies and related regret bounds via closed-form linear regression solutions and the matrix inversion lemma. 2021 KDD -
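
The PQ row above describes splitting the feature space into lower-dimensional subspaces, each with its own codebook. The following is a minimal, dependency-free sketch of that idea, assuming plain k-means (Lloyd's algorithm) per subspace and a dimension divisible by the number of subspaces; all function names, the synthetic Gaussian data, and the sizes `m=4`, `k=16` are illustrative assumptions rather than any library's API.

```python
import random


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's algorithm; empty clusters keep their old centroid."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[idx].append(p)
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def split(vec, m):
    """Split a vector into m equal-length sub-vectors (len(vec) % m == 0)."""
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]


def train_pq(data, m, k):
    """Learn one codebook of k centroids per subspace."""
    return [kmeans([split(v, m)[j] for v in data], k, seed=j) for j in range(m)]


def encode(vec, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    subs = split(vec, len(codebooks))
    return [
        min(range(len(cb)), key=lambda c: sq_dist(sub, cb[c]))
        for sub, cb in zip(subs, codebooks)
    ]


def decode(codes, codebooks):
    """Concatenate the selected centroids to approximate the original vector."""
    out = []
    for code, cb in zip(codes, codebooks):
        out.extend(cb[code])
    return out


rng = random.Random(42)
data = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
codebooks = train_pq(data, m=4, k=16)
codes = encode(data[0], codebooks)
approx = decode(codes, codebooks)
```

In this toy setting each 8-dimensional vector is stored as 4 small integers in the range 0 to 15, and `decode` recovers an approximation by concatenating the selected centroids.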


Name Paper Remark Year Proceeding Code
Birank Birank: Towards ranking on bipartite graphs. —— 2016 TKDE py
Deepwalk Deepwalk: Online learning of social representations. Deepwalk is a classic graph representation learning method, which is designed based on random walk. 2014 KDD nx
APP Scalable graph embedding for asymmetric proximity. Asymmetric Proximity Preserving (APP) graph embedding points out that in many downstream recommender system applications, the nodes in the graph do not have symmetry. 2017 AAAI toolkit
LINE Line: Large-scale information network embedding. LINE considers both first-order similarity and second-order similarity of nodes. 2015 WWW official
HyperSoRec Exploiting hyperbolic user and item representations with multiple aspects for social-aware recommendation. HyperSoRec develops a hyperbolic mapping layer to map graph embedding in Euclidean space to hyperbolic space. 2021 TOIS -
HGCN Hyperbolic graph convolutional neural networks. HGCN shows that embedding tree-like graphs in hyperbolic space can achieve better performance than embedding them in Euclidean space, which may suffer from severe distortion. 2019 Neurips torch
M2GRL M2grl: A multitask multi-view graph representation learning framework for web-scale recommender systems. M2GRL proposes to combine homogeneous graphs of multiple views to enhance sparse features and enrich node information. 2020 KDD -
DGENN Dual graph enhanced embedding neural network for ctr prediction. DGENN uses more than one view of homogeneous graphs to learn embeddings jointly. 2021 KDD -
Star-GCN Star-gcn: Stacked and reconstructed graph convolutional networks for recommender systems. It simulates embedding new nodes with a masked vector, training the model to reconstruct embeddings. 2019 IJCAI MXNET
NGCF Neural graph collaborative filtering. NGCF addresses GC-MC's drawback by incorporating node features in its embeddings. 2019 SIGIR code
UltraGCN Ultragcn: ultra simplification of graph convolutional networks for recommendation. UltraGCN removes the multi-layer messaging process by directly optimizing the cosine similarity of nodes and neighbors to capture the higher-order collaborative signals between users and items, and it uses a negative sampling strategy to avoid oversmoothing. 2021 CIKM torch
CSE Collaborative similarity embedding for recommender systems. CSE suggests enhancing embeddings by incorporating higher-order similarities among nodes of the same type. 2019 WWW code
BiNE Learning vertex representations for bipartite networks. —— 2020 TKDE official(py)
LightGCN Lightgcn: Simplifying and powering graph convolution network for recommendation. LightGCN argues that as the inputs of the user and item stem from ID embeddings lacking semantic information, there is no need for nonlinear transformations. 2020 SIGIR official(torch)
PGE Learning graph-based embedding for time-aware product recommendation. PGE considers temporal decay for item weights. 2017 CIKM -
GEM Joint event-partner recommendation in event-based social networks. —— 2018 ICDE -
DiffNet A neural influence diffusion model for social recommendation. DiffNet tackles social and user-item networks separately. 2019 SIGIR official(tf)
GraphRec Graph neural networks for social recommendation. GraphRec employs GAT to better model real-world social influence by weighing friends' influence based on the similarity of their initial embeddings. 2019 WWW official(torch)
Diffnet++ Diffnet++: A neural influence and interest diffusion network for social recommendation. DiffNet++ builds upon DiffNet by introducing a unified framework that considers both social networks and user-item bipartite graphs. 2020 TKDE official(tf)
EGES Billion-scale commodity embedding for e-commerce recommendation in alibaba. EGES leverages related side information like brand to aggregate nodes' embeddings for obtaining item embeddings in heterogeneous graphs. 2018 KDD torch
DANSER Dual graph attention networks for deep latent representation of multifaceted social effects in recommender systems. DANSER introduces a dual GAT to capture both static and dynamic influence, acknowledging that a user's impact on friends can vary based on items, leading to more realistic embeddings. 2019 WWW official(tf)
TransGRec Learning to transfer graph embeddings for inductive graph based recommendation. —— 2020 SIGIR -
GHL Gated heterogeneous graph representation learning for shop search in e-commerce. —— 2020 CIKM -
TransE Translating embeddings for modeling multi-relational data. A translation-based model used to learn entity embeddings. 2013 Neurips code
TransH Knowledge graph embedding by translating on hyperplanes. A translation-based model used to learn entity embeddings. 2014 AAAI code
TransR Learning entity and relation embeddings for knowledge graph completion. A translation-based model used to learn entity embeddings. 2015 AAAI code
KGAT Kgat: Knowledge graph attention network for recommendation. KGAT applies GAT to produce higher-quality embeddings in knowledge graphs. 2019 KDD tf
KGIN Learning intents behind interactions with knowledge graph for recommendation. KGCN apply GCN and GAT respectively to produce higher-quality embeddings. 2021 WWW torch
IHGNN Ihgnn: Interactive hypergraph neural network for personalized product search. IHGNN enhances node embeddings through hypergraphs, revealing higher-order interaction patterns in user query histories. 2022 WWW torch
HGNN Hypergraph neural networks. Hypergraph GNNs 2019 AAAI torch
HyperGCN Hypergcn: A new method for training graph convolutional networks on hypergraphs. Hypergraph GNNs 2019 Neurips torch
HyperGroup Hierarchical hyperedge embedding-based representation learning for group recommendation. HyperGroup targets group recommendation, addressing potential misalignment between user and group preferences by representing groups as hyperedges. 2021 TOIS -
HEMR Music recommendation via hypergraph embedding. HEMR focuses on music recommendation using hypergraph embeddings. It employs hyperedge-level random walks, followed by skip-gram for node embedding learning. 2022 TNNLS code
DyGNN Streaming graph neural networks. A method for learning dynamic graph representation. 2020 SIGIR torch
ROLAND Roland: graph learning framework for dynamic graphs. A method for learning dynamic graph representation. 2022 KDD code
GEAR Learning fair node representations with graph counterfactual fairness. A method for learning fair node representations on graphs. 2022 WSDM -
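
The LightGCN remark in the table above says that nonlinear transformations are unnecessary on ID embeddings. A minimal sketch of what such a propagation can look like follows, assuming symmetric degree normalization, no weight matrices or activations, and a uniform average over layers; the function name, the dictionary-based graph format, and the tiny one-user, one-item graph are hypothetical choices for illustration, and every interacted item is assumed to appear in `item_emb`.

```python
import math


def lightgcn_embeddings(user_items, user_emb, item_emb, num_layers=2):
    """Linear neighborhood propagation averaged over layers (sketch)."""
    # Build the reverse adjacency (item -> users) from the interaction lists.
    item_users = {i: [] for i in item_emb}
    for u, items in user_items.items():
        for i in items:
            item_users[i].append(u)

    def propagate(targets, neighbors, neighbor_adj, source):
        out = {}
        for t in targets:
            acc = [0.0] * len(targets[t])
            for n in neighbors.get(t, []):
                # Symmetric normalization by the degrees of both endpoints.
                norm = 1.0 / math.sqrt(len(neighbors[t]) * len(neighbor_adj[n]))
                acc = [a + norm * x for a, x in zip(acc, source[n])]
            out[t] = acc
        return out

    user_layers, item_layers = [user_emb], [item_emb]
    for _ in range(num_layers):
        prev_u, prev_i = user_layers[-1], item_layers[-1]
        user_layers.append(propagate(prev_u, user_items, item_users, prev_i))
        item_layers.append(propagate(prev_i, item_users, user_items, prev_u))

    def average(layers):
        return {
            key: [sum(vals) / len(layers)
                  for vals in zip(*(layer[key] for layer in layers))]
            for key in layers[0]
        }

    return average(user_layers), average(item_layers)


user_items = {"u1": ["i1"]}
users, items = lightgcn_embeddings(
    user_items, {"u1": [1.0, 0.0]}, {"i1": [0.0, 1.0]}, num_layers=2
)
```

On this single-edge graph the user and item simply exchange embeddings at each layer, so the averaged user vector mixes its own embedding with the item's.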