/CTRRecommenderModels

I surveyed techniques and papers on CTR prediction and recommender systems from industry and academia, and implemented 25 commonly used models in PyTorch for reuse (2023).

Primary language: Jupyter Notebook

CTRRecommenderModels (ongoing)

1. Summary of recent experience and survey of frontier research

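I systematically summarized recommender systems in academia and industry into three chapters, "Feature Engineering", "Recall", and "Ranking", covering key technical points and frontier research.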

1.1. Search, advertising, and recommendation: frontier papers on "Feature Engineering":

Multi-modal Representation Learning for Short Video Understanding and Recommendation. ICME Workshops 2019.

An Embedding Learning Framework for Numerical Features in CTR Prediction, KDD 2021.

Dynamic Explicit Embedding Representation for Numerical Features in Deep CTR Prediction, CIKM 2022.

Numerical Feature Representation with Hybrid N-ary Encoding, CIKM 2022.

AutoFeature: Searching for Feature Interactions and Their Architectures for Click-through Rate Prediction, CIKM 2020.

Towards Automated Neural Interaction Discovery for Click-Through Rate Prediction, KDD 2020.

AutoGroup: Automatic Feature Grouping for Modelling Explicit High-Order Feature Interactions in CTR Prediction, SIGIR 2020.

Cognitive Evolutionary Search to Select Feature Interactions for Click-Through Rate Prediction, KDD 2023.

AdnFM: An Attentive DenseNet based Factorization Machine for Click-Through-Rate Prediction, ICCDE 2022.

CAN: Feature Co-Action Network for Click-Through Rate Prediction, WSDM 2022.

Enhancing Explicit and Implicit Feature Interactions via Information Sharing for Parallel Deep CTR Models, DLP-KDD 2021.

FINAL: Factorized Interaction Layer for CTR Prediction, SIGIR 2023.

AdaFS: Adaptive Feature Selection in Deep Recommender System, KDD 2022.

LPFS: Learnable Polarizing Feature Selection for Click-Through Rate Prediction, 2022.

Optimizing Feature Set for Click-Through Rate Prediction, WWW 2023.

Automatic Feature Selection By One-Shot Neural Architecture Search In Recommendation Systems, WWW 2023.

Catch: Collaborative Feature Set Search for Automated Feature Engineering, WWW 2023.

Experience summary: https://blog.csdn.net/nihaomafb/article/details/133242598

1.2. Recommender systems: frontier papers on "Recall":

Large Scale Product Graph Construction for Recommendation in E-commerce, 2020.

KGAT: Knowledge Graph Attention Network for Recommendation, KDD 2019.

Multi-Interest Network with Dynamic Routing for Recommendation at Tmall, 2019.

Controllable Multi-Interest Framework for Recommendation, KDD 2020.

Sparse-Interest Network for Sequential Recommendation, WSDM 2021.

Multi-task Learning Model based on Multiple Characteristics and Multiple Interests for CTR prediction, 2022.

SDM: Sequential Deep Matching Model for Online Large-scale Recommender System, CIKM 2019.

Search-based User Interest Modeling with Lifelong Sequential Behavior Data for Click-Through Rate Prediction, 2020.

End-to-End User Behavior Retrieval in Click-Through Rate Prediction Model, 2021.

Learning Tree-based Deep Model for Recommender Systems, KDD 2018.

DemiNet: Dependency-Aware Multi-Interest Network with Self-Supervised Graph Learning for Click-Through Rate Prediction, AAAI 2022.

MISS: Multi-Interest Self-Supervised Learning Framework for Click-Through Rate Prediction, ICDE 2022.

Path-based Deep Network for candidate item matching in recommenders, SIGIR 2021.

Sampling-bias-corrected neural modeling for large corpus item recommendations, RS 2019.

Experience summary: https://blog.csdn.net/nihaomafb/article/details/133249562

1.3. Recommender systems: frontier papers on "Ranking":

A Survey on User Behavior Modeling in Recommender Systems, 2023.

Deep Interest Network for Click-Through Rate Prediction, KDD 2018.

DIEN: Deep Interest Evolution Network for Click-Through Rate Prediction, AAAI 2018.

SASRec: Self-attentive Sequential Recommendation, ICDM 2018.

BSTransformer: Behavior Sequence Transformer for E-commerce Recommendation in Alibaba, 2019.

Deep Session Interest Network for Click-Through Rate Prediction, IJCAI 2019.

Learning to Retrieve User Behaviors for Click-through Rate Estimation, TOIS 2023.

Practice on long sequential user behavior modeling for click-through rate prediction, KDD 2019.

Lifelong sequential modeling with personalized memorization for user response prediction, SIGIR 2019.

Sparse Attentive Memory Network for Click-through Rate Prediction with Long Sequences, CIKM 2022.

Search-based User Interest Modeling with Lifelong Sequential Behavior Data for Click-Through Rate Prediction, 2020.

End-to-End User Behavior Retrieval in Click-Through Rate Prediction Model, 2021.

Sampling Is All You Need on Modeling Long-Term User Behaviors for CTR Prediction, CIKM 2022.

Adversarial Filtering Modeling on Long-term User Behavior Sequences for Click-Through Rate Prediction, SIGIR 2022.

TWIN: TWo-stage Interest Network for Lifelong User Behavior Modeling in CTR Prediction at Kuaishou, KDD 2023.

Divide and Conquer: Towards Better Embedding-based Retrieval for Recommender Systems from a Multi-task Perspective, WWW 2023.

Denoising Self-Attentive Sequential Recommendation, RS 2022.

Hierarchically Fusing Long and Short-Term User Interests for Click-Through Rate Prediction in Product Search, CIKM 2022.

Rethinking Personalized Ranking at Pinterest: An End-to-End Approach, RS 2022.

Page-Wise Personalized Recommendations in an Industrial e-Commerce Setting, RS 2022.

MTBRN: Multiplex Target-Behavior Relation Enhanced Network for Click-Through Rate Prediction, CIKM 2020.

Multi-Scale User Behavior Network for Entire Space Multi-Task Learning, CIKM 2022.

Dynamic Multi-Behavior Sequence Modeling for Next Item Recommendation, AAAI 2023.

Hierarchical Projection Enhanced Multi-behavior Recommendation, KDD 2023.

Beyond Matching: Modeling Two-Sided Multi-Behavioral Sequences for Dynamic Person-Job Fit, DASFAA 2021.

Deep Position-wise Interaction Network for CTR Prediction, SIGIR 2021.

AutoDebias: Learning to Debias for Recommendation, SIGIR 2021.

Unbiased Learning to Rank: Online or Offline?, TOIS 2020.

Fair pairwise learning to rank, 2020.

CAM2: Conformity-Aware Multi-Task Ranking Model for Large-Scale Recommender Systems, WWW 2023.

Entire Space Cascade Delayed Feedback Modeling for Effective Conversion Rate Prediction, CIKM 2023.

ESMC: Entire Space Multi-Task Model for Post-Click Conversion Rate via Parameter Constraint, 2023.

Click-Conversion Multi-Task Model with Position Bias Mitigation for Sponsored Search in eCommerce, SIGIR 2023.

DCMT: A Direct Entire-Space Causal Multi-Task Framework for Post-Click Conversion Estimation, ICDE 2023.

Scenario-Adaptive Feature Interaction for Click-Through Rate Prediction, KDD 2023.

OptMSM: Optimizing Multi-Scenario Modeling for Click-Through Rate Prediction, 2023.

Leaving No One Behind: A Multi-Scenario Multi-Task Meta Learning Approach for Advertiser Modeling, WSDM 2022.

M5: Multi-Modal Multi-Interest Multi-Scenario Matching for Over-the-Top Recommendation, KDD 2023.

Automatic Expert Selection for Multi-Scenario and Multi-Task Search, SIGIR 2022.

Continual Transfer Learning for Cross-Domain Click-Through Rate Prediction at Taobao, WWW 2023.

Cross-domain Augmentation Networks for Click-Through Rate Prediction, 2023.

One Model to Serve All: Star Topology Adaptive Recommender for Multi-Domain CTR Prediction, CIKM 2021.

HiNet: Novel Multi-Scenario & Multi-Task Learning with Hierarchical Information Extraction, ICDE 2023.

Multi-Faceted Hierarchical Multi-Task Learning for Recommender Systems, CIKM 2022.

Large Scale Product Graph Construction for Recommendation in E-commerce, 2020.

Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations, RS 2020.

AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations, KDD 2023.

Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate, SIGIR 2018.

Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning in Targeted Display Advertising, KDD 2021.

Advances and Challenges of Multi-task Learning Method in Recommender System: A Survey, 2023.

Multi-Objective Recommender Systems: Survey and Challenges, RS 2022.

Optimizing Airbnb Search Journey with Multi-task Learning, KDD 2023.

A Contrastive Sharing Model for Multi-Task Recommendation, WWW 2022.

Adaptive Pattern Extraction Multi-Task Learning for Multi-Step Conversion Estimations, 2023.

MSSM: A Multiple-level Sparse Sharing Model for Efficient Multi-Task Learning, SIGIR 2021.

STEM: Unleashing the Power of Embeddings for Multi-task Recommendation, 2023.

Multi-Task Recommendations with Reinforcement Learning, WWW 2023.

Hierarchically Modeling Micro and Macro Behaviors via Multi-Task Learning for Conversion Rate Prediction, SIGIR 2021.

MNCM: Multi-level Network Cascades Model for Multi-Task Learning, CIKM 2022.

Prototype Feature Extraction for Multi-task Learning, WWW 2022.

Fast Greedy MAP Inference for Determinantal Point Process to Improve Recommendation Diversity, NIPS 2018.

Neural Re-ranking in Multi-stage Recommender Systems: A Review, 2022.

Generative Flow Network for Listwise Recommendation, KDD 2023.

Context-aware Reranking with Utility Maximization for Recommendation, 2022.

Revisit Recommender System in the Permutation Prospective, 2021.

Entire Cost Enhanced Multi-Task Model for Online-to-Offline Conversion Rate Prediction, 2022.

GRN: Generative Rerank Network for Context-wise Recommendation, 2021.

PEAR: Personalized Re-ranking with Contextualized Transformer for Recommendation, WWW 2022.

Personalized Diversification for Neural Re-ranking in Recommendation, ICDE 2023.

Multi-Level Interaction Reranking with User Behavior History, SIGIR 2022.

Slate-Aware Ranking for Recommendation, WSDM 2023.

RankFormer: Listwise Learning-to-Rank Using Listwide Labels, KDD 2023.

PIER: Permutation-Level Interest-Based End-to-End Re-ranking Framework in E-commerce, KDD 2023.

Multi-factor Sequential Re-ranking with Perception-Aware Diversification, KDD 2023.

APG: Adaptive Parameter Generation Network for Click-Through Rate Prediction, NIPS 2022.

AutoFAS: Automatic Feature and Architecture Selection for Pre-Ranking System, 2022.

NAS-CTR: Efficient Neural Architecture Search for Click-Through Rate Prediction, SIGIR 2022.

Controllable Multi-Objective Re-ranking with Policy Hypernetworks, KDD 2023.

Improving Training Stability for Multitask Ranking Models in Recommender Systems, KDD 2023.

Iterative Boosting Deep Neural Networks for Predicting Click-Through Rate, 2020.

DHEN: A Deep and Hierarchical Ensemble Network for Large-Scale Click-Through Rate Prediction, KDD 2022.

AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for Click-Through Rate Prediction, 2022.

Multi-Task Deep Recommender Systems: A Survey, 2023.

Expressive user embedding from churn and recommendation multi-task learning, WWW 2023.

PEPNet: Parameter and Embedding Personalized Network for Infusing with Personalized Prior Information, KDD 2023.

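2. I implemented 25 classic recommendation / CTR models. The code works out of the box, though you will need to tune it further. My environment is a Mac M1 with Python 3.9, and all code was tested locally. This repository will continue to be updated.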

I have implemented commonly used CTR / recommender models for reuse. The 25 models are as follows:

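2.1. Four common machine-learning ensemble models: Random Forest, XGBoost, LightGBM, and CatBoost, with hyperparameter tuning via hyperopt and bayesian-optimization. (This part calls the sklearn package and the corresponding Python packages.)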

2.2. Five basic models: Matrix Factorization (MF), SVD, Factorization Machine (FM), NeuralCF (WWW 2017), AutoencoderRec (WWW 2015).

2.3. Eight deep network models: DeepFM (IJCAI 2017), DSSM (CIKM 2013), Wide & Deep (RS 2016), DeepCross (DCN, KDD 2016), Attentional Factorization Machine (AFM, IJCAI 2017), Product-based Neural Network (PNN, ICDM 2016), Neural Factorization Machine (NFM, SIGIR 2017), FiBiNET (RS 2019).

2.4. Five sequential recommendation models: GRU4Rec (ICLR 2016), Deep Interest Network (DIN, KDD 2018), Deep Interest Evolution Network (DIEN, AAAI 2018), Self-attentive Sequential Recommendation (SASRec, ICDM 2018), Behavior Sequence Transformer (BSTransformer, 2019).

2.5. Three multi-interest models: Multi-interest network with dynamic routing (MIND, 2019), Controllable Multi-Interest Framework for Recommendation (Comirec, KDD 2020), Sparse-Interest Network (SINE, WSDM 2021).

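These models address the problem that a single user interest vector can hardly capture a user's many interests (especially when they must be extracted from a long behavior history); instead, they derive several interest vectors from the user's historical behavior sequence. When the history is short (<50 items), conventional sequence models such as GRU or attention-based sequence models are sufficient. When the history is long, efficiency has to be considered, for example by using the target item to retrieve similar history items and modeling only that subsequence. Modeling multiple interests resembles clustering: capsule networks, multiple selectable pathways (such as the top-k activated interests), and similar techniques activate one or several pathways (that is, interest points) at a time, in either a hard manner or a soft manner (such as attention).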
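The two ideas in the paragraph above, target-item retrieval over a long history and hard versus soft interest selection, can be sketched roughly as follows. This is a minimal, dependency-free illustration in plain Python rather than the repository's PyTorch code; the function names (`retrieve_top_k`, `select_interest`), the dot-product similarity, and the toy vectors are all assumptions made for this example.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def retrieve_top_k(history, target, k):
    """Keep the k history items most similar to the target, in original order."""
    ranked = sorted(range(len(history)),
                    key=lambda i: dot(history[i], target), reverse=True)
    return [history[i] for i in sorted(ranked[:k])]


def select_interest(interests, target, mode="hard"):
    """Turn K interest vectors into one user vector for a given target item."""
    scores = [dot(v, target) for v in interests]
    if mode == "hard":
        # Hard: activate only the single best-matching interest.
        best = max(range(len(interests)), key=lambda i: scores[i])
        return list(interests[best])
    # Soft: attention-weighted sum over all interests.
    weights = softmax(scores)
    return [sum(w * v[d] for w, v in zip(weights, interests))
            for d in range(len(target))]


# Toy embeddings, made up for illustration only.
history = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.2], [0.1, 0.9]]
interests = [[1.0, 0.0], [0.0, 1.0]]
target = [0.9, 0.1]

short_history = retrieve_top_k(history, target, k=2)
hard_vec = select_interest(interests, target, mode="hard")
soft_vec = select_interest(interests, target, mode="soft")
```

In a real model the interest vectors would come from a learned module (for example capsule routing or self-attention) rather than being fixed, and the retrieval step would typically use an approximate index instead of a full sort.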

2.6. Four multi-task learning models: Entire-space multi-task model (ESSM, SIGIR 2020), Multi-gate Mixture-of-Experts (MMOE, KDD 2018), Customized Gate Control (CGC, RS 2020), Audience Multi-step Conversions with Multi-task Learning (AITM, KDD 2021).

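Multi-task learning is generally one of the approaches that yields relatively large gains in practice. Find related tasks in your scenario and design the shared structure around the relationships between them. There is a lot of design space here: the shared module can be the bottom embedding layer, a middle layer, or a high-level layer; the degree of sharing can vary; the ratio between the losses of different tasks has to be chosen; sampling efficiency can be improved; and sharing can be hard or soft. One point to note: design the architecture according to how strongly the tasks are correlated, so that negative transfer is avoided.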
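As a rough illustration of soft sharing and loss weighting, the sketch below mixes shared expert outputs with one gate per task (in the spirit of MMOE) and combines per-task losses with fixed weights. It is plain Python rather than the repository's PyTorch code; the names `mix_experts` and `weighted_loss`, the task names, and all numbers are assumptions for this example, and in a real model the gate logits would be produced by a learned layer from the input features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_experts(expert_outputs, gate_logits):
    """Soft sharing: a task-specific gate weights the shared expert outputs."""
    weights = softmax(gate_logits)
    dim = len(expert_outputs[0])
    return [sum(w * e[d] for w, e in zip(weights, expert_outputs))
            for d in range(dim)]


def weighted_loss(task_losses, task_weights):
    """Combine per-task losses using a fixed ratio between tasks."""
    return sum(task_weights[name] * loss for name, loss in task_losses.items())


# Two shared experts and two hypothetical tasks (values made up).
experts = [[1.0, 0.0], [0.0, 1.0]]
gates = {"ctr": [0.0, 0.0], "cvr": [2.0, 0.0]}

# Each task receives its own mixture of the same experts.
task_inputs = {name: mix_experts(experts, logits)
               for name, logits in gates.items()}

total = weighted_loss({"ctr": 0.4, "cvr": 0.2}, {"ctr": 1.0, "cvr": 0.5})
```

Giving every task the same gate logits reduces this to a single shared representation, which is closer to hard sharing; letting the gates differ is what allows weakly related tasks to rely on different experts and limits negative transfer.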

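Judging from papers published by major tech companies in recent years, work concentrates on mining users' ultra-long behavior sequences (balancing efficiency and effectiveness, used for fine ranking), multi-interest preferences (used for recall), and multi-task learning (design of the model's sharing structure, mainly used for fine ranking). Papers on feature engineering (feature discretization and feature interaction) are relatively few.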

Evolution of recommendation models

AutoRec - Autoencoders Meet Collaborative Filtering, WWW 2015.

Factorization Machines: Fast Context-aware Recommendations with Factorization Machines, SIGIR 2011.

DSSM: Learning deep structured semantic models for web search using clickthrough data, CIKM 2013.

DSSM

NeuralCF: Neural Collaborative Filtering, WWW 2017.

NeuralCF

Wide&Deep: Wide & Deep Learning for Recommender Systems, RS 2016.

Wide Deep

DeepFM: DeepFM: A Factorization-Machine based Neural Network for CTR Prediction, IJCAI 2017.

DeepFM

DeepCross: Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features, KDD 2016.

DeepCross (DCN model)

AFM: Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks, IJCAI 2017.

AFM-2

NFM: Neural Factorization Machines for Sparse Predictive Analytics, SIGIR 2017.

NFM

FiBiNET: FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction, RS 2019.

FiBiNET

PNN: Product-based Neural Networks for User Response Prediction, ICDM 2016.

PNN

GRU4Rec: Session-based Recommendations with Recurrent Neural Networks, ICLR 2016.

GRU4REC

Caser: Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding, WSDM 2018.

Caser

DIN: Deep Interest Network for Click-Through Rate Prediction, KDD 2018.

DIN

DIEN: Deep Interest Evolution Network for Click-Through Rate Prediction, AAAI 2018.

DIEN

SASRec: Self-attentive Sequential Recommendation, ICDM 2018.

SASRec

BSTransformer: Behavior Sequence Transformer for E-commerce Recommendation in Alibaba, 2019.

BST

MIND: Multi-Interest Network with Dynamic Routing for Recommendation at Tmall, 2019.

MIND

Comirec: Controllable Multi-Interest Framework for Recommendation, KDD 2020.

Comirec

SINE: Sparse-Interest Network for Sequential Recommendation, WSDM 2021.

SINE

ESSM: Entire Space Multi-task Modeling via Post-Click Behavior Decomposition for Conversion Rate Prediction, SIGIR 2020.

ESSM

MMOE: Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts, KDD 2018.

MMOE

CGC: Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations, RS 2020.

CGC

AITM: Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning in Targeted Display Advertising, KDD 2021.

AITM

The project is ongoing.