RecBole-PJF

RecBole-PJF is a library built upon PyTorch and RecBole for reproducing and developing person-job fit algorithms. Our library includes algorithms covering three major categories:

CF-based models make recommendations based on collaborative filtering;
Content-based models make recommendations mainly based on text matching;
Hybrid models make recommendations based on both interactions and content.

Highlights

A unified framework for different methods, including collaborative, content-based, and hybrid methods;
Evaluation from two perspectives, candidates and employers, which previous frameworks do not offer;
Easy extension of person-job fit models, with multiple input interfaces for both interaction and text data. The library shares the same unified API and input format (atomic files) as RecBole.
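To illustrate what evaluating from two perspectives can mean, here is a minimal sketch in plain Python. It is not the library's evaluator: the function `recall_at_k`, the score dictionary, and the toy identifiers are all hypothetical. The same predicted scores are ranked once per candidate (over jobs) and once per job (over candidates), and the two averages can differ.

```python
from collections import defaultdict

def recall_at_k(scores, positives, k, perspective):
    """Average recall@k over candidates or over jobs (illustrative only).

    scores: dict mapping (candidate, job) -> predicted match score
    positives: set of (candidate, job) pairs that are true matches
    perspective: "candidate" ranks jobs for each candidate,
                 "employer" ranks candidates for each job
    """
    axis = 0 if perspective == "candidate" else 1
    ranked = defaultdict(list)   # query -> [(score, target), ...]
    relevant = defaultdict(set)  # query -> set of true targets
    for pair, score in scores.items():
        ranked[pair[axis]].append((score, pair[1 - axis]))
    for pair in positives:
        relevant[pair[axis]].add(pair[1 - axis])

    recalls = []
    for query, truth in relevant.items():
        top = sorted(ranked[query], key=lambda x: x[0], reverse=True)[:k]
        hits = sum(1 for _, target in top if target in truth)
        recalls.append(hits / len(truth))
    return sum(recalls) / len(recalls) if recalls else 0.0

# Toy data (hypothetical): two candidates, two jobs.
scores = {
    ("c1", "j1"): 0.9, ("c1", "j2"): 0.2,
    ("c2", "j1"): 0.8, ("c2", "j2"): 0.7,
}
positives = {("c1", "j1"), ("c2", "j2")}

candidate_view = recall_at_k(scores, positives, k=1, perspective="candidate")
employer_view = recall_at_k(scores, positives, k=1, perspective="employer")
print(candidate_view, employer_view)
```

In this toy case the candidate-side recall@1 is lower than the employer-side one, because candidate c2's top-ranked job is not its true match while each job's top-ranked candidate is.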

Requirements

recbole>=1.0.0
pytorch>=1.7.0
python>=3.7.0
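A quick way to check an environment against these minimums is sketched below. The helper names `version_tuple` and `meets_minimum` are hypothetical and only compare the leading numeric part of a version string; note that the PyTorch package is imported as `torch`.

```python
import re

def version_tuple(version):
    """Turn a string such as '1.7.0+cu110' into (1, 7, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())

def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)

# Minimums taken from the list above, keyed by import name.
MINIMUMS = {"recbole": "1.0.0", "torch": "1.7.0"}

print(meets_minimum("1.7.0+cu110", MINIMUMS["torch"]))
print(meets_minimum("0.2.1", MINIMUMS["recbole"]))
```

In a real environment you would pass each package's reported version string (for example `torch.__version__`) instead of the literals used here.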

Implemented Models

We list currently supported models according to category:

CF-based Model (the following are examples; these models are already implemented in RecBole and are used here directly):

BPR from Rendle et al.: BPR: Bayesian Personalized Ranking from Implicit Feedback (UAI 2009).
NeuMF from He et al.: Neural Collaborative Filtering (WWW 2017).
LightGCN from He et al.: LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation (SIGIR 2020).
LFRR from Neve et al.: Latent factor models and aggregation operators for collaborative filtering in reciprocal recommender systems (RecSys 2019).

Content-based Model:

PJFNN from Zhu et al.: Person-job fit: Adapting the right talent for the right job with joint representation learning (TMIS 2018).
BPJFNN from Qin et al.: Enhancing person-job fit for talent recruitment: An ability-aware neural network approach (SIGIR 2018).
APJFNN from Qin et al.: Enhancing person-job fit for talent recruitment: An ability-aware neural network approach (SIGIR 2018).
BERT: a twin tower model with a text encoder using BERT.

Hybrid Model:

IPJF from Le et al.: Towards effective and interpretable person-job fitting (CIKM 2019).
PJFFF from Jiang et al.: Learning Effective Representations for Person-Job Fit by Feature Fusion (CIKM 2020).
SHPJF from Hou et al.: Leveraging Search History for Improving Person-Job Fit (DASFAA 2022).

Dataset and Quick-Start

zhilian from the TIANCHI data contest.
kaggle from the Kaggle Job Recommendation Case Study.

We provide processing scripts in the corresponding folders (e.g. /dataset/zhilian/). To run experiments with these two datasets, first download the source files, then run the processing script to convert them to atomic files. For example, for zhilian:

cd dataset/zhilian
python prepare_zhilian.py
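For readers unfamiliar with atomic files, the sketch below writes a small interaction file in the RecBole style: a tab-separated text file whose header names each column as `field:type`. It is not the contents of prepare_zhilian.py. The helper `write_inter`, the field names, and the sample rows are hypothetical, and the files the real script produces may contain different columns.

```python
import csv
import io

def write_inter(records, fields, out):
    """Write interaction records as a RecBole-style atomic file.

    fields: list of (name, type) pairs, e.g. ("user_id", "token")
    records: iterable of dicts keyed by field name
    out: a writable text stream
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # Header row: each column is declared as "name:type".
    writer.writerow(f"{name}:{ftype}" for name, ftype in fields)
    for record in records:
        writer.writerow(record[name] for name, _ in fields)

# Hypothetical schema and rows, for illustration only.
fields = [("user_id", "token"), ("item_id", "token"), ("label", "float")]
records = [
    {"user_id": "u1", "item_id": "j7", "label": 1},
    {"user_id": "u2", "item_id": "j3", "label": 0},
]

buffer = io.StringIO()
write_inter(records, fields, buffer)
atomic_text = buffer.getvalue()
print(atomic_text)
```

Writing to an in-memory buffer keeps the example free of side effects; passing an open file handle instead would produce a file on disk.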

With the source code, you can use the provided script to try out the library:

python run_recbole_pjf.py

To change the model or dataset, pass additional command-line parameters:

python run_recbole_pjf.py -m [model] -d [dataset]
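As a rough picture of how a launcher script might read these two options, here is a minimal argparse sketch. It is an assumption about the general shape, not the actual contents of run_recbole_pjf.py: the function `parse_cli` and the default values are hypothetical, and only the `-m` and `-d` flags come from the command above.

```python
import argparse

def parse_cli(argv=None):
    """Parse the -m (model) and -d (dataset) options."""
    parser = argparse.ArgumentParser(description="Run a person-job fit experiment")
    # Defaults below are placeholders chosen for this sketch.
    parser.add_argument("-m", dest="model", default="BPR", help="model name")
    parser.add_argument("-d", dest="dataset", default="zhilian", help="dataset name")
    return parser.parse_args(argv)

args = parse_cli(["-m", "PJFNN", "-d", "kaggle"])
print(args.model, args.dataset)
```

Passing an explicit argument list makes the parser easy to exercise without touching `sys.argv`; calling `parse_cli()` with no arguments reads the real command line.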

Hyper-tuning

We tuned the hyper-parameters of the implemented models in each category and release the search ranges for reference.

For a fair comparison, we set embedding_size to 128 for all models and tuned the other parameters.

zhilian

Model	Best Parameters	Parameter Range
BPRMF	learning_rate = 1e-3	learning_rate in [1e-3, 1e-4, 1e-5]
NCF	learning_rate = 1e-3, mlp_hidden_size = [64]	learning_rate in [1e-3, 1e-4, 1e-5], mlp_hidden_size in [[64], [64, 32], [64, 32, 16]]
LightGCN	learning_rate = 1e-4, n_layers = 3	learning_rate in [1e-3, 1e-4, 1e-5], n_layers in [2, 3, 4]
LFRR	learning_rate = 1e-4	learning_rate in [1e-3, 1e-4, 1e-5]
BERT	learning_rate = 1e-3	learning_rate in [1e-3, 1e-4, 1e-5]
PJFNN	learning_rate = 1e-3, max_sent_num = 20, max_sent_len = 20	learning_rate in [1e-3, 1e-4, 1e-5], max_sent_num in [10, 20, 30], max_sent_len in [10, 20, 30]
BPJFNN	learning_rate = 1e-3, max_sent_num = 20, max_sent_len = 20, hidden_size = 64	learning_rate in [1e-3, 1e-4, 1e-5], max_sent_num in [10, 20, 30], max_sent_len in [10, 20, 30], hidden_size in [64, 32]
APJFNN	learning_rate = 1e-3, num_layers = 1, hidden_size = 32	learning_rate in [1e-3, 1e-4, 1e-5], num_layers in [1, 2], hidden_size in [32, 64]
PJFFF-BERT	learning_rate = 1e-4, hidden_size = 32, history_item_len = 20	learning_rate in [1e-3, 1e-4, 1e-5], hidden_size in [32, 64], history_item_len in [20, 50]
IPJF-BERT	learning_rate = 1e-3, max_sent_num = 20, max_sent_len = 30	learning_rate in [1e-3, 1e-4, 1e-5], max_sent_num in [10, 20, 30], max_sent_len in [10, 20, 30]
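The parameter ranges in the table can be expanded into individual trial configurations as sketched below, using the LightGCN row as an example. This assumes an exhaustive grid, which the table does not state; the helper `expand_grid` is hypothetical and is not part of RecBole or RecBole-PJF.

```python
from itertools import product

def expand_grid(search_space):
    """Yield one config dict per combination of the listed values."""
    names = list(search_space)
    for values in product(*(search_space[name] for name in names)):
        yield dict(zip(names, values))

# Ranges copied from the LightGCN row of the table.
lightgcn_space = {
    "learning_rate": [1e-3, 1e-4, 1e-5],
    "n_layers": [2, 3, 4],
}

# embedding_size is held fixed for every model, as stated above the table.
fixed = {"embedding_size": 128}
trials = [{**fixed, **config} for config in expand_grid(lightgcn_space)]

print(len(trials))
print(trials[0])
```

Each resulting dictionary could then be passed to a training run as a set of overrides; how the reported best parameters were actually selected is not specified here.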

The Team

RecBole-PJF is developed and maintained by members of RUCAIBox. The main developers are Chen Yang (@flust), Yupeng Hou (@hyp1231), and Shuqing Bian (@fancybian).

Acknowledgement

The implementation is based on the open-source recommendation library RecBole.

Please cite the following paper if you use our code or processed datasets.

@inproceedings{zhao2021recbole,
  title={Recbole: Towards a unified, comprehensive and efficient framework for recommendation algorithms},
  author={Wayne Xin Zhao and Shanlei Mu and Yupeng Hou and Zihan Lin and Kaiyuan Li and Yushuo Chen and Yujie Lu and Hui Wang and Changxin Tian and Xingyu Pan and Yingqian Min and Zhichao Feng and Xinyan Fan and Xu Chen and Pengfei Wang and Wendi Ji and Yaliang Li and Xiaoling Wang and Ji-Rong Wen},
  booktitle={{CIKM}},
  year={2021}
}
