House-BEKE

CCF BDCI 2020 Real Estate Industry Chat Question-Answer Matching


Real estate industry chat question-answer (query-reply) matching.

Ranked 48/2985 on Leaderboard A.

Tricks

  • F1 threshold search (see the sketch after this list)
  • Each fold model uses the same threshold
  • Focal loss
  • GHM-C loss
  • Dice loss
  • FGM attack
  • PGD attack
  • K-fold voting (5/7/9 folds)
  • Fusion of multiple models
  • Pseudo labels [reference]
  • Optimizer choice
  • Multiple losses: accelerate convergence in the early stage and improve accuracy in the later stage.
  • Reverse query-reply
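
A minimal sketch of the F1 threshold search, assuming binary labels and sigmoid probabilities; the function and variable names are illustrative rather than the repo's actual API:

    import numpy as np
    from sklearn.metrics import f1_score

    def search_f1(y_true, y_prob, steps=100):
        """Sweep an evenly spaced grid of thresholds and keep the best F1."""
        best_t, best_f1 = 0.5, 0.0
        for t in np.linspace(0.01, 0.99, steps):
            f1 = f1_score(y_true, (y_prob >= t).astype(int))
            if f1 > best_f1:
                best_t, best_f1 = t, f1
        return best_t, best_f1

One way to realize "each fold model uses the same threshold" is to run the search once on the pooled out-of-fold predictions and apply that single threshold to every fold's test probabilities before voting.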

Methods

  • BERT
  • Double BERT + LSTM (two BERT encoders model the query and the reply separately, then a BiLSTM models the relation between them; see the sketch after this list)
  • LCF [link]
  • Semantic Role Labeling
  • Auxiliary task: identify whether the pair order is query-reply or reply-query
  • BERT + GCN
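
A minimal sketch of the "double BERT + LSTM" idea above, assuming PyTorch and Hugging Face transformers; the class and argument names are placeholders, not the repo's actual module:

    import torch
    import torch.nn as nn
    from transformers import BertModel

    class DoubleBertLSTM(nn.Module):
        def __init__(self, pretrained_name, hidden=256):
            super().__init__()
            self.query_bert = BertModel.from_pretrained(pretrained_name)
            self.reply_bert = BertModel.from_pretrained(pretrained_name)
            dim = self.query_bert.config.hidden_size
            self.lstm = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
            self.classifier = nn.Linear(2 * hidden, 1)

        def forward(self, q_ids, q_mask, r_ids, r_mask):
            q_vec = self.query_bert(q_ids, attention_mask=q_mask).pooler_output
            r_vec = self.reply_bert(r_ids, attention_mask=r_mask).pooler_output
            # Treat (query, reply) as a length-2 sequence so the BiLSTM can
            # model the interaction between the two pooled representations.
            pair = torch.stack([q_vec, r_vec], dim=1)   # (B, 2, H)
            out, _ = self.lstm(pair)                    # (B, 2, 2*hidden)
            return self.classifier(out[:, -1, :]).squeeze(-1)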

Pre-training language models

  • ERNIE 1.0
  • RoBERTa
  • BERT-wwm-ext
  • Further pre-training on the test dataset (see the sketch after this list)
  • Pre-training with QA data in the real estate field
  • Pre-training with QA data from other domains [Link1] [Link2]
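
A minimal sketch of the further (task-adaptive) pre-training step, assuming Hugging Face transformers and a line-per-sample text file built from the test set or in-domain QA data; paths and hyperparameters are placeholders:

    from transformers import (BertForMaskedLM, BertTokenizerFast,
                              DataCollatorForLanguageModeling,
                              LineByLineTextDataset, Trainer, TrainingArguments)

    tokenizer = BertTokenizerFast.from_pretrained('./pretrain_models/ERNIE')
    model = BertForMaskedLM.from_pretrained('./pretrain_models/ERNIE')

    # One raw text sample per line (queries and replies from the target domain).
    dataset = LineByLineTextDataset(tokenizer=tokenizer,
                                    file_path='domain_qa_corpus.txt',  # placeholder
                                    block_size=100)
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                               mlm=True, mlm_probability=0.15)

    args = TrainingArguments(output_dir='./pretrain_models/ERNIE-TAPT',
                             num_train_epochs=3, per_device_train_batch_size=32)
    Trainer(model=model, args=args, data_collator=collator,
            train_dataset=dataset).train()
    model.save_pretrained(args.output_dir)
    tokenizer.save_pretrained(args.output_dir)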

Data processing

  • Truncation
  • Cleaning
  • KFold
  • StratifiedKFold
  • GroupKFold (see the sketch of both splitters after this list)
  • Exchange query-reply pair order
  • Delete duplicate queries and emoji
  • Cluster analysis for queries & Multi-task learning
  • Back translation
  • Split replies that exceed the maximum length
  • LaserTagger [link]
  • Random masking
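
A short sketch contrasting the splitters listed above (scikit-learn API; the label and query_id column names are assumptions about the data layout). StratifiedKFold keeps each fold's label ratio close to the global ratio, while GroupKFold keeps every row sharing a group id in a single fold, so the same query never appears in both the train and validation splits:

    import pandas as pd
    from sklearn.model_selection import GroupKFold, StratifiedKFold

    train = pd.read_csv('train.csv')  # placeholder path

    # Label-stratified 7-fold split
    skf = StratifiedKFold(n_splits=7, shuffle=True, random_state=1000)
    for fold, (tr_idx, va_idx) in enumerate(skf.split(train, train['label'])):
        pass  # train one fold model here

    # Group 7-fold split: all replies of one query stay in the same fold
    gkf = GroupKFold(n_splits=7)
    for fold, (tr_idx, va_idx) in enumerate(
            gkf.split(train, train['label'], groups=train['query_id'])):
        pass  # train one fold model here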

Papers

  • SemBERT [link]
  • Explicit Contextual Semantics for Text Comprehension [link]

Analysis

  • Prediction results on the dev set
  • Prediction results on the test set

Submission history

  1. 0.77308091 | bert_spc | ERNIE | Search_f1 | StratifiedKFold 5-fold voting | BCE loss

    python train.py --model_name bert_spc --seed 1000 --bert_lr 2e-5 --num_epoch 3 --max_length 100 --cuda 3 --notsavemodel --log_step 20 --pretrained_bert_name ./pretrain_models/ERNIE

  2. 0.77179287015 | bert_spc | ERNIE | Search_f1 | StratifiedKFold 5-fold voting | GHMC loss

    python train.py --model_name bert_spc --seed 1000 --bert_lr 2e-5 --num_epoch 3 --max_length 100 --cuda 3 --notsavemodel --log_step 20 --pretrained_bert_name ./pretrain_models/ERNIE --criterion ghmc

  3. 0.77587103484 | bert_spc | ERNIE | Search_f1 | StratifiedKFold 5-fold voting | BCE loss | datareverse

    python train.py --model_name bert_spc --seed 1000 --bert_lr 2e-5 --num_epoch 3 --max_length 100 --cuda 3 --notsavemodel --log_step 20 --pretrained_bert_name ./pretrain_models/ERNIE --datareverse

  4. 0.77975341 | bert_spc | ERNIE | Search_f1 | StratifiedKFold 5-fold voting | BCE loss | FGM

    python train.py --model_name bert_spc --seed 1000 --bert_lr 2e-5 --num_epoch 4 --max_length 100 --cuda 3 --notsavemodel --log_step 20 --pretrained_bert_name ./pretrain_models/ERNIE --attack_type fgm --scheduler

  5. 0.77839228296 | bert_spc | ERNIE | Search_f1 | StratifiedKFold 5-fold voting | BCE loss | PGD

    python train.py --model_name bert_spc --seed 1000 --bert_lr 2e-5 --num_epoch 4 --max_length 100 --cuda 3 --notsavemodel --log_step 20 --pretrained_bert_name ./pretrain_models/ERNIE --attack_type pgd --scheduler

  6. 0.78033390298 | bert_spc | ERNIE | Search_f1 | StratifiedKFold 7-fold voting | BCE loss | FGM

    python train.py --model_name bert_spc --seed 1000 --bert_lr 2e-5 --num_epoch 3 --max_length 100 --cuda 3 --notsavemodel --log_step 20 --pretrained_bert_name ./pretrain_models/ERNIE --attack_type fgm --scheduler --cross_val_fold 7

  7. 0.78855493 | bert_spc | ERNIE-TAPT | Search_f1 | StratifiedKFold 7-fold voting | BCE loss | FGM

    python train.py --model_name bert_spc --seed 1000 --bert_lr 2e-5 --num_epoch 3 --max_length 100 --cuda 3 --notsavemodel --log_step 20 --pretrained_bert_name ./pretrain_models/ERNIE-TAPT --attack_type fgm --scheduler --cross_val_fold 7

  8. 0.78575129534 | bert_spc | ERNIE-TAPT | Search_f1 | StratifiedKFold 7-fold voting | BCE loss | FGM | datareverse

    python train.py --model_name bert_spc --seed 1000 --bert_lr 2e-5 --num_epoch 3 --max_length 100 --cuda 1 --notsavemodel --log_step 20 --pretrained_bert_name ./pretrain_models/ERNIE-TAPT --attack_type fgm --scheduler --cross_val_fold 7 --datareverse

  9. 0.77677520596 | bert_spc | ERNIE-TAPT | Search_f1 | BCE loss | FGM | order_predict

    python train.py --model_name bert_spc --seed 1000 --bert_lr 2e-5 --num_epoch 3 --max_length 100 --cuda 1 --notsavemodel --log_step 20 --pretrained_bert_name ./pretrain_models/ERNIE-TAPT --attack_type fgm --scheduler --cross_val_fold 5 --order_predict

  10. 0.78326013950 | bert_cap | ERNIE-TAPT | Search_f1 | BCE loss | FGM | batchsize=8 | diff_lr

    python train.py --model_name bert_cap --seed 1000 --bert_lr 2e-5 --num_epoch 3 --max_length 100 --cuda 3 --notsavemodel --log_step 20 --pretrained_bert_name ./pretrain_models/ERNIE-TAPT --attack_type fgm --scheduler --train_batch_size 8 --diff_lr

  11. 0.78496868476 | bert_spc | ERNIE-TAPT | Search_f1 | StratifiedKFold 7-fold voting | BCE loss | FGM | batchsize=32

    python train.py --model_name bert_spc --seed 1000 --bert_lr 2e-5 --num_epoch 3 --max_length 100 --cuda 3 --notsavemodel --log_step 20 --pretrained_bert_name ./pretrain_models/ERNIE-TAPT --attack_type fgm --scheduler --cross_val_fold 7 --train_batch_size 32

  12. 0.79014267185 | bert_spc | ERNIE-ALL-TAPT | Search_f1 | GroupKFold 7-fold voting | BCE loss | FGM

    python train.py --model_name bert_spc --seed 1000 --bert_lr 2e-5 --num_epoch 4 --max_length 100 --cuda 3 --notsavemodel --log_step 20 --pretrained_bert_name ./pretrain_models/ERNIE-ALL-TAPT --attack_type fgm --scheduler --cross_val_fold 7 --cv_type GroupKFold

  13. 0.79081172 | bert_spc | ERNIE-ALL-TAPT | Search_f1 | GroupKFold 7-fold voting | GHMC loss | FGM

    python train.py --model_name bert_spc --seed 1000 --bert_lr 2e-5 --num_epoch 4 --max_length 100 --cuda 2 --notsavemodel --log_step 20 --pretrained_bert_name ./pretrain_models/ERNIE-ALL-TAPT --attack_type fgm --scheduler --cross_val_fold 7 --cv_type GroupKFold --criterion ghmc
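
Most of the stronger entries above enable --attack_type fgm. A minimal sketch of FGM adversarial training on the word-embedding weights (the usual recipe; the class below is illustrative and not necessarily the repo's implementation):

    import torch

    class FGM:
        """Perturb embedding weights along the gradient direction, then restore."""
        def __init__(self, model, epsilon=1.0, emb_name='word_embeddings'):
            self.model, self.epsilon, self.emb_name = model, epsilon, emb_name
            self.backup = {}

        def attack(self):
            for name, param in self.model.named_parameters():
                if param.requires_grad and self.emb_name in name and param.grad is not None:
                    self.backup[name] = param.data.clone()
                    norm = torch.norm(param.grad)
                    if norm != 0 and not torch.isnan(norm):
                        param.data.add_(self.epsilon * param.grad / norm)

        def restore(self):
            for name, param in self.model.named_parameters():
                if name in self.backup:
                    param.data = self.backup[name]
            self.backup = {}

    # Per batch: loss.backward()      -> gradients on the clean input
    #            fgm.attack()         -> add the adversarial perturbation
    #            loss_adv.backward()  -> accumulate gradients on the perturbed input
    #            fgm.restore()        -> undo the perturbation, then optimizer.step()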


Licence

MIT