Issues
The exact English and Chinese pretraining data that are the same as the BERT paper's pretraining data.
#11 opened by guotong1988
Is Pre-training-Using-Knowledge-Distillation better than Pre-training-Only for downstream tasks?
#10 opened by guotong1988
Mask-Filling with pretrained BORT
#9 opened by patrickvonplaten
How to train the model on another language?
#8 opened by Archelunch
Huggingface support
#4 opened by sbsky
BORT pretraining
#6 opened by nicexw
I couldn't understand the model's configuration. Can someone please clarify?
#3 opened by preethamgali
Can't download model.
#1 opened by hardfish82
Can't download model again!
#5 opened by killua-zyk