IntelLabs/academic-budget-bert
Repository containing code for the paper "How to Train BERT with an Academic Budget"
Python · Apache-2.0
Issues
Grad overflow and null validation loss
#33 opened by NewDriverLee · 4 comments

Which vocabulary file need to use?
#32 opened by NewDriverLee · 1 comment
The file produced by process_data.py is empty
#28 opened by Richar-Du · 10 comments
What is the size of the processed data?
#24 opened by leoozy · 1 comment

the eval_acc on RTE dataset is only 55%
#27 opened by leoozy · 11 comments

GLUE results not reproducible
#18 opened by lumliolum · 3 comments

Distributed pretraining dataset question
#22 opened by sangmichaelxie · 1 comment

Finetuning commands for other glue tasks
#25 opened by raghavlite · 10 comments

Unable to train a roberta model?
#8 opened by dseddah · 1 comment

only test_shard_*.hdf5
#21 opened by shizhediao · 1 comment

GLUE dev results
#17 opened by BaohaoLiao · 1 comment

Question: Easiest way to load deepspeed checkpoints as standard PyTorch models?
#16 opened by QuintinPope · 4 comments
bert_model not used
#15 opened by senisioi · 1 comment

Which versions for pre-training?
#14 opened by marcelbra · 2 comments
Unable to run_glue
#10 opened by Rotendahl · 6 comments

Question about validation and testing
#5 opened by peerdavid · 2 comments

Any plan for releasing the checkpoints?
#2 opened by gaotianyu1350 · 2 comments

Code release date?
#1 opened by RyanHuangNLP