Issues
- 0
Question about the metric reported in the paper?
#38 opened by dsj96 - 0
- 0
Samples?
#34 opened by sleepinyourhat - 0
Training 1.5B?
#33 opened by JulesGM - 1
Retraining a new model, only gpu 0 can be used
#32 opened by yds1024 - 1
Error on output
#31 opened by silicahd - 0
GPT vs BERT, under same computation and data resource, which one is better for downstream tasks like GLUE?
#30 opened by guotong1988 - 0
117M/model.ckpt.index is corrupted?
#29 opened by ksjae - 6
format dataset
#13 opened by khaerulumam42 - 1
character-level
#28 opened by amacfie - 4
about encoder.json
#27 opened by fnyhy - 3
- 0
DOCKER: Web interface doesn't work
#26 opened by fartwhif - 0
Docker documentation for CUDA
#25 opened by fartwhif - 1
- 3
error when using create_tfrecords.py
#20 opened by CrackerHax - 2
Your 1.5B model
#19 opened by 4R7I5T - 1
- 2
Downloading Encoder Model fails
#16 opened by PickHub - 1
Training problem
#15 opened by DrYangLiu - 9
Input Chinese, the predicted is Japanese.
#5 opened by dpyneo - 4
quirks that hold the model back
#11 opened by murpen - 2
- 5
Predicting with PrettyBigModel `InvalidArgumentError: indices[0,0] = 1024 is not in [0, 1024)`
#4 opened by pkmital - 1
- 2
Has anyone managed to work it on Windows? Which OS did you use to make it work?
#6 opened by FurkanGozukara - 5
- 2
Unable to predict with bfloat16 model
#1 opened by kizinfo