ConnorJL/GPT2

An implementation of training for GPT2, supports TPUs

Python · MIT

Issues

Question about the metric reported in the paper?
#38 opened 2 years ago by dsj96
0
create_tfrecords.py: Dealing with problems with your own data set
#36 opened 3 years ago by xsyzka
0
where is the length of the forecast article set? Thank you!
#35 opened 3 years ago by xsyzka
0
Samples?
#34 opened 4 years ago by sleepinyourhat
0
Training 1.5B?
#33 opened 4 years ago by JulesGM
0
Retraining a new model, only gpu 0 can be used
#32 opened 4 years ago by yds1024
1
Error on output
#31 opened 4 years ago by silicahd
1
GPT vs BERT, under same computation and data resource, which one is better for downstream tasks like GLUE?
#30 opened 4 years ago by guotong1988
0
117M/model.ckpt.index is corrupted?
#29 opened 4 years ago by ksjae
0
format dataset
#13 opened 6 years ago by khaerulumam42
6
character-level
#28 opened 4 years ago by amacfie
1
about encoder.json
#27 opened 5 years ago by fnyhy
4
I figured out how to cram GPT-2 1.5B onto a single TPU core with Adam optimizer
#23 opened 5 years ago by shawwn
3
DOCKER: Web interface doesn't work
#26 opened 5 years ago by fartwhif
0
Docker documentation for CUDA
#25 opened 5 years ago by fartwhif
0
Training on artificial language data (server logs, medical records, etc.)
#24 opened 5 years ago by klimentij
1
How can I create a smaller-sized file for inference of the 1.5B model
#22 opened 5 years ago by pragnakalpdev6
1
Are there some research papers about text-to-set generation?
#21 opened 5 years ago by guotong1988
1
error when using create_tfrecords.py
#20 opened 5 years ago by CrackerHax
3
Your 1.5B model
#19 opened 5 years ago by 4R7I5T
2
when reading metadata of gs://openwebtext/stuff/encoder/encoder.json
#18 opened 5 years ago by makamkkumar
1
Downloading Encoder Model fails
#16 opened 5 years ago by PickHub
2
Training problem
#15 opened 5 years ago by DrYangLiu
1
Input is Chinese, the predicted output is Japanese.
#5 opened 6 years ago by dpyneo
9
quirks that hold the model back
#11 opened 6 years ago by murpen
4
Why can GPT-2 be applied to other tasks without fine-tuning?
#14 opened 6 years ago by guotong1988
2
Predicting with PrettyBigModel `InvalidArgumentError: indices[0,0] = 1024 is not in [0, 1024)`
#4 opened 6 years ago by pkmital
5
what's the difference between sample and sample_free?
#12 opened 6 years ago by Tianyu00
1
Does training my model mean fine-tuning or retraining a model?
#10 opened 6 years ago by wjy979769265
4
A meaningful performance comparison with OpenAI's models
#3 opened 6 years ago by lostmsu
5
Has anyone managed to work it on Windows? Which OS did you use to make it work?
#6 opened 6 years ago by FurkanGozukara
2
How to process raw text files to create similar "PrettyBig" model?
#2 opened 6 years ago by GenTxt
5
Unable to predict with bfloat16 model
#1 opened 6 years ago by kizinfo
2