An implementation of OpenAI GPT
bash run.sh
Yes, just run the shell script and it works.
run.sh creates the virtual environment that GPT needs and installs all
the required software.
What about the pretrained model, config, and dataset needed for
fine-tuning?
All of them are downloaded and cached when run.sh runs.
In short, just run the shell script:
bash run.sh
Then you get the results.
If you need more detail, read the code.
Gaussian Error Linear Units
Chinese translation
Attention Is All You Need
Chinese translation
Improving Language Understanding by Generative Pre-Training
Chinese translation
Language Models are Unsupervised Multitask Learners
Chinese translation
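The first paper listed above, Gaussian Error Linear Units, defines the activation function used in GPT. As a rough illustration, here is a minimal sketch of the exact form and the tanh approximation given in that paper. Which variant this repository uses is not stated here, and the function names `gelu_exact` and `gelu_tanh` are chosen only for this example.

```python
import math


def gelu_exact(x: float) -> float:
    """GELU(x) = x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU from the GELU paper."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Compare the two forms on a few sample inputs.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  exact={gelu_exact(x):+.4f}  tanh={gelu_tanh(x):+.4f}")
```

The two forms stay close over this range, so the approximation is often preferred when only `tanh` is cheap to compute.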
The dataset used is ROCStories.
Memory: 31.4 GiB
Processor: Intel® Core™ i7-8700K CPU @ 3.70GHz × 12
Graphics: GeForce GTX 1080 Ti/PCIe/SSE2
OS type: 64-bit
Training takes 40 seconds per epoch and evaluation takes 23 seconds per epoch,
so it takes less than 3 minutes to get the results below.
show.py line:42 ***** Eval results *****
show.py line:44 eval_accuracy = 0.874933190807055
show.py line:44 eval_loss = 0.432198545569156
show.py line:44 train_loss = 2.201771383611565
Training takes 40 seconds per epoch and evaluation takes 23 seconds per epoch,
so it takes about 1 minute to get the results below.
show.py line:42 ***** Eval results *****
show.py line:44 eval_accuracy = 0.863174772848744
show.py line:44 eval_loss = 0.31887995107815814
show.py line:44 train_loss = 3.087455103540013
Training takes 40 seconds per epoch and evaluation takes 23 seconds per epoch,
so it takes about 7 minutes to get the results below.
show.py line:42 ***** Eval results *****
show.py line:44 eval_accuracy = 0.8786745056119722
show.py line:44 eval_loss = 0.5693538990389142
show.py line:44 train_loss = 1.2477980831749418
Training takes 40 seconds per epoch and evaluation takes 23 seconds per epoch,
so it takes about 21 minutes to get the results below.
show.py line:42 ***** Eval results *****
show.py line:44 eval_accuracy = 0.8727952966328166
show.py line:44 eval_loss = 0.6764590177271101
show.py line:44 train_loss = 0.23714334345780885
With evaluation only, at 23 seconds per epoch,
it takes about half a minute to get the results below.
show.py line:42 ***** Eval results *****
show.py line:44 eval_accuracy = 0.5611972207375735
show.py line:44 eval_loss = 0.6895335352318919
show.py line:44 train_loss = 0.0
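The durations quoted above follow from the per-epoch timings: roughly 40 seconds for each training epoch plus 23 seconds for one evaluation pass. The sketch below shows that arithmetic. The epoch counts are not stated in the text, so the values in the loop are guesses chosen only because they roughly match the quoted durations, and `estimated_minutes` is a hypothetical helper, not part of the repository.

```python
TRAIN_SECONDS_PER_EPOCH = 40  # taken from the timings quoted above
EVAL_SECONDS = 23             # one evaluation pass, taken from the timings quoted above


def estimated_minutes(train_epochs: int) -> float:
    """Estimate wall-clock minutes for train_epochs training epochs plus one evaluation pass."""
    total_seconds = train_epochs * TRAIN_SECONDS_PER_EPOCH + EVAL_SECONDS
    return total_seconds / 60


# Assumed epoch counts (not given in the text); 0 means evaluation only.
for epochs in (0, 1, 3, 10, 30):
    print(f"{epochs:2d} training epochs -> about {estimated_minutes(epochs):.1f} minutes")
```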