Why is Train_Accuracy so low (about 0.2)?
alphaRGB opened this issue · 1 comments
alphaRGB commented
I followed the README.md and trained the model on the sample data.
First: python pre_process.py
Second: python train_gpt2.py --num-layers=8 --embedding-size=768 --batch-size=32
Then training begins. Here are the loss and accuracy during training:
eprecated and will be removed after 2020-07-01.
Instructions for updating:
`tf.python.eager.profiler` has deprecated, use `tf.profiler` instead.
Saving checkpoint for step 0 at xxx/GPT_2/GPT_tf/TF2/gpt-2-tensorflow2.0/model/ckpt-1
Step 10 Train_Loss 7.2324 Train_Accuracy 0.0832
Step 20 Train_Loss 6.5299 Train_Accuracy 0.1730
Step 30 Train_Loss 6.4850 Train_Accuracy 0.1768
Step 40 Train_Loss 6.1244 Train_Accuracy 0.1932
Step 50 Train_Loss 6.3007 Train_Accuracy 0.1790
Step 60 Train_Loss 6.3144 Train_Accuracy 0.1865
Step 70 Train_Loss 6.1924 Train_Accuracy 0.1648
Step 80 Train_Loss 6.2282 Train_Accuracy 0.1759
Step 90 Train_Loss 6.2466 Train_Accuracy 0.1744
Step 100 Train_Loss 6.1871 Train_Accuracy 0.1795
Step 110 Train_Loss 6.0732 Train_Accuracy 0.2064
Step 120 Train_Loss 5.7407 Train_Accuracy 0.2119
Step 130 Train_Loss 5.8436 Train_Accuracy 0.2077
Step 140 Train_Loss 5.7919 Train_Accuracy 0.1898
Step 150 Train_Loss 5.9080 Train_Accuracy 0.1661
Step 160 Train_Loss 5.8630 Train_Accuracy 0.1994
Step 170 Train_Loss 5.7625 Train_Accuracy 0.2076
---
Step 2740 Train_Loss 5.3913 Train_Accuracy 0.1958
Step 2750 Train_Loss 5.3359 Train_Accuracy 0.2195
Step 2760 Train_Loss 5.3394 Train_Accuracy 0.1973
Step 2770 Train_Loss 5.0865 Train_Accuracy 0.2501
Step 2780 Train_Loss 5.4709 Train_Accuracy 0.1929
Step 2790 Train_Loss 5.4672 Train_Accuracy 0.1946
Step 2800 Train_Loss 5.5116 Train_Accuracy 0.1962
Step 2810 Train_Loss 5.2981 Train_Accuracy 0.2346
Step 2820 Train_Loss 5.4803 Train_Accuracy 0.2078
Step 2830 Train_Loss 5.5752 Train_Accuracy 0.1869
Step 2840 Train_Loss 5.4528 Train_Accuracy 0.2158
Step 2850 Train_Loss 5.2158 Train_Accuracy 0.2377
Step 2860 Train_Loss 5.3771 Train_Accuracy 0.2202
Step 2870 Train_Loss 5.3635 Train_Accuracy 0.1965
Step 2880 Train_Loss 5.4944 Train_Accuracy 0.2296
Step 2890 Train_Loss 5.4714 Train_Accuracy 0.2068
Step 2900 Train_Loss 5.2218 Train_Accuracy 0.2330
Step 2910 Train_Loss 5.4696 Train_Accuracy 0.2070
Step 2920 Train_Loss 5.5928 Train_Accuracy 0.1947
Step 2930 Train_Loss 5.4761 Train_Accuracy 0.2173
Step 2940 Train_Loss 5.5963 Train_Accuracy 0.2022
Step 2950 Train_Loss 5.3133 Train_Accuracy 0.2197
Training Done................
akanyaani commented
Hi @alphaRGB,
GPT-2 is an autoregressive language model, so accuracy is not a good metric. I have removed accuracy and added perplexity as a metric instead.
https://thegradient.pub/understanding-evaluation-metrics-for-language-models/
https://towardsdatascience.com/perplexity-in-language-models-87a196019a94
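To relate the Train_Loss values in the log above to perplexity, here is a minimal sketch. It assumes Train_Loss is the mean per-token cross-entropy in nats (natural log), which is what TensorFlow's cross-entropy losses report by default; the thread does not confirm this for the repo. The `perplexity` helper is hypothetical and not part of the repository.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity.

    Assumption: the loss is averaged over tokens and uses the natural log.
    If the loss were in bits, the base would be 2 instead of e.
    """
    return math.exp(mean_cross_entropy_nats)


# Two loss values copied from the training log in this issue.
logged_losses = {10: 7.2324, 2950: 5.3133}

for step, loss in logged_losses.items():
    print(f"Step {step}: Train_Loss {loss:.4f} -> perplexity {perplexity(loss):.1f}")
```

Under that assumption, the loss falling from about 7.2 to about 5.3 means perplexity fell from roughly 1,380 to roughly 200, so the model is improving even though next-token accuracy stays near 0.2.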