IAM dataset currently on 267 epoch with ~6.2% cer ...
miliadis opened this issue · 19 comments
Hi @jpuigcerver ,
I trained on the IAM dataset following the README instructions at https://github.com/jpuigcerver/Laia/tree/master/egs/iam . I am currently at epoch ~267 with ~6.2% CER on the validation set. Since the README says that 3.8% CER will be reached at epoch 80, I'm just wondering if there is any change that I am not aware of.
IAM dataset: 6176 training samples, 976 val samples
Hi @miliadis
Thanks for pointing out the problem.
I just got a fresh clone of the current Laia version and I'm trying to reproduce the experiment to see at which point it diverges from my previous run. I'll come back to you once I've found where the problem is.
By the way, the README does not say that you should get 3.8% CER after 80 epochs. You'll achieve approximately that CER when the validation error stops improving for 80 epochs. In my case, the complete training took 514 epochs.
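For readers unfamiliar with this kind of stopping criterion, the scheme described above (stop when the validation error has not improved for a fixed number of epochs) can be sketched as follows. This is an illustrative sketch, not Laia's actual implementation; the class name and `min_delta` parameter are my own.

```python
class EarlyStopping:
    """Stop training once `patience` epochs pass without improvement."""

    def __init__(self, patience=80, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.best_epoch = 0

    def step(self, epoch, valid_cer):
        # An epoch counts as an improvement only if the validation CER
        # drops by more than `min_delta` below the best value so far.
        if valid_cer < self.best - self.min_delta:
            self.best = valid_cer
            self.best_epoch = epoch
            return False  # keep training
        # Trigger the stop after `patience` epochs without improvement.
        return epoch - self.best_epoch >= self.patience
```

With `patience=80`, training can easily run for hundreds of epochs even though the best checkpoint was found much earlier, which is consistent with the 514-epoch run mentioned above.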
Anyhow, I did find a bug in the train_lstm1d.sh script: the dropout in the convolutional layers is not activated.
In any case, the CER that you are getting is very high. In my run I got <=6.2% CER on the validation set after 36 epochs. I'll keep looking into it.
I also uploaded my models for IAM and Rimes, and the logs for IAM.
Hey @jpuigcerver
Got it, thanks for the clarification about the 80 epochs. So, this is what I get after ~36 epochs:
I am training now with dropout=0.2 and we will see... but it's strange that you were getting ~6.2% without dropout at 36 epochs...
FYI, I just committed a change to use the correct dropout values in the CNNs: See commit c6bb8ab.
I also see the same issue and got a CER of 5.98% after 469 epochs (before your latest dropout commit). Not sure if it is related, but the default preprocessing sometimes seems overly aggressive (on my system the preprocessed C04-110-00.png is completely missing the "a" character).
I suspect that the differences are due to a change made in the imgtxtenh tool, on March 13th, 2017 (mauvilsa/imgtxtenh@5cca789). @mauvilsa changed the default units from "mm" to "pixels". I think I have located where the problem is, but I need to check it first.
Since I still have the data processed with the previous version, I am re-training the model from scratch with the current version of Laia and the data processed with the old version of imgtxtenh tool.
I'll come back to you as soon as I get some update.
@miliadis @bdotgradb Could you please modify this line from the prepare_images.sh script:
https://github.com/jpuigcerver/Laia/blob/master/egs/iam/steps/prepare_images.sh#L44
And simply remove the "-u mm" part. You will need to process the images again and start the training from scratch. But I suspect that it will solve the problem, I am trying it myself.
Thanks, I will try this and let you know...
Results of my re-run:
Finished training after 430 epochs. According to "valid_cer" criterion, epoch 350 was the best: duration = 21s ; batches = 61 ; min./max./avg. chunks/batch = 1/1/1.0 ; loss = 0.031728 ; cer = 3.80% ; del = 0.65% ; ins = 0.52% ; sub = 2.64% ; cer_ci = [ 3.52%, 4.08%] ; ci_alpha = 5.000%
and results of decoding:
%CER lines va: 3.90
%WER lines va: 13.52
%CER forms va: 3.78
%WER forms va: 13.50
%CER lines te: 5.77
%WER lines te: 18.17
%CER forms te: 5.62
%WER forms te: 18.13
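As background for the numbers above: CER is the character-level edit distance between hypothesis and reference (the deletions, insertions, and substitutions reported in the training log), divided by the reference length, so the del/ins/sub percentages sum to the total CER. A minimal sketch, not Laia's actual scorer:

```python
def cer(ref: str, hyp: str) -> float:
    """Character error rate: Levenshtein distance over reference length."""
    n, m = len(ref), len(hyp)
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            sub = prev[j - 1] + (ref[i - 1] != hyp[j - 1])  # substitution (or match)
            curr[j] = min(sub, prev[j] + 1, curr[j - 1] + 1)  # del / ins
        prev = curr
    return prev[m] / n
```

WER is computed the same way over word tokens instead of characters, which is why it is several times larger than the CER for the same output.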
Why is the validation CER in the training report different from the final decode (3.80% vs 3.90%)?
Hi,
The results are different because the output of the decoding depends slightly on the batch composition.
During training, we reshuffle the validation set on each epoch, while during decoding the original order of the examples, according to your input file, is used.
Because the input images have different sizes, but all are zero-padded to the size of the largest one in the batch, the decoding results may differ slightly depending on which images end up batched together.
We always report the results on the separate evaluation step, which processes the files in the original (alphabetic) order.
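To make the padding effect concrete, here is a minimal NumPy sketch (illustrative only; not Laia's batching code). The same image is padded to a different width depending on which images share its batch, so the network sees slightly different inputs at the right edge:

```python
import numpy as np

def pad_batch(images):
    """Zero-pad variable-width images (H x W arrays) to the batch's max width."""
    h = images[0].shape[0]
    max_w = max(img.shape[1] for img in images)
    batch = np.zeros((len(images), h, max_w), dtype=images[0].dtype)
    for i, img in enumerate(images):
        batch[i, :, :img.shape[1]] = img
    return batch

a = np.ones((32, 100))
b = np.ones((32, 250))
c = np.ones((32, 120))
# Image `a` is padded to width 250 in one batch but only 120 in another.
print(pad_batch([a, b]).shape)  # (2, 32, 250)
print(pad_batch([a, c]).shape)  # (2, 32, 120)
```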
@miliadis Did you manage to reproduce the results as @bdotgradb did? If so, I'll close this issue. BTW, I updated a few scripts recently to fix some other bugs that I found.
@jpuigcerver my training is not done yet, but I definitely see an improvement (epoch 250 -> CER 4.30%).
Final results: ../../laia-train-ctc:387: Epoch 292, last epoch with a significant improvement on "valid_cer" criterion was 212. Triggering early stop!
[2017-12-05 23:32:08 INFO] ../../laia-train-ctc:401: Finished training after 292 epochs. According to "valid_cer" criterion, epoch 212 was the best: duration = 16s ; batches = 61 ; min./max./avg. chunks/batch = 1/1/1.0 ; loss = 0.029689 ; cer = 4.13% ; del = 0.72% ; ins = 0.60% ; sub = 2.80% ; cer_ci = [ 3.82%, 4.42%] ; ci_alpha = 5.000%
My training finished at epoch 292 and the final CER is 4.13%. This is not exactly 3.80%, but close... so, is this difference acceptable?
@miliadis I would expect a CER closer to what @bdotgradb and I obtained. Could you please upload somewhere your training log?
@jpuigcerver I am training again after your most recent changes.
I have committed some changes to the imgtxtenh tool (mauvilsa/imgtxtenh#3). If the same parameters as originally in the script are used (imgtxtenh -u mm -d 118.110), then exactly the same processing as before should be done.
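For context on that density value (my reading, not stated in the thread): 118.110 appears to be 300 DPI expressed in pixels per centimetre, which is the resolution the IAM forms were scanned at. The conversion is just dividing by 2.54 cm per inch:

```python
# Presumed origin of imgtxtenh's "-d 118.110" density value:
# 300 dots per inch converted to pixels per centimetre (1 in = 2.54 cm).
dpi = 300
pixels_per_cm = dpi / 2.54
print(f"{pixels_per_cm:.3f}")  # 118.110
```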
Thanks @jpuigcerver and @mauvilsa, I was able to reproduce the IAM results.
Closing the issue now.