loudinthecloud/pytorch-ntm

Strange fluctuations in the curves even after training on a large number of sequences

marcwww opened this issue · 6 comments

(attached: training loss and cost plots)

(random seed = 10)
As the plots show, even after 120,000 sequences there are still fluctuations in the cost, which does not seem to match your results or the original authors'.
What could be the reasons, and how can I deal with this?
Thanks a lot.

Hey, can you please test after reverting d7b3840?

(attached: loss and cost plots after reverting)
It seems that change does not help.

Interesting, perhaps it's related to the seed (initialization and random training samples). Can you please test using a different seed? My test involved averaging 4 different seeds. I'll try to reproduce with seed=10 as well.
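Averaging over several seeds, as described above, can be sketched like this. The curves and checkpoint spacing here are hypothetical stand-ins for logged cost values; only the stacking-and-averaging step reflects the procedure mentioned in the comment.

```python
import numpy as np

# Hypothetical cost curves from four training runs with different seeds,
# logged at the same checkpoints (e.g. every 1000 sequences).
curves = {
    1: np.array([5.0, 3.2, 1.1, 0.4]),
    10: np.array([5.1, 4.0, 2.5, 0.9]),
    42: np.array([5.2, 3.5, 1.5, 0.5]),
    1000: np.array([4.9, 2.8, 0.7, 0.2]),
}

stacked = np.stack(list(curves.values()))  # shape: (n_seeds, n_checkpoints)

# The mean curve smooths out per-seed fluctuations like the ones
# visible in the attached plots; the std shows seed-to-seed spread.
mean_curve = stacked.mean(axis=0)
std_curve = stacked.std(axis=0)
```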

It seems to be a seed issue; I've attached plots for the training of the copy task with seed=1000. Results may vary with the seed, since it controls both the initialization and the training examples (which are random sequences of bits).
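A minimal sketch of why the seed matters: seeding the RNGs fixes both the model's parameter initialization and the randomly generated bit sequences used as copy-task training data. The `init_seed` helper and the sequence dimensions below are illustrative, not the repo's exact code.

```python
import random

import numpy as np
import torch


def init_seed(seed):
    # One seed drives all RNGs, so it determines both the weight
    # initialization and the random training sequences; changing it
    # changes the whole training trajectory.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


init_seed(1000)

# A random binary sequence like the copy task's inputs
# (sequence length and bit width are illustrative values).
seq_len, seq_width = 10, 8
seq = torch.from_numpy(
    np.random.binomial(1, 0.5, (seq_len, seq_width)).astype(np.float32)
)
```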

(plot: Loss)

(plot: Cost)

Could you please list the detailed parameter settings? Thanks a lot.

Hi, sure.
They appear at the beginning of the copy notebook.