some problems about VHRED

Question

Hello, I recently tried to train the VHRED model on the Ubuntu dataset. For two days the cost_mean stayed at about 3.5, and the responses were not good. Today I found "patience = -1" in the log and the process had exited. I think this is because valid_cost stopped improving.
I would like to learn more about the details of your training process. Could you share some of your experience with me? I would appreciate it, thanks.
I am using the source code from https://github.com/julianser/hed-dlg-truncated.
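
For readers unfamiliar with the term, the "patience = -1" exit is consistent with patience-based early stopping. The following is a minimal Python sketch of that general technique, not the code from the repository above. The function name, the reset-on-improvement rule, and the numbers are all assumptions made for illustration.

```python
def run_early_stopping(valid_costs, initial_patience):
    """Simulate patience-based early stopping over a list of validation costs.

    Hypothetical helper for illustration only; the real training script
    may update or reset patience differently.
    """
    best_cost = float("inf")
    patience = initial_patience
    saved_steps = []  # validation steps at which a checkpoint would be saved

    for step, cost in enumerate(valid_costs):
        if cost < best_cost:
            # Validation improved: remember it, save the model, reset patience.
            best_cost = cost
            patience = initial_patience
            saved_steps.append(step)
        else:
            # No improvement: spend one unit of patience.
            patience -= 1

        if patience < 0:
            # Patience exhausted: stop training.
            return {"stopped_at": step, "patience": patience,
                    "best_cost": best_cost, "saved_steps": saved_steps}

    return {"stopped_at": None, "patience": patience,
            "best_cost": best_cost, "saved_steps": saved_steps}


# Made-up validation costs that improve twice and then stall.
result = run_early_stopping([3.6, 3.5, 3.55, 3.52, 3.51], initial_patience=2)
print(result)
```

Under these assumptions, a run whose validation cost improves only twice saves the model only twice and then exits once patience drops below zero.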

Answer 1 · 2017-07-20T09:49:48.000Z

Hi, did you train it on Theano or re-implement it in TensorFlow?
The cost_mean is not the only measure of whether the model is good. Have you compared your results against the measures reported in the paper? And how do you judge whether a response is good? A single sample response (in debug mode) says very little. You should look at the whole output data and compare the output of the LSTM baseline with that of VHRED.
Also, did you observe any errors when running on Theano (for example, a SeqOptimizer apply PushOutScanOutput error)?

Answer 2 · 2017-07-20T12:15:12.000Z

I trained it on Theano. I didn't see that error.
My first run exited after only two days. The model was saved only twice, and the log shows "patience = -1".
I tried again, and now it is a little better.