codekansas/keras-language-modeling

What changes are needed to run the CNN model?

wailoktam opened this issue · 2 comments

Hi, I tried changing the attention model to the CNN model without success. I get complaints about the shape of the input layers. Can you give me some ideas on what to fix in order to run the included CNN model?

For the CNN model, you should first set question_len = answer_len. In my experiments, the CNN model is good enough; the attention (LSTM) model improves the results only slightly.
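As a rough illustration of what equal lengths mean for the data, here is a minimal sketch that truncates or pads questions and answers to one shared length. It assumes the inputs are lists of integer token ids; the helper `pad_to_length`, the pad id 0, and the length 8 are hypothetical choices for this example and are not taken from the repository.

```python
from typing import List


def pad_to_length(token_ids: List[int], length: int, pad_id: int = 0) -> List[int]:
    """Truncate or right-pad a list of token ids to exactly `length` entries."""
    clipped = token_ids[:length]
    return clipped + [pad_id] * (length - len(clipped))


# Hypothetical shared length: the only point is that both inputs use the same value.
question_len = answer_len = 8

# Example token ids (made up for illustration).
question = pad_to_length([4, 17, 9], question_len)
answer = pad_to_length([21, 5, 33, 8, 2, 40, 11, 6, 19, 7], answer_len)

# Both sequences now have the same length, so they can feed inputs of the same shape.
assert len(question) == len(answer) == question_len
```

Whatever values the actual configuration uses, the input layers for questions and answers would then be declared with this same length.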

@eshijia I used the CNN model, but after two epochs the loss value became NaN. It seems to be getting stuck in a local minimum.