jasonwu0731/trade-dst

About the choice of the hyperparameters


Hi Jason,

When I train for unseen-domain zero-shot, the results on the four domains are sometimes not as good as expected, e.g. in the "except attraction" setting. Is the model sometimes unstable, so that the hyperparameters need tuning? Could you give me some advice on reproducing the results?

More specifically, I trained for 13 epochs on all domains except attraction, and the joint and turn accuracies are 9.2 and 87.5. Also, for the BM except hotel, the fine-tuned results on the 4 domains and on the new domain are (18.79, 89.13) and (26.8, 77.95).

Thanks for your code release!

Hi @ha-lins

Thanks for your interest in our work. For all the models trained on each four-domain setting, we use HDD 400 (GloVe + char embeddings), batch size 32, and dropout 0.2. In my own experiments, the results are quite similar across runs.

On the other hand, if you are fine-tuning with EWC or GEM, you need to be careful about the hyperparameters (e.g., lambda). Hope this helps.

Jason
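
For readers wondering why lambda matters so much in EWC fine-tuning, here is a minimal sketch of the standard EWC objective, in which lambda scales a quadratic penalty that pulls the parameters back toward the values learned on the original domains. This is not the code from this repository: the function names, the flat-list parameter representation, and the example numbers are all assumptions made for illustration.

```python
# Illustrative sketch of the standard EWC objective (not the repository's code).
# All names and values here are hypothetical.

def ewc_penalty(params, anchor_params, fisher, lam):
    """Return (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2.

    params        -- current parameter values (flat list of floats)
    anchor_params -- parameter values after training on the original domains
    fisher        -- diagonal Fisher information estimates, one per parameter
    lam           -- the lambda hyperparameter controlling penalty strength
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher)
    )


def ewc_loss(task_loss, params, anchor_params, fisher, lam):
    """Total loss: new-domain task loss plus the EWC penalty."""
    return task_loss + ewc_penalty(params, anchor_params, fisher, lam)


if __name__ == "__main__":
    params = [1.0, 2.0]
    anchor = [0.0, 2.0]
    fisher = [2.0, 5.0]
    # A larger lambda makes drifting from the anchor more expensive.
    for lam in (0.0, 1.0, 10.0):
        print(lam, ewc_loss(1.5, params, anchor, fisher, lam))
```

With a small lambda the model is free to adapt to the new domain but may forget the original four; with a large lambda it stays close to the original solution and may under-fit the new domain, which is one plausible reason fine-tuned numbers vary between setups.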