How to determine hidden layer size and learning rate?
wltz opened this issue · 2 comments
wltz commented
python main.py <model_name> <hidden_layer_size> <learning_rate>
For the first stage of training (the language model), how should I choose the hidden layer size and learning rate?
Thanks!
ottokart commented
A hidden layer size of 256 and a learning rate of 0.02 have worked fairly well for me in most cases, so you can start with those. To find better settings, you'll just have to experiment with different values and compare the results on the dev/validation set. For small datasets you might want to reduce the hidden layer size; for larger datasets a bigger model might be better (but also slower).
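The experiment-and-compare approach described above could be sketched roughly as follows. This is only an illustration, not code from the repository: `grid_search` and `toy_dev_loss` are hypothetical names, and `toy_dev_loss` is a made-up stand-in for actually training a model with the given settings and reading its dev-set loss (lower is assumed to be better).

```python
import itertools


def grid_search(evaluate, hidden_sizes, learning_rates):
    """Try every (hidden_size, learning_rate) pair and keep the one
    with the lowest score returned by `evaluate`.

    Returns a tuple: (best_score, best_hidden_size, best_learning_rate).
    """
    best = None
    for hidden_size, learning_rate in itertools.product(hidden_sizes, learning_rates):
        score = evaluate(hidden_size, learning_rate)
        if best is None or score < best[0]:
            best = (score, hidden_size, learning_rate)
    return best


def toy_dev_loss(hidden_size, learning_rate):
    # Hypothetical stand-in for a real training run. In practice this
    # would train the model with these settings and return the loss
    # measured on the dev/validation set.
    return abs(hidden_size - 256) / 256 + abs(learning_rate - 0.02) * 10


if __name__ == "__main__":
    # Candidate values centred on the suggested starting point (256, 0.02).
    result = grid_search(toy_dev_loss, [128, 256, 512], [0.01, 0.02, 0.05])
    print("best dev loss %.4f with hidden size %d and learning rate %g" % result)
```

In a real run each call to `evaluate` is a full training job, so it usually makes sense to start with a handful of values around 256 and 0.02 and only widen the search if the dev-set results keep improving at the edge of the grid.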
wltz commented
Thanks ottokart!