ottokart/punctuator2

How to determine hidden layer size and learning rate?

wltz opened this issue · 2 comments

wltz commented

python main.py <model_name> <hidden_layer_size> <learning_rate>
For the first stage of training (the language model), how should I choose the hidden layer size and learning rate?
Thanks!

ottokart commented

A hidden layer size of 256 and a learning rate of 0.02 have worked fairly well for me in most cases, so you can start with those. To find better settings, you'll just have to experiment with different values and compare the results on the dev/validation set. For small datasets you might want to reduce the hidden layer size; for larger datasets a bigger model might be better (but also slower).
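One way to organize that kind of experiment is a small grid around the starting point. The sketch below is only an illustration: it builds one `python main.py <model_name> <hidden_layer_size> <learning_rate>` command per combination and picks the run with the lowest dev loss. The grid values other than 256 and 0.02, the model naming scheme, and the helper names `build_commands` and `pick_best` are all assumptions made for this example, not part of punctuator2.

```python
import itertools


def build_commands(hidden_sizes, learning_rates, prefix="model"):
    """Return (model_name, command) pairs, one per grid point.

    The naming scheme <prefix>_h<size>_lr<rate> is hypothetical; it only
    keeps the runs distinguishable from one another.
    """
    runs = []
    for hidden, lr in itertools.product(hidden_sizes, learning_rates):
        name = f"{prefix}_h{hidden}_lr{lr}"
        # Argument order follows the usage line quoted in this thread:
        # python main.py <model_name> <hidden_layer_size> <learning_rate>
        cmd = ["python", "main.py", name, str(hidden), str(lr)]
        runs.append((name, cmd))
    return runs


def pick_best(dev_losses):
    """Return the model name with the lowest dev/validation loss.

    dev_losses maps model name -> loss, filled in by hand from your own
    training output after each run has finished.
    """
    return min(dev_losses, key=dev_losses.get)


# Illustrative grid centered on the suggested 256 / 0.02 starting point.
runs = build_commands([128, 256, 512], [0.01, 0.02, 0.05])
for name, cmd in runs:
    print(" ".join(cmd))
```

The script only prints the commands, so you can run them one at a time (or pass each `cmd` list to `subprocess.run`). Once the runs finish, record each model's dev loss in a dict and pass it to `pick_best`.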

wltz commented

Thanks ottokart!