Language Modelling using LSTM

Long Short-Term Memory (LSTM) for Sentence Generation

In this project, a sentence generator is built from LSTM modules. Several aspects of the model can be tuned; among them, the hidden dimension, the embedding dimension, the number of layers, and the learning rate were studied.
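
The notebook's exact architecture is not reproduced here, but the models under study are standard LSTM language models. A minimal sketch, assuming PyTorch (the class and argument names are illustrative, not the repository's actual code):

    import torch
    import torch.nn as nn

    class LSTMLanguageModel(nn.Module):
        # Embedding -> stacked LSTM -> linear projection to vocabulary logits.
        def __init__(self, vocab_size, embedding_dim=50, hidden_dim=10, num_layers=2):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, embedding_dim)
            self.lstm = nn.LSTM(embedding_dim, hidden_dim,
                                num_layers=num_layers, batch_first=True)
            self.fc = nn.Linear(hidden_dim, vocab_size)

        def forward(self, tokens, state=None):
            emb = self.embedding(tokens)        # (batch, seq) -> (batch, seq, emb)
            out, state = self.lstm(emb, state)  # (batch, seq, hidden)
            return self.fc(out), state          # per-step vocabulary logits

The default values (embedding_dim=50, hidden_dim=10, num_layers=2) mirror the baseline configuration in the table below.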

Case Study

In each case study, all other model parameters were held fixed so that a single hyperparameter is compared in isolation. The model names encode the parameter being varied: H for hidden layer size, S for the number of stacked layers, E for embedding size, and LR for learning rate.

Model name          H10    H100   S1     S5     E50    E200   LR0.1  LR0.01
Embedding Size      50     50     50     50     50     200    50     50
Hidden Layer Size   10     100    10     10     10     10     10     10
Number of Layers    2      2      1      5      2      2      2      2
Batch Size          256    256    256    256    256    256    256    256
Epochs              50     50     50     50     50     50     50     50
Learning Rate       0.10   0.10   0.10   0.10   0.10   0.10   0.10   0.01
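
Read column-wise, each model overrides exactly one baseline setting. A hedged sketch of the configurations as plain Python (the dictionary layout is illustrative, not the notebook's code):

    # Baseline shared by all case-study models; each named model changes
    # exactly one hyperparameter, mirroring the table above.
    baseline = dict(embedding_dim=50, hidden_dim=10, num_layers=2,
                    batch_size=256, epochs=50, learning_rate=0.10)

    configs = {
        "H10":    dict(baseline),                     # baseline hidden size
        "H100":   dict(baseline, hidden_dim=100),
        "S1":     dict(baseline, num_layers=1),
        "S5":     dict(baseline, num_layers=5),
        "E50":    dict(baseline),                     # baseline embedding size
        "E200":   dict(baseline, embedding_dim=200),
        "LR0.1":  dict(baseline),                     # baseline learning rate
        "LR0.01": dict(baseline, learning_rate=0.01),
    }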

Hidden Dimension

  • perplexity decreases at the same rate for both models (perplexity is computed as sketched after the figure)
  • H10 converges sooner than H100
  • H100 performs better on the test split than H10
  • H100 generates more accurate sentences than H10

[Figure: H10 vs. H100 comparison]
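
The comparisons in each case study are stated in terms of perplexity, the exponential of the mean per-token cross-entropy. A minimal sketch of the computation, assuming PyTorch (the function name is illustrative):

    import math
    import torch.nn.functional as F

    def batch_perplexity(logits, targets):
        # logits: (batch, seq, vocab); targets: (batch, seq) token ids.
        # Perplexity = exp(mean per-token cross-entropy).
        loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               targets.reshape(-1))
        return math.exp(loss.item())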

Embedding Dimension

  • perplexity does NOT decrease at the same rate for the two models
  • E50 has better convergence properties than E200
  • E200 performs better on the test split than E50
  • E50 generates more accurate sentences than E200

[Figure: E50 vs. E200 comparison]

Number of Layers

  • perplexity decreases at the same rate for both models
  • convergence speed is the same for both models
  • S1 performs better on the test split than S5
  • S1 generates more accurate sentences than S5

[Figure: S1 vs. S5 comparison]

Learning Rate

  • perplexity decreases faster for LR0.1 than for LR0.01
  • LR0.1 has better convergence properties than LR0.01
  • LR0.1 performs better on the test split than LR0.01
  • LR0.1 generates more accurate sentences than LR0.01 (see the generation sketch after the figure)

[Figure: LR0.1 vs. LR0.01 comparison]
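
Sentence quality is judged from samples drawn autoregressively from each trained model. A minimal generation sketch, assuming the LSTMLanguageModel above (the start/end token handling is illustrative):

    import torch

    @torch.no_grad()
    def generate(model, start_id, end_id, max_len=30, temperature=1.0):
        # Feed the model its own predictions one token at a time.
        model.eval()
        tokens, state = [start_id], None
        inp = torch.tensor([[start_id]])
        for _ in range(max_len):
            logits, state = model(inp, state)
            probs = torch.softmax(logits[0, -1] / temperature, dim=-1)
            next_id = torch.multinomial(probs, 1).item()  # sample the next token
            if next_id == end_id:
                break
            tokens.append(next_id)
            inp = torch.tensor([[next_id]])
        return tokens

Lower temperatures make sampling greedier; the returned token ids would be mapped back to words with the notebook's vocabulary.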
