Language Modelling using LSTM

Long Short-Term Memory (LSTM) for Sentence Generation

In this project, a sentence generator is built from LSTM modules. Several aspects of the model can be tuned; among them, the hidden dimension, the embedding dimension, the number of layers, and the learning rate were studied.
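
The notebook's exact architecture is not reproduced here, but the models under study are standard LSTM language models. A minimal sketch, assuming PyTorch (the class and argument names are illustrative, not the repository's actual code):

    import torch
    import torch.nn as nn

    class LSTMLanguageModel(nn.Module):
        # Embedding -> stacked LSTM -> linear projection to vocabulary logits.
        def __init__(self, vocab_size, embedding_dim=50, hidden_dim=10, num_layers=2):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, embedding_dim)
            self.lstm = nn.LSTM(embedding_dim, hidden_dim,
                                num_layers=num_layers, batch_first=True)
            self.fc = nn.Linear(hidden_dim, vocab_size)

        def forward(self, tokens, state=None):
            emb = self.embedding(tokens)        # (batch, seq) -> (batch, seq, emb)
            out, state = self.lstm(emb, state)  # (batch, seq, hidden)
            return self.fc(out), state          # per-step vocabulary logits

The default values (embedding_dim=50, hidden_dim=10, num_layers=2) mirror the baseline configuration in the table below.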

Case Study

In each case study, all other model parameters were held fixed so that a single hyperparameter is compared in isolation. The model names encode the parameter being varied: H for hidden layer size, S for the number of stacked layers, E for embedding size, and LR for learning rate.

Model name          H10    H100   S1     S5     E50    E200   LR0.1  LR0.01
Embedding Size      50     50     50     50     50     200    50     50
Hidden Layer Size   10     100    10     10     10     10     10     10
Number of Layers    2      2      1      5      2      2      2      2
Batch Size          256    256    256    256    256    256    256    256
Epochs              50     50     50     50     50     50     50     50
Learning Rate       0.10   0.10   0.10   0.10   0.10   0.10   0.10   0.01
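
Read column-wise, each model overrides exactly one baseline setting. A hedged sketch of the configurations as plain Python (the dictionary layout is illustrative, not the notebook's code):

    # Baseline shared by all case-study models; each named model changes
    # exactly one hyperparameter, mirroring the table above.
    baseline = dict(embedding_dim=50, hidden_dim=10, num_layers=2,
                    batch_size=256, epochs=50, learning_rate=0.10)

    configs = {
        "H10":    dict(baseline),                     # baseline hidden size
        "H100":   dict(baseline, hidden_dim=100),
        "S1":     dict(baseline, num_layers=1),
        "S5":     dict(baseline, num_layers=5),
        "E50":    dict(baseline),                     # baseline embedding size
        "E200":   dict(baseline, embedding_dim=200),
        "LR0.1":  dict(baseline),                     # baseline learning rate
        "LR0.01": dict(baseline, learning_rate=0.01),
    }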

Hidden Dimension

  • perplexity decreases at the same rate for both models (perplexity is computed as sketched after the figure)
  • H10 converges sooner than H100
  • H100 performs better on the test split than H10
  • H100 generates more accurate sentences than H10

[Figure: H10 vs. H100 comparison]
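
The comparisons in each case study are stated in terms of perplexity, the exponential of the mean per-token cross-entropy. A minimal sketch of the computation, assuming PyTorch (the function name is illustrative):

    import math
    import torch.nn.functional as F

    def batch_perplexity(logits, targets):
        # logits: (batch, seq, vocab); targets: (batch, seq) token ids.
        # Perplexity = exp(mean per-token cross-entropy).
        loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               targets.reshape(-1))
        return math.exp(loss.item())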

Embedding Dimension

  • perplexity does NOT decrease at the same rate for the two models
  • E50 has better convergence properties than E200
  • E200 performs better on the test split than E50
  • E50 generates more accurate sentences than E200

[Figure: E50 vs. E200 comparison]

Number of Layers

  • perplexity decreases at the same rate for both models
  • convergence speed is the same for both models
  • S1 performs better on the test split than S5
  • S1 generates more accurate sentences than S5

[Figure: S1 vs. S5 comparison]

Learning Rate

  • perplexity decreases faster for LR0.1 than for LR0.01
  • LR0.1 has better convergence properties than LR0.01
  • LR0.1 performs better on the test split than LR0.01
  • LR0.1 generates more accurate sentences than LR0.01 (see the generation sketch after the figure)

[Figure: LR0.1 vs. LR0.01 comparison]
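
Sentence quality is judged from samples drawn autoregressively from each trained model. A minimal generation sketch, assuming the LSTMLanguageModel above (the start/end token handling is illustrative):

    import torch

    @torch.no_grad()
    def generate(model, start_id, end_id, max_len=30, temperature=1.0):
        # Feed the model its own predictions one token at a time.
        model.eval()
        tokens, state = [start_id], None
        inp = torch.tensor([[start_id]])
        for _ in range(max_len):
            logits, state = model(inp, state)
            probs = torch.softmax(logits[0, -1] / temperature, dim=-1)
            next_id = torch.multinomial(probs, 1).item()  # sample the next token
            if next_id == end_id:
                break
            tokens.append(next_id)
            inp = torch.tensor([[next_id]])
        return tokens

Lower temperatures make sampling greedier; the returned token ids would be mapped back to words with the notebook's vocabulary.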
