Word2Vec implementation (skip-gram model) using NumPy and NLTK.


word2vec

Word2Vec implementation in NumPy. Tried out the skip-gram model on A Storm of Swords by George R. R. Martin.
Dataset Link : https://www.kaggle.com/muhammedfathi/game-of-thrones-book-files#got2.txt
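Before training, the book text has to be tokenized and mapped to integer vocabulary indices. A minimal sketch of that step (the repo itself uses NLTK for tokenization; a plain regex tokenizer is substituted here so the snippet is self-contained):

```python
import re

def build_vocab(text):
    # Lowercase and keep alphabetic tokens only
    # (simplified stand-in for NLTK's tokenizer).
    tokens = re.findall(r"[a-z']+", text.lower())
    vocab = sorted(set(tokens))
    word_to_idx = {w: i for i, w in enumerate(vocab)}
    return tokens, word_to_idx

tokens, word_to_idx = build_vocab("The king in the north. The king!")
token_ids = [word_to_idx[t] for t in tokens]
```

Each word is then represented as a one-hot vector of length V (the vocabulary size) using its index.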


Word2Vec Architecture

  • Dimensions of input layer: V × 1 (V = vocabulary size)
  • Dimensions of W1: V × N (N = number of embedding dimensions)
  • Dimensions of hidden layer: N × 1
  • Dimensions of W2: N × V
  • Dimensions of output layer: V × 1
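The dimensions above can be checked with a short forward-pass sketch: a one-hot center word is projected through W1 to the hidden layer, then through W2 and a softmax to a probability distribution over the vocabulary. The weight initialization and index below are illustrative assumptions, not the repo's actual values:

```python
import numpy as np

V, N = 6633, 10  # vocabulary size and embedding dimensions from the Results section

rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.1, (V, N))  # V x N input-word embedding matrix
W2 = rng.normal(0.0, 0.1, (N, V))  # N x V output weight matrix

def forward(center_idx):
    x = np.zeros((V, 1))            # one-hot input, V x 1
    x[center_idx] = 1.0
    h = W1.T @ x                    # hidden layer, N x 1
    u = W2.T @ h                    # scores over vocabulary, V x 1
    e = np.exp(u - u.max())         # numerically stable softmax
    return e / e.sum(), h

probs, h = forward(42)              # 42 is an arbitrary example word index
```

After training, row i of W1 is the learned embedding for word i.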


Built With

Results

Epochs: 5
Total vocabulary size: 6633 words
Embedding dimensions: 10
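Training iterates over (center, context) pairs drawn from a sliding window around each word. A sketch of the pair-generation step, assuming a window size of 2 (the repo's actual window size is not stated here):

```python
def skipgram_pairs(token_ids, window=2):
    # Yield (center, context) index pairs within `window`
    # positions on either side of each center word.
    pairs = []
    for i, center in enumerate(token_ids):
        lo = max(0, i - window)
        hi = min(len(token_ids), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, token_ids[j]))
    return pairs

pairs = skipgram_pairs([0, 1, 2, 3], window=1)
# pairs: [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]
```

Each pair supplies one-hot input and target vectors for one gradient step of the network above.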

Output for a set of words

To-do

  • CBOW Model
  • Negative Sampling
  • Train for more epochs and with larger embedding dimensions.

Acknowledgments