Neural Machine Translator (NMT) for translating English text to Hindi. It is built with the PyTorch framework and uses a seq2seq architecture with attention.
The Jupyter notebook in this repository is self-explanatory and well documented.
PyTorch == 0.3.0
Numpy == 1.14.2
This blog explains NMT really well!
The eng-hind.txt parallel corpus can be downloaded from various sources.
The dataset file should be tab-separated, with one sentence pair per line, English first and Hindi second:
I am cold. मुझे ठंड लग रही है।
My name is yash मेरा नाम यश है
. .
. .
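The format above can be read with a few lines of Python. This is only a minimal sketch, not the notebook's own loading code: the helper name `read_pairs` is hypothetical, and it assumes Python 3.6+, UTF-8 text, and exactly one tab between the English and Hindi sentence on each line.

```python
import io


def read_pairs(handle):
    """Yield (english, hindi) tuples from a tab-separated text stream.

    `read_pairs` is a hypothetical helper written for illustration;
    it is not part of the notebook in this repository.
    """
    for line_number, raw in enumerate(handle, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) < 2:
            raise ValueError(
                f"line {line_number}: expected a tab between the two sentences"
            )
        # Keep only the first two columns and trim stray whitespace.
        yield parts[0].strip(), parts[1].strip()


# In-memory sample using the two example lines shown above.
sample = io.StringIO(
    "I am cold.\tमुझे ठंड लग रही है।\n"
    "My name is yash\tमेरा नाम यश है\n"
)
pairs = list(read_pairs(sample))
for english, hindi in pairs:
    print(english, "->", hindi)
```

To read the real corpus, pass an open file instead of the sample, for example `open("eng-hind.txt", encoding="utf-8")`. Opening the file with an explicit UTF-8 encoding matters here because the Hindi side is written in Devanagari.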
The Jupyter notebook given here is for educational purposes. If you want to see good results, I highly recommend cloning one of the following repositories:
1. Stanford NMT [Matlab]
2. tf-seq2seq [TensorFlow]
3. Nematus [Theano]
4. OpenNMT [Torch with Lua] (highly recommended; incorporates all the functionalities)
5. OpenNMT-py [PyTorch]
A Statistical Approach to Machine Translation, 1990.
Review Article: Example-based Machine Translation, 1999.
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, 2014.
Neural Machine Translation by Jointly Learning to Align and Translate, 2014.
Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016.
Sequence to sequence learning with neural networks, 2014.
Recurrent Continuous Translation Models, 2013.
Continuous space translation models for phrase-based statistical machine translation, 2013.
A big thank you to the whole team at Messy Fractals, especially Dhanya P and Arvind Sivdas, for letting me work under them on this project.
The credit for this code goes to the user spro. I have merely made some changes to it to handle Hindi text.