IBM Translation Model 1

Language: Python 3.5

Aim:

To implement IBM Model 1, which learns lexical translation probabilities for the words in the corpus using the EM (Expectation Maximization) algorithm.

The data is stored in data1.json, which contains 5 French sentences and the corresponding English translation of each sentence. The data is stored as a dictionary.
From this dictionary, all the unique French and English words are extracted into 2 lists.
The EM algorithm is run until convergence (usually within 10-20 iterations).
The alignment of each sentence pair is printed once the EM algorithm completes.
To run the code, run model1.py.
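
The steps above can be sketched as follows. This is a minimal illustration, not the code in model1.py: the names train_ibm_model1 and align, the toy sentence pairs, and the convergence threshold are all assumptions made for the example. It also omits the NULL word and does not read data1.json, since the exact layout of that file is not described here. The sketch estimates t(f|e), the probability of a French word f given an English word e, and then aligns each French word to its most probable English word.

```python
from collections import defaultdict


def train_ibm_model1(pairs, max_iterations=100, tolerance=1e-6):
    """Estimate t(f|e) with EM.

    pairs: list of (french_tokens, english_tokens) tuples (assumed format).
    Returns a dict mapping (french_word, english_word) -> probability.
    """
    # Unique words on each side, as described in the text above.
    french_vocab = sorted({f for fr, _ in pairs for f in fr})
    english_vocab = sorted({e for _, en in pairs for e in en})

    # Uniform initialisation, restricted to word pairs that co-occur
    # in at least one sentence pair (other pairs stay at probability 0).
    t = {}
    for fr, en in pairs:
        for f in fr:
            for e in en:
                t[(f, e)] = 1.0 / len(french_vocab)

    for _ in range(max_iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of links from e

        # E-step: distribute each French word over the English words
        # of the same sentence pair, in proportion to t(f|e).
        for fr, en in pairs:
            for f in fr:
                norm = sum(t[(f, e)] for e in en)
                for e in en:
                    delta = t[(f, e)] / norm
                    count[(f, e)] += delta
                    total[e] += delta

        # M-step: re-estimate t(f|e) and track the largest change.
        max_change = 0.0
        for (f, e), c in count.items():
            new_value = c / total[e]
            max_change = max(max_change, abs(new_value - t[(f, e)]))
            t[(f, e)] = new_value

        # Assumed stopping rule: stop once no probability moves much.
        if max_change < tolerance:
            break

    return t


def align(french_tokens, english_tokens, t):
    """Link each French word to its most probable English word."""
    return [
        (f, max(english_tokens, key=lambda e: t.get((f, e), 0.0)))
        for f in french_tokens
    ]


# Toy corpus (hypothetical, not the contents of data1.json).
pairs = [
    (["la", "maison"], ["the", "house"]),
    (["la", "fleur"], ["the", "flower"]),
    (["une", "maison"], ["a", "house"]),
]

t = train_ibm_model1(pairs)
for fr, en in pairs:
    print(" ".join(fr), "->", align(fr, en, t))
```

Restricting the table to co-occurring word pairs keeps it small and means that t(f|e) sums to 1 over the French words seen with each English word after every M-step.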

Running the code requires Anaconda to be installed.

Installation instructions for Anaconda are available at: https://docs.anaconda.com/anaconda/install/