Neural Machine Translation using Seq-to-Seq with Keras

Translation from English to French using an encoder-decoder model.

We have 10,000 English sentences and their 10,000 French translations. The training approach is as follows:

  • Create one-hot character vectors for the English and French sentences. These are the inputs to the encoder and the decoder. The French one-hot character vectors are also used as the target data for the loss function.
  • Feed the English sentence into the encoder one character vector at a time until the end of the sequence.
  • Take the final encoder states (hidden state and cell state) and feed them into the decoder as its initial state.
  • The decoder has 3 inputs at every time step: the 2 decoder states and the French character vector for that step.
  • At every decoder step, the decoder output is passed through a softmax layer, and the resulting distribution is compared with the target data.
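
The data preparation described above can be sketched roughly as follows. This is a minimal illustration on two made-up sentence pairs rather than the 10,000-pair dataset; the variable names, the '\n' end marker, and the one-step shift between decoder input and decoder target (the usual teacher-forcing setup) are assumptions, not details confirmed by the text.

```python
import numpy as np

# Toy stand-in for the 10,000 sentence pairs (illustrative data only).
input_texts = ["Hi.", "Run!"]
# '\t' marks the start of a target sentence; '\n' as an end marker is an assumption.
target_texts = ["\t" + text + "\n" for text in ["Salut.", "Cours !"]]

# Character vocabularies and character -> index lookups.
input_chars = sorted(set("".join(input_texts)))
target_chars = sorted(set("".join(target_texts)))
input_index = {ch: i for i, ch in enumerate(input_chars)}
target_index = {ch: i for i, ch in enumerate(target_chars)}

max_enc_len = max(len(text) for text in input_texts)
max_dec_len = max(len(text) for text in target_texts)

# Shape: (number of sentences, time steps, vocabulary size).
encoder_input_data = np.zeros(
    (len(input_texts), max_enc_len, len(input_chars)), dtype="float32")
decoder_input_data = np.zeros(
    (len(input_texts), max_dec_len, len(target_chars)), dtype="float32")
decoder_target_data = np.zeros_like(decoder_input_data)

for i, (src, tgt) in enumerate(zip(input_texts, target_texts)):
    for t, ch in enumerate(src):
        encoder_input_data[i, t, input_index[ch]] = 1.0
    for t, ch in enumerate(tgt):
        decoder_input_data[i, t, target_index[ch]] = 1.0
        if t > 0:
            # Assumed teacher forcing: the target at step t-1 is the
            # decoder input character at step t.
            decoder_target_data[i, t - 1, target_index[ch]] = 1.0
```

Steps past the end of a shorter sentence are left as all-zero vectors in this sketch; a real pipeline might pad with a dedicated character instead.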

For inference, the decoder network changes substantially, because the true French sentence is no longer available to feed in:

  • The encoder side of the network remains the same as in training.
  • At the first time step, the decoder has 3 inputs: the start tag ‘\t’ (as its one-hot vector) and the two encoder states.
  • From the next time step onwards, the decoder still has 3 inputs, but they differ from those of the first step: the one-hot vector of the previously predicted character, the previous decoder cell state, and the previous decoder hidden state.
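
The inference steps above amount to a greedy decoding loop, sketched below. The function name `decode_sequence`, its parameters, the '\n' stop character, and the length cap are all assumptions for illustration. `encoder_model` and `decoder_model` are assumed to be objects with a Keras-style `predict` method: the first returns the two encoder states, the second takes a character vector plus two states and returns the softmax output and the two updated states.

```python
import numpy as np


def decode_sequence(input_seq, encoder_model, decoder_model,
                    target_token_index, max_decoder_seq_length):
    """Greedy character-by-character decoding (illustrative sketch)."""
    reverse_index = {i: ch for ch, i in target_token_index.items()}
    num_decoder_tokens = len(target_token_index)

    # Encode the English sentence once; its two states seed the decoder.
    states = list(encoder_model.predict(input_seq, verbose=0))

    # First decoder input: the one-hot vector of the start tag '\t'.
    target_seq = np.zeros((1, 1, num_decoder_tokens), dtype="float32")
    target_seq[0, 0, target_token_index["\t"]] = 1.0

    decoded = ""
    while len(decoded) < max_decoder_seq_length:
        output_tokens, h, c = decoder_model.predict(
            [target_seq] + states, verbose=0)

        # Pick the most probable character from the softmax output.
        sampled_index = int(np.argmax(output_tokens[0, -1, :]))
        sampled_char = reverse_index[sampled_index]
        if sampled_char == "\n":  # assumed end-of-sentence marker
            break
        decoded += sampled_char

        # Next step: feed back the predicted character and the new states.
        target_seq = np.zeros((1, 1, num_decoder_tokens), dtype="float32")
        target_seq[0, 0, sampled_index] = 1.0
        states = [h, c]

    return decoded
```

Taking the argmax at each step is the simplest choice; sampling or beam search could be swapped in without changing the three-input structure of each decoder step.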

Networks for Training and Inference


Training Model
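
A training network of this shape could be written in Keras roughly as follows. The vocabulary sizes, `latent_dim`, and the optimizer are placeholder assumptions chosen for illustration; they are not values given in the text.

```python
from tensorflow import keras

# Assumed sizes, for illustration only.
num_encoder_tokens = 71   # distinct English characters
num_decoder_tokens = 93   # distinct French characters, incl. start tag
latent_dim = 256          # size of the LSTM hidden and cell states

# Encoder: reads the one-hot English characters and keeps only its final states.
encoder_inputs = keras.Input(shape=(None, num_encoder_tokens))
encoder_lstm = keras.layers.LSTM(latent_dim, return_state=True)
_, state_h, state_c = encoder_lstm(encoder_inputs)
encoder_states = [state_h, state_c]

# Decoder: reads the one-hot French characters, starting from the encoder states.
decoder_inputs = keras.Input(shape=(None, num_decoder_tokens))
decoder_lstm = keras.layers.LSTM(
    latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(
    decoder_inputs, initial_state=encoder_states)

# Softmax over the French character vocabulary at every time step.
decoder_dense = keras.layers.Dense(num_decoder_tokens, activation="softmax")
decoder_outputs = decoder_dense(decoder_outputs)

model = keras.Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer="rmsprop", loss="categorical_crossentropy")
```

Categorical cross-entropy matches the setup described earlier, where the softmax output at each step is compared with a one-hot target character.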



Prediction Model
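
One way to build the prediction networks in Keras is to reuse the trained layers in two separate models, as in the sketch below. The sizes and variable names are assumptions for illustration, and the layers are created fresh here only so the example is self-contained; in practice they would be the layers already trained.

```python
from tensorflow import keras

# Assumed sizes, for illustration only.
num_encoder_tokens, num_decoder_tokens, latent_dim = 71, 93, 256

# Layers shared with the training network (untrained in this sketch).
encoder_inputs = keras.Input(shape=(None, num_encoder_tokens))
encoder_lstm = keras.layers.LSTM(latent_dim, return_state=True)
decoder_inputs = keras.Input(shape=(None, num_decoder_tokens))
decoder_lstm = keras.layers.LSTM(
    latent_dim, return_sequences=True, return_state=True)
decoder_dense = keras.layers.Dense(num_decoder_tokens, activation="softmax")

# Encoder model: English sentence in, final hidden and cell states out.
_, state_h, state_c = encoder_lstm(encoder_inputs)
encoder_model = keras.Model(encoder_inputs, [state_h, state_c])

# Decoder model: 3 inputs per step (character vector, hidden state, cell state).
decoder_state_input_h = keras.Input(shape=(latent_dim,))
decoder_state_input_c = keras.Input(shape=(latent_dim,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

decoder_outputs, dec_h, dec_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_outputs = decoder_dense(decoder_outputs)

# Outputs: softmax over characters plus the updated states for the next step.
decoder_model = keras.Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs, dec_h, dec_c])
```

Exposing the decoder states as explicit inputs and outputs is what lets each predicted character and its states be fed back in at the following step.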
