Purpose: an encoder-decoder model with an attention component is built to predict an output sequence from the input sequence.

Method (minimal TensorFlow sketches of these steps are given after the list):
(1) The encoder is an LSTM network built from bidirectional LSTM cells.
(2) The encoder's computation loop is handled automatically by tf.nn.bidirectional_dynamic_rnn; in the decoder, however, we design the looping ourselves so that an attention model can be plugged in.
(3) The attention model is simply a weighted sum of the encoder outputs, where the weights are determined by a projection of those outputs.
(4) This attention model is then combined with a raw RNN to define how the decoder loops over its steps.
(5) The implementation includes word embedding, i.e. converting each input word id into a dense vector according to the dictionary; tf.nn.embedding_lookup(embeddings, encoder_inputs) does this.
(6) With assert EOS == 1 and PAD == 0, the 'eos' and 'pad' tokens are added to the decoder input to mark the start and end of each sentence.
(7) The loss is then minimized and the model is trained as usual.
(8) Finally, the predicted sequence closely matches the input sequence.
(9) Note that the inputs come from a random sequence generator in helpers, and the trained model can still reproduce sequences close to those random inputs. Awesome!
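
Below is a minimal sketch of steps (1), (2), (5) and (6), assuming TensorFlow 1.x. The hyperparameter values and names (vocab_size, input_embedding_size, encoder_hidden_units) are illustrative assumptions, not necessarily the notebook's exact settings.

```python
import tensorflow as tf

# (6) Special tokens padded into the decoder input.
PAD, EOS = 0, 1
assert EOS == 1 and PAD == 0

# Illustrative hyperparameters (assumed, not the notebook's exact values).
vocab_size = 10
input_embedding_size = 20
encoder_hidden_units = 20

# Time-major placeholders: [max_time, batch_size].
encoder_inputs = tf.placeholder(tf.int32, [None, None], name='encoder_inputs')
encoder_inputs_length = tf.placeholder(tf.int32, [None], name='encoder_inputs_length')

# (5) Word embedding: look up a dense vector for every token id.
embeddings = tf.Variable(
    tf.random_uniform([vocab_size, input_embedding_size], -1.0, 1.0),
    dtype=tf.float32)
encoder_inputs_embedded = tf.nn.embedding_lookup(embeddings, encoder_inputs)

# (1)-(2) Bidirectional LSTM encoder; its loop over time is run
# automatically by tf.nn.bidirectional_dynamic_rnn.
encoder_cell_fw = tf.nn.rnn_cell.LSTMCell(encoder_hidden_units)
encoder_cell_bw = tf.nn.rnn_cell.LSTMCell(encoder_hidden_units)
((fw_outputs, bw_outputs), (fw_state, bw_state)) = tf.nn.bidirectional_dynamic_rnn(
    cell_fw=encoder_cell_fw,
    cell_bw=encoder_cell_bw,
    inputs=encoder_inputs_embedded,
    sequence_length=encoder_inputs_length,
    dtype=tf.float32,
    time_major=True)

# Concatenate both directions so the decoder and attention see the full context.
encoder_outputs = tf.concat([fw_outputs, bw_outputs], 2)
```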
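Continuing the sketch, steps (3) and (4): an attention context (a weighted sum of the encoder outputs) is combined with tf.nn.raw_rnn, whose loop_fn lets us define the decoder's per-step transition ourselves. A simple dot-product score stands in here for the projection-based scoring described above, and names such as W_out, decoder_lengths and the "+3" extra decoding steps are assumptions for illustration.

```python
decoder_hidden_units = encoder_hidden_units * 2            # matches the concatenated encoder outputs
decoder_cell = tf.nn.rnn_cell.LSTMCell(decoder_hidden_units)

_, batch_size = tf.unstack(tf.shape(encoder_inputs))
decoder_lengths = encoder_inputs_length + 3                # a few extra steps for EOS (illustrative)

# Carry the bidirectional encoder state into the decoder by concatenation.
encoder_final_state = tf.nn.rnn_cell.LSTMStateTuple(
    c=tf.concat([fw_state.c, bw_state.c], 1),
    h=tf.concat([fw_state.h, bw_state.h], 1))

# Projection from decoder output to vocabulary logits (reused later for the loss).
W_out = tf.Variable(tf.random_uniform([decoder_hidden_units, vocab_size], -1.0, 1.0), dtype=tf.float32)
b_out = tf.Variable(tf.zeros([vocab_size]), dtype=tf.float32)

eos_step_embedded = tf.nn.embedding_lookup(embeddings, tf.ones([batch_size], dtype=tf.int32) * EOS)
pad_step_embedded = tf.nn.embedding_lookup(embeddings, tf.ones([batch_size], dtype=tf.int32) * PAD)

def attention(query):
    # (3) Score each encoder position against the decoder query (dot product here),
    # then return the weighted sum of encoder outputs as the context vector.
    scores = tf.reduce_sum(encoder_outputs * tf.expand_dims(query, 0), axis=2)   # [time, batch]
    weights = tf.transpose(tf.nn.softmax(tf.transpose(scores)))                  # softmax over time
    return tf.reduce_sum(tf.expand_dims(weights, 2) * encoder_outputs, axis=0)   # [batch, units]

def loop_fn(time, previous_output, previous_state, previous_loop_state):
    # (2)/(4) Custom decoder loop driven by tf.nn.raw_rnn.
    if previous_state is None:
        # Initialization: start from the EOS token plus a uniform (mean) attention context.
        initial_input = tf.concat([eos_step_embedded, tf.reduce_mean(encoder_outputs, axis=0)], 1)
        return (0 >= decoder_lengths, initial_input, encoder_final_state, None, None)

    # Greedy transition: previous output -> logits -> arg-max token -> embedding,
    # concatenated with a fresh attention context over the encoder outputs.
    prediction = tf.argmax(tf.add(tf.matmul(previous_output, W_out), b_out), axis=1)
    context = attention(previous_output)
    elements_finished = (time >= decoder_lengths)
    next_input = tf.cond(
        tf.reduce_all(elements_finished),
        lambda: tf.concat([pad_step_embedded, context], 1),
        lambda: tf.concat([tf.nn.embedding_lookup(embeddings, prediction), context], 1))
    return (elements_finished, next_input, previous_state, previous_output, None)

decoder_outputs_ta, decoder_final_state, _ = tf.nn.raw_rnn(decoder_cell, loop_fn)
decoder_outputs = decoder_outputs_ta.stack()               # [time, batch, decoder_hidden_units]
```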
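Finally, steps (7)-(8): the decoder outputs are projected to vocabulary logits and trained with a standard cross-entropy loss against the EOS/PAD-padded targets. decoder_targets is an assumed time-major placeholder that would be fed from the notebook's helpers-generated random sequences.

```python
decoder_targets = tf.placeholder(tf.int32, [None, None], name='decoder_targets')

# Flatten time and batch, project to logits, then restore the time-major shape.
decoder_max_steps, decoder_batch_size, _ = tf.unstack(tf.shape(decoder_outputs))
decoder_outputs_flat = tf.reshape(decoder_outputs, (-1, decoder_hidden_units))
decoder_logits_flat = tf.add(tf.matmul(decoder_outputs_flat, W_out), b_out)
decoder_logits = tf.reshape(decoder_logits_flat,
                            (decoder_max_steps, decoder_batch_size, vocab_size))
decoder_prediction = tf.argmax(decoder_logits, 2)          # (8) the predicted sequence

# (7) Cross-entropy between the predicted logits and the padded targets.
stepwise_cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
    labels=tf.one_hot(decoder_targets, depth=vocab_size, dtype=tf.float32),
    logits=decoder_logits)
loss = tf.reduce_mean(stepwise_cross_entropy)
train_op = tf.train.AdamOptimizer().minimize(loss)
```

With this graph, a session that repeatedly feeds random batches and runs train_op should drive decoder_prediction toward reproducing the random input sequences, which is the behaviour described in (8)-(9).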