JHart96/keras_elmo_embedding_layer

about padding

Closed this issue · 3 comments

Hi. Great code, but I have a question.
Why are you using pre-padding while the TensorFlow Hub example uses post-padding?
Doesn't using pre-padding mean the ELMo model in TF Hub will ignore the last words?

Hi,

Thanks for your comments.

In the examples, the sequences are padded to a fixed length so that the Keras model can accept an input of fixed shape. This makes the layer a drop-in replacement for the standard Embedding layer in Keras. Internally, the ELMo embedding layer actually "unpads" the sequences and then interfaces with the Tensorflow Hub module, so the final words of the sequences are not ignored, and are treated correctly.
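Roughly, the idea looks like this (a simplified sketch rather than the exact layer code; the module URL and the "tokens" signature follow the standard TF Hub ELMo interface, and the variable names are just illustrative):

import tensorflow as tf
import tensorflow_hub as hub

# Simplified sketch of what the layer does internally.
elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=False)

def embed(x, lookup_table):
    # x: padded batch of integer token ids, where 0 is the padding id.
    # Recover the true (unpadded) length of each sequence.
    sequence_lengths = tf.cast(tf.count_nonzero(x, axis=1), dtype=tf.int32)
    # Map the ids back to token strings for the hub module.
    strings = lookup_table.lookup(x)
    # The "tokens" signature uses sequence_len to know where each
    # sentence really ends, so padding positions are not embedded.
    return elmo(
        inputs={"tokens": strings, "sequence_len": sequence_lengths},
        signature="tokens",
        as_dict=True,
    )["elmo"]  # shape: (batch_size, max_length, 1024)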

I hope this answers your question.

Thanks,
Jordan

Hi,
Thanks for your answer.

This line computes the length of each sentence:
sequence_lengths = tf.cast(tf.count_nonzero(x, axis=1), dtype=tf.int32)
And then in this line, you convert the indices back to actual words:
strings = tf.squeeze(self.lookup_table.lookup(x))
Then, based on sequence_lengths, the TF Hub ELMo module takes the first n words of each sentence for embedding and ignores the rest.
For example, if the strings are

strings = [["the", "cat", "is", "on", "the", "mat"],
           ["", "dogs", "are", "in", "the", "fog"]]

and
sequence_lengths = [6, 5]
Doesn't ELMo drop the word "fog"? Or is there something I'm getting wrong here?
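To make this concrete, here is a quick check with Keras pad_sequences (the token ids are made up purely for illustration):

from keras.preprocessing.sequence import pad_sequences

seqs = [[4, 7, 1, 9, 4, 2],  # "the cat is on the mat"
        [5, 6, 8, 4, 3]]     # "dogs are in the fog"

# Keras pads at the front by default (padding='pre'):
print(pad_sequences(seqs, maxlen=6))
# [[4 7 1 9 4 2]
#  [0 5 6 8 4 3]]
# With sequence_lengths = [6, 5], the "tokens" signature keeps only the
# first 5 positions of the second row, so "fog" would be dropped.

# With post-padding the real words come first and nothing is lost:
print(pad_sequences(seqs, maxlen=6, padding='post'))
# [[4 7 1 9 4 2]
#  [5 6 8 4 3 0]]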

Hi,
Sorry for the late reply! I've just realised that pad_sequences applies pre-padding instead of post-padding. This isn't what I expected, but I should've checked! Thank you for pointing this out. I've fixed this and pushed a fixed version.
Thanks,
Jordan