[Sentence Embedding] Improve ELMO implementation

Question

du-phan opened this issue 6 years ago · 0 comments

Replace list-based computation with matrix computation.
Add TF-IDF weighting for ELMO.
Refactor the ELMO helper functions for better integration into the code base.
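
As a rough illustration of the first item, a weighted sentence embedding can be computed for a whole batch with one masked matrix operation instead of a per-sentence Python loop. This is only a sketch using NumPy: the function name weighted_sentence_embeddings, the array shapes, the padding mask, and the normalization by the sum of weights are assumptions for illustration, not the current code.

```python
import numpy as np

def weighted_sentence_embeddings(token_embeddings, token_weights, mask):
    """Hypothetical helper: weighted average of token vectors per sentence.

    token_embeddings: (batch, seq_len, dim) array of token vectors
    token_weights:    (batch, seq_len) array, e.g. TF-IDF or SIF weights
    mask:             (batch, seq_len) array, 1.0 for real tokens, 0.0 for padding
    """
    w = token_weights * mask                                  # zero out padded positions
    summed = np.einsum("bs,bsd->bd", w, token_embeddings)     # weighted sum over tokens
    norm = np.maximum(w.sum(axis=1, keepdims=True), 1e-12)    # guard against empty rows
    return summed / norm

# Toy batch: 2 sentences, 3 token slots, 2-dimensional embeddings.
emb = np.array([[[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]],
                [[2.0, 2.0], [4.0, 4.0], [6.0, 6.0]]])
weights = np.array([[1.0, 3.0, 9.0],
                    [1.0, 1.0, 1.0]])
mask = np.array([[1.0, 1.0, 0.0],    # last slot of the first sentence is padding
                 [1.0, 1.0, 1.0]])

sentence_vectors = weighted_sentence_embeddings(emb, weights, mask)
```

The same function would work with TF-IDF weights once they are available, since only the token_weights array changes.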

@RedaAffane in your get_elmo_text_batches_sif you define max_sequence_length = 100 and then use that threshold to truncate the input data. Why is that needed? The vocabulary distribution is then no longer the same before and after get_elmo_text_batches_sif, and since we compute word_weight before it, the word weights are no longer correct (?)
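
To make the concern concrete, here is a small sketch in plain Python showing how truncating sentences changes the word counts that SIF-style weights depend on. The helper sif_weights, the smoothing constant a = 1e-3, the toy corpus, and the threshold of 3 (standing in for 100) are all assumptions for illustration; this is not the actual get_elmo_text_batches_sif code.

```python
from collections import Counter

def sif_weights(sentences, a=1e-3):
    """Hypothetical helper: SIF-style weight a / (a + p(w)) from corpus frequencies."""
    counts = Counter(tok for sent in sentences for tok in sent)
    total = sum(counts.values())
    return {tok: a / (a + c / total) for tok, c in counts.items()}

corpus = [
    ["the", "model", "embeds", "the", "sentence"],
    ["the", "sentence", "is", "long"],
]

max_sequence_length = 3  # stands in for the 100 used in the issue
truncated = [sent[:max_sequence_length] for sent in corpus]

weights_before = sif_weights(corpus)     # computed on the full text
weights_after = sif_weights(truncated)   # computed on what is actually embedded

# Tokens that disappear entirely once the sentences are cut.
dropped = set(weights_before) - set(weights_after)

# Tokens whose weight differs between the two versions of the corpus.
changed = {
    tok for tok in weights_after
    if abs(weights_before[tok] - weights_after[tok]) > 1e-12
}
```

If truncation is needed for memory reasons, one possible fix would be to truncate first and compute word_weight on the truncated text, so that both steps see the same tokens.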