[Sentence Embedding] Improve ELMO implementation
du-phan opened this issue · 0 comments
du-phan commented
- Replace list-based computation with matrix computation.
- Add TF-IDF for ELMO.
- Refactor ELMO helper functions for better integration into the code base.
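
To make the first two items concrete, here is a rough sketch of what a matrix-based weighted average could look like. It is not the current implementation: the function name `weighted_sentence_embeddings`, the argument shapes, and the idea that per-token weights (TF-IDF or otherwise) arrive as a padded array are all assumptions made for illustration.

```python
import numpy as np


def weighted_sentence_embeddings(token_embeddings, token_weights, lengths):
    """Hypothetical helper: weighted average of token vectors per sentence.

    token_embeddings: array of shape (batch, max_len, dim), zero-padded
    token_weights:    array of shape (batch, max_len), e.g. TF-IDF weights
    lengths:          true (unpadded) length of each sentence, shape (batch,)
    """
    token_embeddings = np.asarray(token_embeddings, dtype=float)
    token_weights = np.asarray(token_weights, dtype=float)
    lengths = np.asarray(lengths)

    max_len = token_embeddings.shape[1]
    # Boolean mask that is True on real tokens and False on padding.
    mask = np.arange(max_len)[None, :] < lengths[:, None]
    weights = token_weights * mask

    # One batched contraction replaces the per-sentence Python loop.
    summed = np.einsum("bl,bld->bd", weights, token_embeddings)
    norm = weights.sum(axis=1, keepdims=True)
    # Guard against empty sentences (all-zero weights).
    return summed / np.maximum(norm, 1e-12)
```

With all weights set to 1 this reduces to a plain mean over the real tokens, which is an easy sanity check against the list-based version.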
@RedaAffane in your `get_elmo_text_batches_sif` you define `max_sequence_length = 100` and then use that threshold to truncate the input data. Why is that needed? The vocabulary distribution is then no longer the same before and after `get_elmo_text_batches_sif`, and since we compute `word_weight` before it, the word weights are no longer correct (?)
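
A toy illustration of the concern, not taken from the code base: it assumes the weights follow the usual SIF form `a / (a + p(w))` and uses made-up helpers `sif_weights` and `truncate` with a tiny threshold so the effect is visible.

```python
from collections import Counter


def sif_weights(tokenized_sentences, a=1e-3):
    """Hypothetical SIF-style weights: a / (a + p(w)), p(w) = relative frequency."""
    counts = Counter(tok for sent in tokenized_sentences for tok in sent)
    total = sum(counts.values())
    return {word: a / (a + count / total) for word, count in counts.items()}


def truncate(tokenized_sentences, max_sequence_length):
    """Keep only the first max_sequence_length tokens of each sentence."""
    return [sent[:max_sequence_length] for sent in tokenized_sentences]


corpus = [["the", "cat", "sat", "on", "the", "mat"], ["the", "dog"]]

# Weights computed on the full corpus, as happens before truncation.
before = sif_weights(corpus)
# Weights that would match the data actually fed to the model.
after = sif_weights(truncate(corpus, max_sequence_length=3))

changed = {w for w in after if before[w] != after[w]}
dropped = set(before) - set(after)
```

In this toy corpus the frequency of "the" goes from 3/8 to 2/5 after truncation, and "on" and "mat" disappear entirely, so weights computed beforehand describe a different distribution than the one the embeddings are built from.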