word-embeddings-talk-sicsr

This repo contains the code snippets used for the talk.

Link to the slide deck: https://prezi.com/view/9emf6rIvvWXcAkxb8ULO/
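As a quick illustration of the kind of snippet used in the talk, below is a minimal sketch of training skip-gram word2vec vectors with gensim (see references 5, 6, and 9). The toy corpus and hyperparameters are illustrative assumptions, not the exact code from the talk.

```python
# Minimal word2vec sketch with gensim (gensim >= 4.0 uses `vector_size`;
# older 3.x versions call this parameter `size`). Toy corpus for illustration only.
from gensim.models import Word2Vec

sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["word", "embeddings", "map", "words", "to", "vectors"],
]

# sg=1 selects the skip-gram architecture (sg=0 would be CBOW)
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

vector = model.wv["king"]                      # 50-dimensional embedding for "king"
similar = model.wv.most_similar("king", topn=3)  # nearest neighbours by cosine similarity
print(vector.shape, similar)
```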

References:

  1. https://towardsdatascience.com/word-embedding-in-nlp-one-hot-encoding-and-skip-gram-neural-network-81b424da58f2
  2. https://machinelearningmastery.com/how-to-one-hot-encode-sequence-data-in-python/
  3. http://jalammar.github.io/illustrated-word2vec/
  4. https://gist.github.com/aparrish/2f562e3737544cf29aaf1af30362f469
  5. https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/#:~:text=Word2vec%20is%20not%20a%20single,act%20as%20word%20vector%20representations.
  6. https://towardsdatascience.com/an-implementation-guide-to-word2vec-using-numpy-and-google-sheets-13445eebd281
  7. https://towardsdatascience.com/nlp-101-negative-sampling-and-glove-936c88f3bc68
  8. https://towardsdatascience.com/word-embeddings-in-2020-review-with-code-examples-11eb39a1ee6d
  9. https://kavita-ganesan.com/gensim-word2vec-tutorial-starter-code/#.X4QYPmP7TeM
  10. https://cai.tools.sap/blog/glove-and-fasttext-two-popular-word-vector-models-in-nlp/#:~:text=Instead%20of%20learning%20vectors%20for,and%20end%20of%20the%20word.
  11. https://towardsdatascience.com/nlp-extract-contextualized-word-embeddings-from-bert-keras-tf-67ef29f60a7b
  12. https://medium.com/@dhartidhami/understanding-bert-word-embeddings-7dc4d2ea54ca
  13. https://medium.com/@_init_/why-bert-has-3-embedding-layers-and-their-implementation-details-9c261108e28a
  14. https://colab.research.google.com/drive/1ZQvuAVwA3IjybezQOXnrXMGAnMyZRuPU#scrollTo=UeQNEFbUgMSf
  15. https://huggingface.co/bert-base-uncased
  16. Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
  17. Pennington, J., Socher, R., & Manning, C. D. (2014, October). GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1532-1543).
  18. Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5, 135-146.
  19. Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. arXiv preprint arXiv:1802.05365.
  20. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  21. Cross-lingual embeddings: https://ruder.io/cross-lingual-embeddings/
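For the contextualized-embedding part of the talk (references 11-15), a minimal sketch of extracting per-token BERT embeddings with Hugging Face transformers is shown below; the input sentence is illustrative and this is not the exact notebook code from reference 14.

```python
# Minimal sketch: contextualized word embeddings from bert-base-uncased
# using Hugging Face transformers. Each token gets a 768-dimensional vector
# that depends on its surrounding context.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("The bank raised interest rates.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

token_embeddings = outputs.last_hidden_state  # shape: (1, num_tokens, 768)
print(token_embeddings.shape)
```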